Controlling Network Latency in Mixed Hadoop Clusters: Do We Need Active Queue Management?
With the advent of big data, data center applications are processing vast amounts of unstructured and semi-structured data, in parallel on large clusters, across hundreds to thousands of nodes. The highest performance for these batch big data workloads is achieved using expensive network equipment with large buffers, which accommodate bursts in network traffic and allocate bandwidth fairly even when the network is congested. Throughput-sensitive big data applications are, however, often executed in the same data center as latency-sensitive workloads. For both workloads to be supported well, the network must provide both maximum throughput and low latency. Progress has been made in this direction, as modern network switches support Active Queue Management (AQM) and Explicit Congestion Notifications (ECN), both mechanisms to control the level of queue occupancy, reducing the total network latency. This paper is the first study of the effect of Active Queue Management on both throughput and latency, in the context of Hadoop and the MapReduce programming model. We give a quantitative comparison of four different approaches for controlling buffer occupancy and latency: RED and CoDel, both standalone and also combined with ECN and the DCTCP network protocol, and identify the AQM configurations that maintain Hadoop execution time gains from larger buffers within 5%, while reducing network packet latency caused by bufferbloat by up to 85%. Finally, we provide recommendations to administrators of Hadoop clusters as to how to improve latency without degrading the throughput of batch big data workloads.
The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver). The research was also supported by the Ministry of Economy and Competitiveness of Spain under the contracts TIN2012-34557 and TIN2015-65316-P, Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government.
Peer Reviewed. Postprint (author's final draft)
Interconnect Energy Savings and Lower Latency Networks in Hadoop Clusters: The Missing Link
An important challenge of modern data centres running Hadoop workloads is to minimise energy consumption, a significant proportion of which is due to the network. Significant network savings are already possible using Energy Efficient Ethernet, supported by a large number of NICs and switches, but recent work has demonstrated that the packet coalescing settings must be carefully configured to avoid a substantial loss in performance. Meanwhile, Hadoop is evolving from its original batch concept to become a more iterative type of framework. Other recent work attempts to reduce Hadoop's network latency using Explicit Congestion Notifications. Linking these studies reveals that, surprisingly, even when packet coalescing does not hurt performance, it can degrade network latency much more than previously thought. This paper is the first to analyze the impact of packet coalescing in the context of network latency. We investigate how to design and configure interconnects to provide the maximum energy savings without degrading cluster throughput performance or network latency.
The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver). The research was also supported by the Ministry of Economy and Competitiveness of Spain under the contracts TIN2012-34557 and TIN2015-65316-P, Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government.
Peer Reviewed. Postprint (author's final draft)
Exploring interconnect energy savings under East-West traffic pattern of MapReduce clusters
An important challenge of modern data centers is to reduce energy consumption, of which a substantial proportion is due to the network. Energy Efficient Ethernet (EEE) is a recent standard that aims to reduce network power consumption, but current practice is to disable it in production use, since it has a poorly understood impact on real world application performance. An important application framework commonly used in modern data centers is Apache Hadoop, which implements the MapReduce programming model. This paper is the first to analyse the impact of EEE on MapReduce workloads, in terms of performance overheads and energy savings. We find that optimum energy savings are possible if the links use packet coalescing. Packet coalescing must, however, be carefully configured in order to avoid excessive performance degradation.
The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver). The research was also supported by the Ministry of Economy and Competitiveness of Spain under the contract TIN2012-34557, HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government.
Postprint (author's final draft)
High Throughput and Low Latency on Hadoop Clusters Using Explicit Congestion Notification: The Untold Truth
Various extensions of TCP/IP have been proposed to reduce network latency; examples include Explicit Congestion Notification (ECN), Data Center TCP (DCTCP) and several proposals for Active Queue Management (AQM). Combining these techniques requires adjusting various parameters, and recent studies have found that it is difficult to do so while obtaining both high performance and low latency. This is especially true for mixed use data centres that host both latency-sensitive applications and high-throughput workloads such as Hadoop. This paper studies the difficulty in configuration, and characterises the problem as related to ACK packets. Such packets cannot be set as ECN Capable Transport (ECT), with the consequence that a disproportionate number of them are dropped. We explain how this behavior decreases throughput, and propose a small change to the way that non-ECT-capable packets are handled in the network switches. We demonstrate robust performance for modified AQMs on a Hadoop cluster, maintaining full throughput while reducing latency by 85%. We also demonstrate that commodity switches with shallow buffers are able to reach the same throughput as deeper buffer switches. Finally, we explain how both TCP-ECN and DCTCP can achieve the best performance using a simple marking scheme, in contrast to the current preference for relying on AQMs to mark packets.
The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver). The research was also supported by the Ministry of Economy and Competitiveness of Spain under the contracts TIN2012-34557 and TIN2015-65316-P, Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government.
Peer Reviewed. Postprint (author's final draft)
Energy Efficient Ethernet on MapReduce Clusters: Packet Coalescing To Improve 10GbE Links
An important challenge of modern data centers is to reduce energy consumption, of which a substantial proportion is due to the network. Switches and NICs supporting the recent Energy Efficient Ethernet (EEE) standard are now available, but current practice is to disable EEE in production use, since its effect on real world application performance is poorly understood. This paper contributes to this discussion by analyzing the impact of EEE on MapReduce workloads, in terms of performance overheads and energy savings. MapReduce is the central programming model of Apache Hadoop, one of the most widely used application frameworks in modern data centers. We find that, while 1 GbE links (edge links) achieve good energy savings using the standard EEE implementation, optimum energy savings in the 10 GbE links (aggregation and core links) are only possible if these links employ packet coalescing. Packet coalescing must, however, be carefully configured in order to avoid excessive performance degradation. With our new analysis of how the static parameters of packet coalescing perform under different cluster loads, we were able to cover both the idle and heavy load periods that can exist in this type of environment. Finally, we evaluate our recommendation for packet coalescing for 10 GbE links using the energy-delay metric. This paper is an extension of our previous work [1], which was published in the Proceedings of the 40th Annual IEEE Conference on Local Computer Networks (LCN 2015).
This work was supported in part by the European Union’s Seventh Framework Programme (FP7/2007-2013) under Grant 610456 (EUROSERVER), in part by the Spanish Government through the Severo Ochoa programme (SEV-2011-00067 and SEV-2015-0493), in part by the Spanish Ministry of Economy and Competitiveness under Contract TIN2012-34557 and Contract TIN2015-65316-P, and in part by the Generalitat de Catalunya under Contract 2014-SGR-1051 and Contract 2014-SGR-1272.
Peer Reviewed. Postprint (author's final draft)
Wigner phase space distribution as a wave function
We demonstrate that the Wigner function of a pure quantum state is a wave
function in a specially tuned Dirac bra-ket formalism and argue that the Wigner
function is in fact a probability amplitude for the quantum particle to be at a
certain point of the classical phase space. Additionally, we establish that in
the classical limit, the Wigner function transforms into a classical
Koopman-von Neumann wave function rather than into a classical probability
distribution. Since probability amplitude need not be positive, our findings
provide an alternative outlook on the Wigner function's negativity.
Comment: 6 pages and 2 figures
Neuronal assembly dynamics in supervised and unsupervised learning scenarios
The dynamic formation of groups of neurons (neuronal assemblies) is believed to mediate cognitive phenomena at many levels, but their detailed operation and mechanisms of interaction are still to be uncovered. One hypothesis suggests that synchronized oscillations underpin their formation and functioning, with a focus on the temporal structure of neuronal signals. In this context, we investigate neuronal assembly dynamics in two complementary scenarios: the first, a supervised spike pattern classification task, in which noisy variations of a collection of spikes have to be correctly labeled; the second, an unsupervised, minimally cognitive evolutionary robotics task, in which an evolved agent has to cope with multiple, possibly conflicting, objectives. In both cases, the more traditional dynamical analysis of the system’s variables is paired with information-theoretic techniques in order to get a broader picture of the ongoing interactions with and within the network. The neural network model is inspired by the Kuramoto model of coupled phase oscillators and allows one to fine-tune the network synchronization dynamics and assembly configuration. The experiments explore the computational power, redundancy, and generalization capability of neuronal circuits, demonstrating that performance depends nonlinearly on the number of assemblies and neurons in the network and showing that the framework can be exploited to generate minimally cognitive behaviors, with dynamic assembly formation accounting for varying degrees of stimuli modulation of the sensorimotor interactions.
Conceptual inconsistencies in finite-dimensional quantum and classical mechanics
Utilizing operational dynamic modeling [Phys. Rev. Lett. 109, 190403 (2012);
arXiv:1105.4014], we demonstrate that any finite-dimensional representation of
quantum and classical dynamics violates the Ehrenfest theorems. Other
peculiarities are also revealed, including the nonexistence of the free
particle and ambiguity in defining potential forces. Non-Hermitian mechanics is
shown to have the same problems. This work compromises a popular belief that
finite-dimensional mechanics is a straightforward discretization of the
corresponding infinite-dimensional formulation.
Comment: 5 pages, 2 figures
Fundamental parameters of Be stars located in the seismology fields of COROT
In preparation for the COROT space mission, we determined the fundamental
parameters (spectral type, temperature, gravity, vsini) of the Be stars
observable by COROT in its seismology fields (64 Be stars). We applied a
careful and detailed modeling of the stellar spectra, taking into account the
veiling caused by the envelope, as well as the gravitational darkening and
stellar flattening due to rapid rotation. Evolutionary tracks for fast rotators
were used to derive stellar masses and ages. The derived parameters will be
used to select Be stars as secondary targets (i.e. observed for 5 consecutive
months) and short-run targets of the COROT mission. Furthermore, we note that
most of our stellar sample falls in the second half of the main-sequence
lifetime, and that in most cases the luminosity class of Be stars is
inaccurate in characterizing their evolutionary status.
Comment: 25 pages, 9 figures, Accepted for publication in A&
