Evaluation of DVFS techniques on modern HPC processors and accelerators for energy-aware applications
Energy efficiency is becoming increasingly important for computing systems,
in particular for large-scale HPC facilities. In this work we evaluate, from a
user perspective, the use of Dynamic Voltage and Frequency Scaling (DVFS)
techniques, assisted by the power and energy monitoring capabilities of modern
processors in order to tune applications for energy efficiency. We run selected
kernels and a full HPC application on two high-end processors widely used in
the HPC context, namely an NVIDIA K80 GPU and an Intel Haswell CPU. We evaluate
the available trade-offs between energy-to-solution and time-to-solution,
attempting a function-by-function frequency tuning. We finally estimate the
benefits obtainable by running the full code on an HPC multi-GPU node, with respect
to default clock frequency governors. We instrument our code to accurately
monitor power consumption and execution time without the need of any additional
hardware, and we enable it to change CPU and GPU clock frequencies while
running. We analyze our results on the different architectures using a simple
energy-performance model, and derive a number of energy saving strategies which
can be easily adopted on recent high-end HPC systems for generic applications.
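The trade-off between energy-to-solution and time-to-solution mentioned in this abstract can be illustrated with a minimal sketch. It assumes power is sampled at known timestamps and integrates it with the trapezoidal rule. The function names, clock labels, and sample values are hypothetical and are not taken from the paper, whose actual energy-performance model is not given here.

```python
def energy_to_solution(timestamps_s, power_w):
    """Integrate power samples (watts) over time (seconds) with the
    trapezoidal rule and return the energy in joules."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    energy_j = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        energy_j += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy_j


def time_to_solution(timestamps_s):
    """Elapsed time between the first and the last sample, in seconds."""
    return timestamps_s[-1] - timestamps_s[0]


# Hypothetical power traces for the same kernel at two clock settings
# (illustrative numbers only, not measurements).
runs = {
    "high_clock": ([0.0, 1.0, 2.0], [100.0, 100.0, 100.0]),
    "low_clock": ([0.0, 1.0, 2.0, 3.0], [60.0, 60.0, 60.0, 60.0]),
}

for label, (t, p) in runs.items():
    print(label, time_to_solution(t), "s", energy_to_solution(t, p), "J")
```

In this made-up example the lower clock takes longer but uses less energy, which is the kind of trade-off a function-by-function frequency tuning would weigh.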
Significant Enhancement of Neutralino Dark Matter Annihilation from Electroweak Bremsstrahlung
Indirect searches for the cosmological dark matter have become ever more
competitive during the past years. Here, we report the first full calculation
of leading electroweak corrections to the annihilation rate of supersymmetric
neutralino dark matter. We find that these corrections can be huge, partially
due to contributions that have been overlooked so far. Our results imply a
significantly enhanced discovery potential of this well motivated dark matter
candidate with current and upcoming cosmic ray experiments, in particular for
gamma rays and models with somewhat small annihilation rates at tree level.
Comment: 7 pages revtex4; 4 figures. Minor changes to match published version
Performance and Power Analysis of HPC Workloads on Heterogeneous Multi-Node Clusters
Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes, allowing for application optimizations. Due to the increasing interest in the High Performance Computing (HPC) community towards energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figures within the same profiling and analysis tools. For this reason, we present a performance and energy-efficiency study aimed at demonstrating how a single tool can be used to collect most of the relevant metrics. In particular, we show how the same analysis techniques can be applied on different architectures, analyzing the same HPC application on a high-end and a low-power cluster. The former cluster embeds Intel Haswell CPUs and NVIDIA K80 GPUs, while the latter is made up of NVIDIA Jetson TX1 boards, each hosting an Arm Cortex-A57 CPU and an NVIDIA Tegra X1 Maxwell GPU.
The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc projects [17], grant agreements n. 288777, 610402 and 671697. E.C. was partially funded by “Contributo 5 per mille assegnato all’Università degli Studi di Ferrara-dichiarazione dei redditi dell’anno 2014”. We thank the University of Ferrara and INFN Ferrara for the access to the COKA Cluster. We warmly thank the BSC tools group for supporting the smooth integration and testing of our setup within Extrae and Paraver.
Peer Reviewed. Postprint (published version)
Multi-Node Advanced Performance and Power Analysis with Paraver
Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes. Due to the increasing interest in the High Performance Computing (HPC) community towards energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figures within the same profiling and analysis tools. For this reason, we present a preliminary performance and energy-efficiency study aimed at demonstrating how a single tool can be used to collect most of the relevant metrics. Moreover, we show how the same analysis techniques are applicable on different architectures, analyzing the same HPC application running on two clusters, based respectively on Intel Haswell and Arm Cortex-A57 CPUs.
The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc projects, grant agreements n. 288777, 610402 and 671697. E.C. was partially funded by “Contributo 5 per mille assegnato all’Università degli Studi di Ferrara - dichiarazione dei redditi dell’anno 2014”.
Peer Reviewed. Postprint (author's final draft)
Agostino e la teoria della “guerra giusta” (a proposito di Qu. 6.10) [Augustine and the theory of the “just war” (on Qu. 6.10)]
This work seeks to understand the oscillation that, over the centuries, has characterized the content of the expression “just war” (“bellum iustum” in Latin) between the legal meaning, in which ‘just’ means ‘in accordance with the law’ (e.g. UN Charter, arts. 2 and 51), and the ethical meaning, in which the ‘just war’ is the ‘war against evil’ (e.g. the United States’ “just war” in Afghanistan). The thesis advanced is that the starting point of this oscillation is to be found in the thought of Saint Augustine who, placing God at the absolute apex of his theological construction, traces war, and in particular the ‘just’ war, back to God’s will (“Deus imperare bellum”). Augustine’s writings support this claim, for example the text of Questiones 6,10 analyzed in the present work, in which the Bishop of Hippo reshapes the traditional concept of “bellum iustum”.
Energy-efficiency evaluation of Intel KNL for HPC workloads
Energy consumption is increasingly becoming a limiting factor to the design
of faster large-scale parallel systems, and development of energy-efficient and
energy-aware applications is today a relevant issue for HPC code-developer
communities. In this work we focus on energy performance of the Knights Landing
(KNL) Xeon Phi, the latest many-core architecture processor introduced by Intel
into the HPC market. We take into account the 64-core Xeon Phi 7230, and
analyze its energy performance using both the on-chip MCDRAM and the regular
DDR4 system memory as main storage for the application data-domain. As a
benchmark application we use a Lattice Boltzmann code heavily optimized for
this architecture and implemented using different memory data layouts to store
its lattice. We then assess the energy consumption using different memory
data layouts, kinds of memory (DDR4 or MCDRAM) and numbers of threads per core.
The GeV Excess Shining Through: Background Systematics for the Inner Galaxy Analysis
Recently, a spatially extended excess of gamma rays collected by the
Fermi-LAT from the inner region of the Milky Way has been detected by different
groups and with increasingly sophisticated techniques. Yet, any final
conclusion about the morphology and spectral properties of such an extended
diffuse emission is subject to a number of potentially critical uncertainties,
related to the high density of cosmic rays, gas, magnetic fields and abundance
of point sources. We will present a thorough study of the systematic
uncertainties related to the modelling of diffuse background and to the
propagation of cosmic rays in the inner part of our Galaxy. We will test a
large set of models for the Galactic diffuse emission, generated by varying the
propagation parameters within extreme conditions. By using those models in the
analysis of Fermi-LAT data as Galactic foreground, we will show that the
gamma-ray excess survives and we will quantify the uncertainties affecting the
excess morphology and energy spectrum.
Comment: 2014 Fermi Symposium proceedings - eConf C14102.1 7 pages, 4 figures
