337 research outputs found
Fast online computation of the Qn estimator with applications to the detection of outliers in data streams
We present FQN (Fast Qn), a novel algorithm for online computation of the Qn scale estimator. The algorithm works in the sliding window model, cleverly computing the Qn scale estimator in the current window. We thoroughly compare our algorithm for online Qn with the state-of-the-art competing algorithm by Nunkesser et al. and show that FQN (i) is faster, requiring only O(s) time in the worst case, where s is the length of the window; (ii) has a computational complexity that does not depend on the input distribution; and (iii) requires less space. To the best of our knowledge, our algorithm is the first that allows online computation of the Qn scale estimator in worst-case time linear in the size of the window. As an example of a possible application, besides its use as a robust measure of statistical dispersion, we show how to use the Qn estimator for fast detection of outliers in data streams. Extensive experimental results on both synthetic and real datasets confirm the validity of our approach.
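As an illustration of what the estimator computes, a minimal reference implementation of Qn and of a Qn-based outlier check is sketched below; this is the naive O(n² log n) textbook definition from Rousseeuw and Croux, not the O(s) FQN algorithm, and the consistency constant and the threshold of 3 are conventional choices rather than values taken from the paper.

```python
import numpy as np

def qn_naive(x, c=2.2219):
    """Textbook Qn (Rousseeuw & Croux): the k-th smallest of the
    pairwise absolute differences, with k = C(h, 2) and h = n//2 + 1.
    Costs O(n^2) time/space -- the naive baseline, not FQN."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    i, j = np.triu_indices(n, k=1)          # all pairs i < j
    diffs = np.abs(x[i] - x[j])
    h = n // 2 + 1
    k = h * (h - 1) // 2                    # rank of the order statistic
    return c * np.partition(diffs, k - 1)[k - 1]

def is_outlier(window, item, threshold=3.0):
    """Robust z-score check: flag `item` if it lies more than
    `threshold` Qn-units away from the window median."""
    window = np.asarray(window, dtype=float)
    return abs(item - np.median(window)) > threshold * qn_naive(window)
```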
AFQN: approximate Qn estimation in data streams
We present AFQN (Approximate Fast Qn), a novel algorithm for approximate computation of the Qn scale estimator in a streaming setting, in the sliding window model. It is well known that computing the Qn estimator exactly may be too costly for some applications, and the problem is exacerbated a fortiori in the streaming setting, in which the time available to process incoming data stream items is short. In this paper we show how to efficiently and accurately approximate the Qn estimator. As an application, we show the use of AFQN for fast detection of outliers in data streams. In particular, the outliers are detected in the sliding window model, with a simple check based on the Qn scale estimator. Extensive experimental results on synthetic and real datasets confirm the validity of our approach, showing up to three times more updates per second. Our contributions are the following: (i) to the best of our knowledge, we present the first approximation algorithm for online computation of the Qn scale estimator in a streaming setting and in the sliding window model; (ii) we show how to take advantage of our UDDSketch algorithm for quantile estimation in order to quickly compute the Qn scale estimator; (iii) as an example of a possible application of the Qn scale estimator, we discuss how to detect outliers in an input data stream.
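The reduction AFQN exploits can be sketched as follows: Qn is, up to a constant, a fixed quantile of the pairwise absolute differences, so any quantile summary over the differences yields an approximation. In the sketch below, numpy's exact quantile stands in for the UDDSketch; it illustrates the reduction only, not the AFQN data structure or its update procedure.

```python
import numpy as np

def qn_as_quantile(window, c=2.2219):
    """Qn recast as a quantile query: up to the constant c, it is the
    p-quantile of the pairwise absolute differences, where
    p = C(h, 2) / C(n, 2) and h = n//2 + 1. AFQN answers this query
    with a sketch over the differences instead of materialising them;
    numpy's exact quantile is used here as a stand-in."""
    x = np.asarray(window, dtype=float)
    n = len(x)
    i, j = np.triu_indices(n, k=1)
    diffs = np.abs(x[i] - x[j])
    h = n // 2 + 1
    p = (h * (h - 1)) / (n * (n - 1))       # target rank as a fraction
    return c * np.quantile(diffs, p)
```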
Just in Time Transformers
Precise energy load forecasting in residential households is crucial for mitigating carbon emissions and enhancing energy efficiency; indeed, accurate forecasting enables utility companies and policymakers, who advocate sustainable energy practices, to optimize resource utilization. Moreover, smart meters provide valuable information by allowing for granular insights into consumption patterns. Building upon available smart meter data, our study aims to cluster consumers into distinct groups according to their energy usage behaviours, effectively capturing a diverse spectrum of consumption patterns. Next, we design JITtrans (Just In Time transformer), a novel transformer deep learning model that significantly improves energy consumption forecasting accuracy with respect to traditional forecasting methods. Extensive experimental results on proprietary smart meter data validate our claims. Our findings highlight the potential of advanced predictive technologies to revolutionize energy management and advance sustainable power systems: the development of efficient and eco-friendly energy solutions critically depends on such technologies.
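For orientation only, a generic transformer-encoder forecaster of the kind the abstract alludes to might be set up as below; this is not the JITtrans architecture, and every layer size, the input length, and the 24-step horizon are placeholder assumptions.

```python
import torch
import torch.nn as nn

class LoadForecaster(nn.Module):
    """Generic transformer-encoder forecaster: reads the last seq_len
    meter readings and predicts the next `horizon` readings. All layer
    sizes are placeholders; this is not the JITtrans architecture."""
    def __init__(self, d_model=64, nhead=4, num_layers=2,
                 max_len=512, horizon=24):
        super().__init__()
        self.embed = nn.Linear(1, d_model)            # scalar -> d_model
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):                             # x: (batch, seq_len, 1)
        h = self.embed(x) + self.pos[:, : x.size(1)]  # learned positions
        h = self.encoder(h)
        return self.head(h[:, -1])                    # forecast from last step

model = LoadForecaster()
yhat = model(torch.randn(8, 168, 1))                  # a week of hourly data
```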
High Throughput Protein Similarity Searches in the LIBI Grid Problem Solving Environment
Bioinformatics applications are naturally distributed, due to the distribution of the involved data sets, experimental data and biological databases. They require high computing power, owing to the large size of data sets and the complexity of basic computations; may access heterogeneous data, where the heterogeneity lies in data format, access policy, distribution, etc.; and require a secure infrastructure, because they may access private data owned by different organizations. The Problem Solving Environment (PSE) is an approach and a technology that can fulfil such bioinformatics requirements. A PSE can be used for the definition and composition of complex applications, hiding programming and configuration details from the user, who can concentrate on the specific problem. Moreover, Grids can be used for building geographically distributed collaborative problem solving environments, and Grid-aware PSEs can search for and use dispersed high-performance computing, networking and data resources. In this work, the PSE solution has been chosen as the integration platform for bioinformatics tools and data sources. In particular, a large-scale multiple sequence alignment experiment, supported by the LIBI PSE, is presented.
Parallel implementation of the SHYFEM (System of HydrodYnamic Finite Element Modules) model
This paper presents the message passing interface (MPI)-based parallelization of the three-dimensional hydrodynamic model SHYFEM (System of HydrodYnamic Finite Element Modules). The original sequential version of the code was parallelized in order to reduce the execution time of high-resolution configurations on state-of-the-art high-performance computing (HPC) systems. A distributed-memory approach based on MPI was used. Optimized numerical libraries were used to partition the unstructured grid (with a focus on load balancing) and to solve the sparse linear system of equations in parallel in the case of semi-to-fully implicit time stepping. The parallel implementation of the model was validated by comparing its outputs with those obtained from the sequential version. The performance assessment demonstrates a good level of scalability with a realistic configuration used as a benchmark.
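The distributed-memory pattern can be illustrated with a minimal mpi4py halo exchange; note that SHYFEM partitions an unstructured grid with optimized libraries, so the 1-D slab decomposition below is purely a sketch of the communication idea.

```python
from mpi4py import MPI
import numpy as np

# 1-D halo-exchange sketch of the distributed-memory idea; SHYFEM
# itself partitions an unstructured grid with a graph partitioner,
# so this slab decomposition is purely illustrative.
comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 100                        # cells owned by this rank
u = np.zeros(n_local + 2)            # +2 ghost cells at the ends
u[1:-1] = rank                       # dummy local state

left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# swap ghost cells with neighbouring subdomains once per time step
comm.Sendrecv(u[1:2], dest=left, recvbuf=u[-1:], source=right)
comm.Sendrecv(u[-2:-1], dest=right, recvbuf=u[:1], source=left)
```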
SARS-CoV-2 infection among hospitalised pregnant women and impact of different viral strains on COVID-19 severity in Italy: a national prospective population-based cohort study
OBJECTIVE: The primary aim of this article was to describe SARS-CoV-2 infection among pregnant women during the wild-type and Alpha-variant periods in Italy. The secondary aim was to compare the impact of the virus variants on the severity of maternal and perinatal outcomes. DESIGN: National population-based prospective cohort study. SETTING: A total of 315 Italian maternity hospitals. SAMPLE: A cohort of 3306 women with SARS-CoV-2 infection confirmed within 7 days of hospital admission. METHODS: Cases were prospectively reported by trained clinicians for each participating maternity unit. Data were described by univariate and multivariate analyses. MAIN OUTCOME MEASURES: COVID-19 pneumonia, ventilatory support, intensive care unit (ICU) admission, mode of delivery, preterm birth, stillbirth, and maternal and neonatal mortality. RESULTS: We found that 64.3% of the cohort was asymptomatic, 12.8% developed COVID-19 pneumonia and 3.3% required ventilatory support and/or ICU admission. Maternal age of 30-34 years (OR 1.43, 95% CI 1.09-1.87) and ≥35 years (OR 1.62, 95% CI 1.23-2.13), citizenship of countries with high migration pressure (OR 1.75, 95% CI 1.36-2.25), previous comorbidities (OR 1.49, 95% CI 1.13-1.98) and obesity (OR 1.72, 95% CI 1.29-2.27) were all associated with a higher occurrence of pneumonia. The preterm birth rate was 11.1%. In comparison with the pre-pandemic period, stillbirths and maternal and neonatal deaths remained stable. The need for ventilatory support and/or ICU admission among women with pneumonia increased during the Alpha-variant period compared with the wild-type period (OR 3.24, 95% CI 1.99-5.28). CONCLUSIONS: Our results are consistent with a low risk of severe COVID-19 disease among pregnant women and with rare adverse perinatal outcomes. During the Alpha-variant period there was a significant increase in severe COVID-19 illness. Further research is needed to describe the impact of different SARS-CoV-2 viral strains on maternal and perinatal outcomes.
The computational and energy cost of simulation and storage for climate science: lessons from CMIP6
The Coupled Model Intercomparison Project (CMIP) is one of the biggest international efforts aimed at better understanding the past, present, and future of climate change in a multi-model context. A total of 21 model intercomparison projects (MIPs) were endorsed in its sixth phase (CMIP6), which included 190 different experiments that were used to simulate 40 000 years and produced around 40 PB of data in total. This paper presents the main findings obtained from the CPMIP (the Computational Performance Model Intercomparison Project), a collection of a common set of metrics specifically designed for assessing climate model performance. These metrics were collected exclusively from the production runs of experiments used in CMIP6, primarily from institutions within the IS-ENES3 consortium. The document presents the full set of CPMIP metrics per institution and experiment, including a detailed analysis and discussion of each of the measurements. During the analysis, we found a positive correlation between the core hours needed, the complexity of the models, and the resolution used. Likewise, we show that between 5 % and 15 % of the execution cost is spent in the coupling between independent components, and that this fraction only grows as the number of resources increases. From the data, it is clear that queue times have a great impact on the actual speed achieved and vary widely across institutions, imposing anywhere from no execution overhead up to 78 %. Furthermore, our evaluation shows that the estimated carbon footprint of running such big simulations within the IS-ENES3 consortium is 1692 t of CO2 equivalent.
As a result of the collection, we contribute to the creation of a comprehensive database for future community reference, establishing a benchmark for evaluation and facilitating the multi-model, multi-platform comparisons crucial for understanding climate modelling performance. Given the diverse range of applications, configurations, and hardware utilised, further work is required for the standardisation and formulation of general rules. The paper concludes with recommendations for future exercises aimed at addressing the encountered challenges, which will facilitate more collections of a similar nature.
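To make the throughput metrics concrete, the back-of-the-envelope sketch below computes CPMIP-style quantities (CHSY, SYPD and queue overhead) from made-up figures; none of the numbers are taken from the paper.

```python
# Back-of-the-envelope CPMIP-style throughput metrics with made-up
# numbers (none of these figures come from the paper).
core_hours    = 5.0e6    # total core hours consumed by a run
simulated_yrs = 165      # e.g. one historical experiment
wall_days     = 30       # wall-clock days from submission to completion
queue_days    = 6        # of which: time spent waiting in the queue

chsy  = core_hours / simulated_yrs                 # core hours per simulated year
sypd  = simulated_yrs / (wall_days - queue_days)   # raw simulated years per day
asypd = simulated_yrs / wall_days                  # actual SYPD, queue included

print(f"CHSY = {chsy:,.0f}  SYPD = {sypd:.1f}  ASYPD = {asypd:.1f}  "
      f"queue overhead = {queue_days / wall_days:.0%}")
```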
The UHECR dipole and quadrupole in the latest data from the original Auger and TA surface detectors
The sources of ultra-high-energy cosmic rays are still unknown, but assuming standard physics, they are expected to lie within a few hundred megaparsecs from us. Indeed, over cosmological distances cosmic rays lose energy to interactions with background photons, at a rate depending on their mass number and energy and on the properties of photonuclear interactions and photon backgrounds. The universe is not homogeneous at such scales; hence, the distribution of the arrival directions of cosmic rays is expected to reflect the inhomogeneities in the distribution of galaxies: the shorter the energy loss lengths, the stronger the expected anisotropies. Galactic and intergalactic magnetic fields can blur and distort the picture, but the magnitudes of the largest-scale anisotropies, namely the dipole and quadrupole moments, are the most robust to their effects. Measuring them without bias, regardless of any higher-order multipoles, is not possible without full-sky coverage. In this work, we achieve this in three energy ranges (approximately 8–16 EeV, 16–32 EeV, and 32–∞ EeV) by combining surface-detector data collected at the Pierre Auger Observatory until 2020 and at the Telescope Array (TA) until 2019, before the completion of the upgrades of the arrays with new scintillator detectors. We find that the full-sky coverage achieved by combining Auger and TA data halves the uncertainties on the north-south components of the dipole and quadrupole compared with Auger-only results.
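The first-moment idea behind a full-sky dipole measurement can be sketched as follows; the estimator below assumes uniform exposure and omits the direction-dependent Auger/TA exposure weighting used in the actual analysis.

```python
import numpy as np

def dipole_estimate(ra_deg, dec_deg):
    """First-moment dipole estimator for a full-sky, uniformly exposed
    event sample: for a flux proportional to (1 + d.n), the mean arrival
    unit vector has expectation d/3, so d is estimated as 3<n>. The real
    Auger+TA analysis additionally weights events by each array's
    direction-dependent exposure, which is omitted here."""
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    n = np.stack([np.cos(dec) * np.cos(ra),
                  np.cos(dec) * np.sin(ra),
                  np.sin(dec)], axis=1)      # unit vectors, shape (N, 3)
    d = 3.0 * n.mean(axis=0)                 # dipole vector estimate
    return d, np.linalg.norm(d)              # components and amplitude
```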
Searches for Ultra-High-Energy Photons at the Pierre Auger Observatory
The Pierre Auger Observatory, being the largest air-shower experiment in the world, offers an unprecedented exposure to neutral particles at the highest energies. Since the start of data taking more than 18 years ago, various searches for ultra-high-energy (UHE) photons have been performed: either for a diffuse flux of UHE photons, for point sources of UHE photons, or for UHE photons associated with transient events such as gravitational wave events. In the present paper, we summarize these searches and review the current results obtained using the wealth of data collected by the Pierre Auger Observatory.
Comment: Review article accepted for publication in Universe (special issue on ultra-high-energy photons).
