514 research outputs found

    AFQN: approximate Qn estimation in data streams

    We present afqn (Approximate Fast Qn), a novel algorithm for approximate computation of the Qn scale estimator in a streaming setting, in the sliding window model. It is well-known that computing the Qn estimator exactly may be too costly for some applications, and the problem is a fortiori exacerbated in the streaming setting, in which the time available to process incoming data stream items is short. In this paper we show how to efficiently and accurately approximate the Qn estimator. As an application, we show the use of afqn for fast detection of outliers in data streams. In particular, the outliers are detected in the sliding window model, with a simple check based on the Qn scale estimator. Extensive experimental results on synthetic and real datasets confirm the validity of our approach by showing up to three times faster updates per second. Our contributions are the following ones: (i) to the best of our knowledge, we present the first approximation algorithm for online computation of the Qn scale estimator in a streaming setting and in the sliding window model; (ii) we show how to take advantage of our UDDSketch algorithm for quantile estimation in order to quickly compute the Qn scale estimator; (iii) as an example of a possible application of the Qn scale estimator, we discuss how to detect outliers in an input data stream

    Fast online computation of the Qn estimator with applications to the detection of outliers in data streams

    We present FQN (Fast Qn), a novel algorithm for online computation of the Qn scale estimator. The algorithm works in the sliding window model, computing the Qn scale estimator in the current window. We thoroughly compare our algorithm for online Qn with the state-of-the-art competing algorithm by Nunkesser et al., and show that FQN (i) is faster, requiring only O(s) time in the worst case, where s is the length of the window; (ii) has a computational complexity that does not depend on the input distribution; and (iii) requires less space. To the best of our knowledge, our algorithm is the first that allows online computation of the Qn scale estimator in worst-case time linear in the size of the window. As an example of a possible application, besides its use as a robust measure of statistical dispersion, we show how to use the Qn estimator for fast detection of outliers in data streams. Extensive experimental results on both synthetic and real datasets confirm the validity of our approach
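
    To make the quantity in the two abstracts above concrete, here is a minimal sketch of the Qn scale estimator and a sliding-window outlier check. It is a naive O(n^2) baseline under stated assumptions, not the FQN or AFQN algorithm: it uses the standard definition of Qn (the k-th smallest pairwise absolute difference, with h = floor(n/2) + 1 and k = h(h-1)/2, scaled by the Gaussian consistency factor of about 2.2191). The function names, the median-centred test, the window size, and the threshold of 3 are illustrative assumptions and do not come from the papers.

```python
from collections import deque
from statistics import median

# Asymptotic consistency factor of Qn for Gaussian data.
QN_CONSISTENCY = 2.2191


def qn_scale(values):
    """Naive O(n^2) Qn: scaled k-th smallest pairwise absolute difference."""
    n = len(values)
    if n < 2:
        raise ValueError("Qn needs at least two observations")
    h = n // 2 + 1
    k = h * (h - 1) // 2  # number of pairs among h points
    diffs = sorted(
        abs(values[i] - values[j]) for i in range(n) for j in range(i + 1, n)
    )
    return QN_CONSISTENCY * diffs[k - 1]


def flag_outliers(stream, window_size=5, threshold=3.0):
    """Flag items far from the window median, measured in Qn units.

    Hypothetical check for illustration: the window is recomputed from
    scratch for every item, which is exactly the cost that the algorithms
    described above are designed to avoid.
    """
    window = deque(maxlen=window_size)
    flagged = []
    for x in stream:
        if len(window) == window_size:
            scale = qn_scale(list(window))
            if scale > 0 and abs(x - median(window)) / scale > threshold:
                flagged.append(x)
        window.append(x)  # flagged items still enter the window
    return flagged
```

    With a full window of s items this baseline sorts roughly s^2/2 differences per arriving item, which is the per-item cost that motivates the O(s) update claimed for FQN.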

    A decision making procedure for robust train rescheduling based on mixed integer linear programming and Data Envelopment Analysis

    This paper presents a self-learning decision making procedure for robust real-time train rescheduling in case of disturbances. The procedure is applicable to aperiodic timetables of mixed-tracked networks and it consists of three steps. The first two are executed in real-time and provide the rescheduled timetable, while the third one is executed offline and guarantees the self-learning part of the method. In particular, in the first step, a robust timetable is determined, which is valid for a finite time horizon. This robust timetable is obtained by solving a mixed integer linear programming problem aimed at finding the optimal compromise between two objectives: the minimization of the delays of the trains and the maximization of the robustness of the timetable. In the second step, a merging procedure is first used to join the obtained timetable with the nominal one. Then, a heuristic is applied to identify and solve all conflicts that may arise after the merging procedure. Finally, in the third step an offline cross-efficiency fuzzy Data Envelopment Analysis technique is applied to evaluate the efficiency of the rescheduled timetable in terms of delay minimization and robustness maximization when different relevance weights (defining the compromise between the two optimization objectives) are used in the first step. The procedure is thus able to determine appropriate relevance weights to employ when disturbances of the same type affect the network again. The railway service provider can take advantage of this procedure to automate, optimize, and expedite the rescheduling process. Moreover, thanks to the self-learning capability of the procedure, the quality of the rescheduling is improved at each reapplication of the method. The technique is applied to a real data set related to a regional railway network in Southern Italy to test its effectiveness

    Just in Time Transformers

    Precise energy load forecasting in residential households is crucial for mitigating carbon emissions and enhancing energy efficiency; indeed, accurate forecasting enables utility companies and policymakers, who advocate sustainable energy practices, to optimize resource utilization. Moreover, smart meters provide valuable information by allowing for granular insights into consumption patterns. Building upon available smart meter data, our study aims to cluster consumers into distinct groups according to their energy usage behaviours, effectively capturing a diverse spectrum of consumption patterns. Next, we design JITtrans (Just In Time transformer), a novel transformer deep learning model that significantly improves energy consumption forecasting accuracy, with respect to traditional forecasting methods. Extensive experimental results validate our claims using proprietary smart meter data. Our findings highlight the potential of advanced predictive technologies to revolutionize energy management and advance sustainable power systems: the development of efficient and eco-friendly energy solutions critically depends on such technologies

    High Throughput Protein Similarity Searches in the LIBI Grid Problem Solving Environment

    Bioinformatics applications are naturally distributed, due to the distribution of the data sets, experimental data and biological databases involved. They require high computing power, owing to the large size of data sets and the complexity of basic computations; they may access heterogeneous data, where heterogeneity is in data format, access policy, distribution, etc.; and they require a secure infrastructure, because they could access private data owned by different organizations. The Problem Solving Environment (PSE) is an approach and a technology that can fulfil such bioinformatics requirements. The PSE can be used for the definition and composition of complex applications, hiding programming and configuration details from the user, who can concentrate only on the specific problem. Moreover, Grids can be used for building geographically distributed collaborative problem solving environments, and Grid-aware PSEs can search and use dispersed high performance computing, networking, and data resources. In this work, the PSE solution has been chosen as the integration platform of bioinformatics tools and data sources. In particular, a large-scale multiple sequence alignment experiment, supported by the LIBIPSE, is presented

    Data envelopment analysis in financial services: a citations network analysis of banks, insurance companies and money market funds

    The development and application of the data envelopment analysis (DEA) method have been the subject of numerous reviews. In this paper, we consider the papers that apply DEA methods specifically to financial services, or which use financial services data to experiment with a newly introduced DEA model. We examine 620 papers published in journals indexed in the Web of Science database, from 1985 to April 2016. We analyse the sample applying citations network analysis. This paper investigates the DEA method and its applications in financial services. We analyse the diffusion of DEA in three sub-samples: (1) banking groups, (2) money market funds, and (3) insurance groups by identifying the main paths, that is, the main flows of the ideas underlying each area of research. This allows us to highlight the main approaches, models and efficiency types used in each research area. No unique methodological preference emerges within these areas. Innovations in the DEA methodologies (network models, slacks based models, directional distance models and Nash bargaining game) clearly dominate recent research. For each subsample, we describe the geographical distribution of these studies, and provide some basic statistics related to the most active journals and scholars

    The PURPLE mystery: Semantic meaning of three purple terms in French speakers from Algeria, France, and Switzerland

    Studies on the colour category PURPLE yielded inconsistent category boundaries, focal colours, and colour-emotion associations. In French, there are at least three colour terms referring to the shades of purple, potentially weighing on these inconsistencies. Thus, we tested the semantic breadth and richness in semantic meaning of violet (basic term), lilas (non-basic), and pourpre (non-basic). We collected free associations in 274 French speakers from Algeria, France, and Switzerland, yielding 2,079 responses, of which 436 were discrete and 275 were unique. Frequency analyses and semantic coding supported the basicness status of violet in French, within a hierarchically structured semantic system. Moreover, the meaning of the three terms was not synonymous. Violet had the most abstract meaning. Lilas had the narrowest meaning, mainly referring to Natural Entities. Pourpre seemed close to RED. We found no differences between the countries. Future studies should extend this approach to other languages and other colour terms

    SARS-CoV-2 infection among hospitalised pregnant women and impact of different viral strains on COVID-19 severity in Italy: a national prospective population-based cohort study

    OBJECTIVE: The primary aim of this article was to describe SARS-CoV-2 infection among pregnant women during the wild-type and Alpha-variant periods in Italy. The secondary aim was to compare the impact of the virus variants on the severity of maternal and perinatal outcomes. DESIGN: National population-based prospective cohort study. SETTING: A total of 315 Italian maternity hospitals. SAMPLE: A cohort of 3306 women with SARS-CoV-2 infection confirmed within 7 days of hospital admission. METHODS: Cases were prospectively reported by trained clinicians for each participating maternity unit. Data were described by univariate and multivariate analyses. MAIN OUTCOME MEASURES: COVID-19 pneumonia, ventilatory support, intensive care unit (ICU) admission, mode of delivery, preterm birth, stillbirth, and maternal and neonatal mortality. RESULTS: We found that 64.3% of the cohort was asymptomatic, 12.8% developed COVID-19 pneumonia and 3.3% required ventilatory support and/or ICU admission. Maternal age of 30-34 years (OR 1.43, 95% CI 1.09-1.87) and ≥35 years (OR 1.62, 95% CI 1.23-2.13), citizenship of countries with high migration pressure (OR 1.75, 95% CI 1.36-2.25), previous comorbidities (OR 1.49, 95% CI 1.13-1.98) and obesity (OR 1.72, 95% CI 1.29-2.27) were all associated with a higher occurrence of pneumonia. The preterm birth rate was 11.1%. In comparison with the pre-pandemic period, stillbirths and maternal and neonatal deaths remained stable. The need for ventilatory support and/or ICU admission among women with pneumonia increased during the Alpha-variant period compared with the wild-type period (OR 3.24, 95% CI 1.99-5.28). CONCLUSIONS: Our results are consistent with a low risk of severe COVID-19 disease among pregnant women and with rare adverse perinatal outcomes. During the Alpha-variant period there was a significant increase of severe COVID-19 illness. Further research is needed to describe the impact of different SARS-CoV-2 viral strains on maternal and perinatal outcomes

    A comparative analysis of colour–emotion associations in 16–88‐year‐old adults from 31 countries

    As people age, they tend to spend more time indoors, and the colours in their surroundings may significantly impact their mood and overall well-being. However, there is a lack of empirical evidence to provide informed guidance on colour choices, irrespective of age group. To work towards informed choices, we investigated whether the associations between colours and emotions observed in younger individuals also apply to older adults. We recruited 7,393 participants, aged between 16 and 88 years and coming from 31 countries. Each participant associated 12 colour terms with 20 emotion concepts and rated the intensity of each associated emotion. Different age groups exhibited highly similar patterns of colour-emotion associations (average similarity coefficient of 0.97), with subtle yet meaningful age-related differences. Adolescents associated the greatest number but the least positively biased emotions with colours. Older participants associated a smaller number but more intense and more positive emotions with all colour terms, displaying a positivity effect. Age also predicted arousal and power biases, varying by colour. Findings suggest parallels in colour-emotion associations between younger and older adults, with subtle but significant age-related variations. Future studies should next assess whether colour-emotion associations reflect what people actually feel when exposed to colour

    The ancestral flower of angiosperms and its early diversification.

    Recent advances in molecular phylogenetics and a series of important palaeobotanical discoveries have revolutionized our understanding of angiosperm diversification. Yet, the origin and early evolution of their most characteristic feature, the flower, remains poorly understood. In particular, the structure of the ancestral flower of all living angiosperms is still uncertain. Here we report model-based reconstructions for ancestral flowers at the deepest nodes in the phylogeny of angiosperms, using the largest data set of floral traits ever assembled. We reconstruct the ancestral angiosperm flower as bisexual and radially symmetric, with more than two whorls of three separate perianth organs each (undifferentiated tepals), more than two whorls of three separate stamens each, and more than five spirally arranged separate carpels. Although uncertainty remains for some of the characters, our reconstruction allows us to propose a new plausible scenario for the early diversification of flowers, leading to new testable hypotheses for future research on angiosperms