366 research outputs found

    Computational algorithms to predict Gene Ontology annotations

    Background: Gene function annotations, which are associations between a gene and a term of a controlled vocabulary describing gene functional features, are of paramount importance in modern biology. Datasets of these annotations, such as the ones provided by the Gene Ontology Consortium, are used to design novel biological experiments and interpret their results. Despite their importance, these sources of information have some known issues. They are incomplete, since biological knowledge is far from definitive and evolves rapidly, and they may contain some erroneous annotations. Since curating novel annotations is costly in both money and time, computational tools that can reliably predict likely annotations, and thus speed up the discovery of new gene annotations, are very useful. Methods: We used a set of computational algorithms and weighting schemes to infer novel gene annotations from a set of known ones. We used the latent semantic analysis approach, implementing two popular algorithms (Latent Semantic Indexing and Probabilistic Latent Semantic Analysis), and we propose a novel method, the Semantic IMproved Latent Semantic Analysis, which adds a clustering step on the set of considered genes. Furthermore, we propose improving these algorithms by weighting the annotations in the input set. Results: We tested our methods and their weighted variants on the Gene Ontology annotation sets of three model organisms (Bos taurus, Danio rerio and Drosophila melanogaster). The methods proved able to predict novel gene annotations, and the weighting procedures led to a valuable improvement, although the results vary with the size of the input annotation set and the algorithm considered. Conclusions: Of the three methods considered, the Semantic IMproved Latent Semantic Analysis provides the best results. In particular, when coupled with a proper weighting policy, it predicts a significant number of novel annotations, proving to be a helpful tool for supporting scientists in the curation of gene functional annotations.
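As a rough illustration of the Latent Semantic Indexing idea mentioned in this abstract, the sketch below applies a truncated SVD to a small gene-by-term annotation matrix and flags missing annotations whose reconstructed score is high. It is not the authors' implementation: the function name `lsi_predict`, the rank `k`, the `threshold` value, and the toy matrix are all assumptions made for this example, and the weighting schemes and clustering step described in the abstract are not included.

```python
import numpy as np

def lsi_predict(annotations, k, threshold=0.5):
    """Suggest new (gene, term) pairs from a rank-k SVD reconstruction.

    annotations: 2-D 0/1 array, rows = genes, columns = ontology terms.
    k and threshold are illustrative tuning parameters (assumptions).
    """
    A = np.asarray(annotations, dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular values (truncated SVD).
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # A candidate is a currently missing annotation with a high reconstructed score.
    candidates = (A_k >= threshold) & (A == 0)
    return [(int(g), int(t)) for g, t in zip(*np.nonzero(candidates))]

# Toy data (invented for illustration): 3 genes x 3 terms; gene 2 lacks term 2.
toy = [[1, 1, 1],
       [1, 1, 1],
       [1, 1, 0]]
predicted = lsi_predict(toy, k=1)
print(predicted)
```

On this toy matrix the rank-1 reconstruction scores the single missing entry at about 0.58, so it is returned as a candidate annotation; with `k=2` the matrix is reconstructed exactly and nothing is suggested. This shows why the choice of rank matters in such a scheme.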

    Bridging the Flexibility Concepts in the Buildings and Multi-energy Domains

    This paper aims to stimulate a discussion on how to bridge the concept of flexibility used in power and energy systems and the flexibility that buildings can offer for providing services to the electrical system. The paper recalls the main concepts and approaches considered in power systems and multi-energy systems, and summarises some aspects of flexibility in buildings. The overview shows that there is room to strengthen the contacts among the scientists working in these fields. The common aim is to identify the complementary aspects and provide inputs that enhance the methodologies and models needed to enable and support an effective energy and ecological transition.

    The Venus score for the assessment of the quality and trustworthiness of biomedical datasets

    Biomedical datasets are the mainstays of computational biology and health informatics projects, and can be found on multiple data platforms online or obtained from wet-lab biologists and physicians. The quality and the trustworthiness of these datasets, however, can sometimes be poor, producing bad results in turn, which can harm patients and data subjects. To address this problem, policy-makers, researchers, and consortia have proposed diverse regulations, guidelines, and scores to assess the quality and increase the reliability of datasets. Although generally useful, they are often incomplete and impractical. The guidelines of Datasheets for Datasets, in particular, are too numerous; the requirements of the Kaggle Dataset Usability Score focus on non-scientific requisites (for example, including a cover image); and the European Union Artificial Intelligence Act (EU AI Act) sets forth sparse and general data governance requirements, which we tailored to datasets for biomedical AI. Against this backdrop, we introduce our new Venus score to assess the data quality and trustworthiness of biomedical datasets. Our score ranges from 0 to 10 and consists of ten questions that anyone developing a bioinformatics, medical informatics, or cheminformatics dataset should answer before the release. In this study, we first describe the EU AI Act, Datasheets for Datasets, and the Kaggle Dataset Usability Score, presenting their requirements and their drawbacks. To do so, we reverse-engineer the weights of the influential Kaggle Score for the first time and report them in this study. We distill the most important data governance requirements into ten questions tailored to the biomedical domain, comprising the Venus score. We apply the Venus score to twelve datasets from multiple subdomains, including electronic health records, medical imaging, microarray and bulk RNA-seq gene expression, cheminformatics, physiologic electrogram signals, and medical text. Analyzing the results, we surface fine-grained strengths and weaknesses of popular datasets, as well as aggregate trends. Most notably, we find a widespread tendency to gloss over sources of data inaccuracy and noise, which may hinder the reliable exploitation of data and, consequently, research results. Overall, our results confirm the applicability and utility of the Venus score to assess the trustworthiness of biomedical data.

    Thermal energy storage for grid applications: Current status and emerging trends

    Thermal energy systems (TES) contribute to the ongoing process that leads to higher integration among different energy systems, with the aim of reaching a cleaner, more flexible and sustainable use of energy resources. This paper reviews the current literature on the development and exploitation of TES-based solutions in systems connected to the electrical grid. These solutions facilitate energy system integration to gain additional flexibility for energy management, enable better use of variable renewable energy sources (RES), and contribute to the modernisation of energy system infrastructures, the enhancement of grid operation practices that include energy shifting, and the provision of cost-effective grid services. This paper offers a complementary view with respect to other reviews that deal with energy storage technologies, materials for TES applications, TES for buildings, and contributions of electrical energy storage for grid applications. The main aspects addressed are the characteristics, parameters and models of TES systems, the deployment of TES in systems with variable RES, microgrids, and multi-energy networks, and the emerging trends for TES applications.

    Hybrid (Gas and Geothermal) Greenhouse Simulations Aimed at Optimizing Investment and Operative Costs: A Case Study in NW Italy

    Greenhouses are generally energy-intensive, with energy sometimes accounting for 50% of the cost of greenhouse production. Geothermal energy plays a very important role in maintaining the desired temperature and reducing energy consumption. This work deals with a project for a hybrid heating plant (97% geothermal energy and 3% gas-condensing boiler) for the innovative Plant Phenotyping Greenhouse at the University Campus in Grugliasco (a few km west of the city of Turin). The aim of the study is to demonstrate the energy efficiency of this kind of hybrid plant as well as its economic sustainability. Numerical simulations of a GRT were used to calibrate the system and verify that the software reasonably modeled the real case. They helped to correctly size the geothermal plant, also providing data about the thermal energy storage and production during on and off plant cycles. The results show a thermal power of 50.92 kW over 120 days of plant operation, in line with the energy expected to be needed to meet the base load demand. Long-term results further indicate a negligible impact on the ground, with a thermal plume extending between 5 and 10 m from the plant and shrinking substantially within a few months after the plant is switched off.

    Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen

    The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca's large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. A total of 160 teams participated, providing comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationales for synergy predictions are identified, including ADAM17 inhibitor antagonism when combined with PIK3CB/D inhibition, in contrast to synergy when combined with other PI3K-pathway inhibitors in PIK3CA mutant cells.

    The Benefits of the Matthews Correlation Coefficient (MCC) Over the Diagnostic Odds Ratio (DOR) in Binary Classification Assessment

    To assess the quality of a binary classification, researchers often use a four-entry contingency table called a confusion matrix, containing true positives, true negatives, false positives, and false negatives. To summarize the four values of a confusion matrix in a single score, researchers and statisticians have developed several rates and metrics. Several earlier scientific studies have shown why the Matthews correlation coefficient (MCC) is more informative and trustworthy than confusion-entropy error, accuracy, F1 score, bookmaker informedness, markedness, and balanced accuracy. In this study, we compare the MCC with the diagnostic odds ratio (DOR), a statistical rate sometimes employed in biomedical sciences. After examining the properties of the MCC and of the DOR, we describe the relationships between them, also taking advantage of an innovative geometrical plot called the confusion tetrahedron, presented here for the first time. We then report some use cases where the MCC and the DOR produce discordant outcomes, and explain why the Matthews correlation coefficient is the more informative and reliable of the two. Our results can have a strong impact in computer science and statistics, because they clearly explain why the information provided by the Matthews correlation coefficient is more trustworthy than that generated by the diagnostic odds ratio.
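To make the comparison concrete, here is a minimal sketch that computes both scores from the four confusion-matrix entries using their standard definitions, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)) and DOR = (TP·TN) / (FP·FN). The function names, the handling of zero denominators, and the example counts are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, +1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom

def dor(tp, tn, fp, fn):
    """Diagnostic odds ratio, in [0, +inf)."""
    if fp * fn == 0:
        # Assumed convention: infinite when there are no errors of one kind
        # but some correct predictions, undefined (NaN) otherwise.
        return math.inf if tp * tn > 0 else math.nan
    return (tp * tn) / (fp * fn)

# Invented imbalanced example: 50 positives, 950 negatives,
# and a classifier that finds only 1 of the 50 positives.
counts = dict(tp=1, tn=949, fp=1, fn=49)
print(f"MCC = {mcc(**counts):.3f}")
print(f"DOR = {dor(**counts):.2f}")
```

With these invented counts the DOR is roughly 19 while the MCC stays below 0.1, which is one way the two scores can point in different directions on imbalanced data.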

    Sucrose intake during pregnancy and lactation: effects on lipid metabolism in adult offspring

    Nutritional quality during the early stages of life strongly influences the development of chronic diseases in adulthood. Objective: to examine the effect of a sucrose-rich diet (SRD) fed to dams during pregnancy and lactation on the lipid metabolism of their adult offspring, which were fed a control diet (CD) or an SRD from weaning to 21 weeks of life (groups: CD-CD, CD-SRD, SRD-SRD and SRD-CD). Results: final body weight was similar among the groups, although adiposity and plasma lipids were significantly higher in CD-SRD, SRD-SRD and SRD-CD than in CD-CD. The dyslipidemia resulted from an increased VLDL-Tg secretion rate and an elevated liver triglyceride pool, associated with higher activities of the hepatic lipogenic enzymes acetyl-CoA carboxylase and malic enzyme (p<0.05 vs CD-CD) in rats exposed to SRD at any period of life. Hepatic glucose-6-P-dehydrogenase activity, by contrast, was significantly higher (p<0.05) only in CD-SRD and SRD-SRD compared with CD-CD and SRD-CD. Conclusion: early-life exposure to an SRD is associated with unfavorable changes in lipid metabolism in adult life, regardless of whether the offspring consumed a CD or an SRD after weaning.
    Affiliation: D'Alessandro, M. E. Universidad Nacional del Litoral. Facultad de Bioquímica y Ciencias Biológicas. Departamento de Ciencias Biológicas; Argentina
    Affiliation: Rojido, M. Universidad Nacional del Litoral. Facultad de Bioquímica y Ciencias Biológicas. Departamento de Ciencias Biológicas; Argentina
    Affiliation: Chicco, Adriana Graciela. Universidad Nacional del Litoral. Facultad de Bioquímica y Ciencias Biológicas. Departamento de Ciencias Biológicas; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe; Argentina