16 research outputs found

    A Federated Database for Obesity Research: An IMI-SOPHIA Study.

    Obesity is considered by many as a lifestyle choice rather than a chronic progressive disease. The Innovative Medicines Initiative (IMI) SOPHIA (Stratification of Obesity Phenotypes to Optimize Future Obesity Therapy) project is part of a momentum shift aiming to provide better tools for the stratification of people with obesity according to disease risk and treatment response. One of the challenges to achieving these goals is that many clinical cohorts are siloed, limiting the potential of combined data for biomarker discovery. In SOPHIA, we have addressed this challenge by setting up a federated database building on open-source DataSHIELD technology. The database currently federates 16 cohorts that are accessible via a central gateway. The database is multi-modal, including research studies, clinical trials, and routine health data, and is accessed using the R statistical programming environment, where statistical and machine learning analyses can be performed at a distance without any disclosure of patient-level data. We demonstrate the use of the database with a proof-of-concept analysis: a federated linear model of BMI and systolic blood pressure that virtually pools all data from the 16 studies without any analyst seeing individual patient-level data. This analysis yielded point estimates similar to those from a meta-analysis of the 16 individual studies. Our approach provides a benchmark for reproducible, safe federated analyses across multiple study types provided by multiple stakeholders.
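
    The idea of fitting a linear model across sites without moving patient-level data can be illustrated with a minimal sketch. It is not the SOPHIA or DataSHIELD implementation; it assumes a simple one-predictor model (systolic blood pressure on BMI) in which each site returns only five aggregate numbers, and the function names `site_summary` and `fit_from_summaries` and all data are hypothetical.

```python
import random

def site_summary(bmi, sbp):
    """Run locally at one site: return aggregates only, never individual rows."""
    n = len(bmi)
    return (
        n,
        sum(bmi),
        sum(sbp),
        sum(x * x for x in bmi),
        sum(x * y for x, y in zip(bmi, sbp)),
    )

def fit_from_summaries(summaries):
    """Run centrally: add up the per-site aggregates and solve for the line."""
    n, sx, sy, sxx, sxy = (sum(col) for col in zip(*summaries))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope

# Synthetic data for illustration only (not values from the study).
random.seed(0)
sites = []
for n_patients in (200, 350, 500):
    bmi = [random.gauss(30, 5) for _ in range(n_patients)]
    sbp = [100 + 1.0 * x + random.gauss(0, 10) for x in bmi]
    sites.append((bmi, sbp))

# Federated fit: only the five-number summaries cross site boundaries.
federated = fit_from_summaries([site_summary(b, s) for b, s in sites])

# Reference fit on the stacked data, possible here only because it is synthetic.
all_bmi = [x for b, _ in sites for x in b]
all_sbp = [y for _, s in sites for y in s]
pooled = fit_from_summaries([site_summary(all_bmi, all_sbp)])
```

    Because least squares depends on the data only through these sums, the federated and pooled fits agree up to floating-point rounding. A meta-analysis of separate per-site fits is a different estimator, which is consistent with the abstract reporting similar rather than identical point estimates.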

    Sensor combination and chemometric variable selection for online monitoring of Streptomyces coelicolor fed-batch cultivations

    Fed-batch cultivations of Streptomyces coelicolor, producing the antibiotic actinorhodin, were monitored online by multiwavelength fluorescence spectroscopy and off-gas analysis. Partial least squares (PLS), locally weighted regression, and multilinear PLS (N-PLS) models were built for prediction of biomass and substrate (casamino acids) concentrations. The effect of combining fluorescence and gas analyzer data, as well as of different variable selection methods, was investigated. Improved prediction models were obtained by combining data from the two sensors and by variable selection using a genetic algorithm, interval PLS, and the principal variables method. A stepwise variable elimination method was applied to the three-way fluorescence data, resulting in simpler and more accurate N-PLS models. The prediction models were validated using leave-one-batch-out cross-validation, and the best models had root mean square error of cross-validation values of 1.02 g l⁻¹ biomass and 0.8 g l⁻¹ total amino acids, respectively. The fluorescence data were also explored by parallel factor analysis. The analysis revealed four spectral profiles present in the fluorescence data, three of which were identified as pyridoxine, NAD(P)H, and flavin nucleotides, respectively.
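
    The validation scheme named in the abstract, leave-one-batch-out cross-validation with a root mean square error of cross-validation (RMSECV), can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a one-variable least-squares line stands in for the PLS and N-PLS models, and the function names `fit_line` and `rmsecv_leave_one_batch_out` are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Stand-in model (assumption): ordinary least-squares line, not PLS."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmsecv_leave_one_batch_out(batches):
    """batches: list of (xs, ys) pairs, one pair per cultivation batch.

    Each batch is held out once; the model is fitted on all other batches
    and used to predict the held-out batch. Squared errors are accumulated
    over every held-out sample before taking the root mean.
    """
    squared_error = 0.0
    count = 0
    for held_out in range(len(batches)):
        train_x = [x for i, (xs, _) in enumerate(batches) if i != held_out for x in xs]
        train_y = [y for i, (_, ys) in enumerate(batches) if i != held_out for y in ys]
        intercept, slope = fit_line(train_x, train_y)
        test_x, test_y = batches[held_out]
        for x, y in zip(test_x, test_y):
            squared_error += (y - (intercept + slope * x)) ** 2
            count += 1
    return math.sqrt(squared_error / count)
```

    Holding out whole batches, rather than individual samples, keeps correlated measurements from the same cultivation out of the training set, so the error estimate reflects prediction on an unseen batch.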