42 research outputs found

    Establishment and evaluation of a model for clinical feature selection and prediction in gout patients with cardiovascular diseases: a retrospective cohort study

    Background: Gout is a chronic inflammatory condition increasingly recognized as a risk factor for cardiovascular events (CVE). Early identification of high-risk individuals is crucial for targeted prevention and management. However, conventional risk stratification approaches often fall short in accuracy and clinical utility. This study aimed to develop and validate a robust, interpretable machine learning (ML)-based model for predicting CVE in patients with gout. Methods: This retrospective cohort study included 686 hospitalized gout patients at Xiyuan Hospital (Beijing, China) between January 1, 2013, and December 31, 2023. We applied the Synthetic Minority Oversampling Technique (SMOTE) combined with random undersampling of the majority class. Patients were then randomly divided into training (70%) and testing (30%) sets. A comprehensive set of clinical and biochemical variables (n = 39) was collected. Feature selection was performed using the Boruta algorithm and Lasso to identify the most predictive variables. Multiple ML algorithms (Decision Tree, LightGBM, K-Nearest Neighbors, CatBoost, and Gradient Boosting Decision Tree learners) were implemented to construct predictive models. SHAP (SHapley Additive exPlanations) values were used to assess model interpretability, and robustness was evaluated through 10-fold bootstrap resampling with enhanced standard error estimation. Results: Of the 686 patients, 263 experienced cardiovascular events during follow-up (incidence rate: 38.3%). A logistic regression model was constructed based on eight variables selected using the Boruta feature selection algorithm: sex, age, PLT, EOS, LYM, CO2, GLU and APO-B. Among the five models evaluated, the CatBoost classifier achieved the best performance, with the highest area under the ROC curve (AUC) of 0.976 and a recall of 0.971. Furthermore, SHAP values were employed to provide both global and individual-level interpretability of the CatBoost model. To assess the model’s generalization performance, bootstrap resampling was performed 10 times. Based on these results, the standard error was improved using machine learning-based enhancement methods, thereby optimizing the model’s robustness and predictive stability. Conclusion: The logistic regression analysis revealed that age (OR=1.351, p<0.001), CO2 (OR=0.603, p=0.004), eosinophil count (OR=2.128, p=0.001), and platelet count (OR=0.961, p<0.001) were significantly associated with the outcome, indicating their potential roles as independent predictors. Notably, while APO-B (p=0.138) and sex (p=0.132) showed no significant association, glucose levels (OR=2.1, p=0.066) exhibited a marginal trend toward significance, warranting further investigation. This tool may support clinicians in identifying high-risk individuals, enabling early interventions and optimized management strategies. Limitations: This study has several limitations. First, the analysis was based on a single-center dataset, which may limit the generalizability of the findings. External validation in multi-center and prospective cohorts, along with an expanded sample size, is warranted to confirm these results. Second, key confounding factors such as medication use, lifestyle habits, and gout flare frequency were not included in the analysis; future studies should incorporate these variables to provide a more comprehensive assessment.
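
The bootstrap step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the resampled metric is AUC, uses a plain rank-based AUC instead of any particular library, and runs on made-up toy labels and scores. The function names `auc` and `bootstrap_auc_se` are hypothetical.

```python
import random
import statistics


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def bootstrap_auc_se(labels, scores, n_boot=10, seed=0):
    """Resample cases with replacement n_boot times; return mean AUC and its spread."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # this resample lost one class, so AUC is undefined; draw again
        estimates.append(auc(ys, [scores[i] for i in idx]))
    # The standard deviation of the bootstrap estimates approximates the standard error.
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

point_auc = auc(labels, scores)
mean_auc, se_auc = bootstrap_auc_se(labels, scores, n_boot=10, seed=0)
print(f"AUC on full sample: {point_auc:.4f}")
print(f"bootstrap mean AUC: {mean_auc:.4f}, estimated SE: {se_auc:.4f}")
```

With only 10 resamples, as in the abstract, the standard error estimate is itself noisy; a larger `n_boot` would usually be chosen when compute allows.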

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
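
To put the quoted figures in perspective, here is a rough back-of-the-envelope calculation. It uses only the approximate numbers stated in the abstract and assumes, purely for illustration, that errors are spread uniformly and that the 341 gaps divide the sequence into 342 equal stretches, which the real assembly does not do.

```python
# Figures quoted in the abstract (both are approximate in the source).
GENOME_BASES = 2_850_000_000   # 2.85 billion nucleotides in Build 35
BASES_PER_ERROR = 100_000      # roughly 1 error event per 100,000 bases
GAPS = 341

# Expected number of error events across the whole sequence.
expected_errors = GENOME_BASES / BASES_PER_ERROR

# Illustrative assumption: 341 gaps split the sequence into 342 equal segments.
mean_segment_bases = GENOME_BASES / (GAPS + 1)

print(f"expected error events: about {expected_errors:,.0f}")
print(f"mean gap-free stretch: about {mean_segment_bases / 1e6:.2f} Mb")
```

Under these assumptions the sequence would carry on the order of 28,500 residual errors, with gap-free stretches averaging a little over 8 Mb.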

    Integrating sugarcane molasses into sequential cellulosic biofuel production based on SSF process of high solid loading

    Background: Sugarcane bagasse (SCB) is one of the most promising lignocellulosic biomasses for use in the production of biofuels. However, bioethanol production from pure SCB fermentation is still limited by its high process cost and low fermentation efficiency. Sugarcane molasses, as a carbohydrate-rich biomass, can provide fermentable sugars for ethanol production. Herein, to reduce high processing costs, molasses was integrated into lignocellulosic ethanol production in batch modes to improve the fermentation system and to boost the final ethanol concentration and yield. Results: The co-fermentation of pretreated SCB and molasses at ratios of 3:1 (mixture A) and 1:1 (mixture B) was conducted at solid loadings of 12% to 32%, and the fermentation of pretreated SCB alone at the same solid loadings was also compared. At a solid loading of 32%, ethanol concentrations of 64.10 g/L, 74.69 g/L, and 75.64 g/L were obtained from pure SCB, mixture A, and mixture B, respectively. To further boost the ethanol concentration, the fermentation of mixture B (1:1) at higher solid loadings of 36% to 48% was also implemented. The highest ethanol concentration of 94.20 g/L was generated at a solid loading of 44%, with an ethanol yield of 72.37%. In addition, after evaporation, the wastewater could be converted to biogas by anaerobic digestion. A final methane production of 312.14 mL/g volatile solids (VS) was obtained, and the final chemical oxygen demand removal and VS degradation efficiencies were 85.9% and 95.9%, respectively. Conclusions: Molasses could provide a good environment for the growth of yeast and inoculum. Integrating sugarcane molasses into sequential cellulosic biofuel production could improve the utilization of biomass resources.
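
The size of the improvement is easier to judge as a relative gain. The short calculation below uses only the ethanol concentrations reported in the abstract; the variable and function names are hypothetical, and the percentages are simple ratios of the reported values rather than results stated by the authors.

```python
# Ethanol concentrations (g/L) reported at 32% solid loading.
ethanol_at_32 = {
    "pure SCB": 64.10,
    "mixture A (3:1)": 74.69,
    "mixture B (1:1)": 75.64,
}
# Highest reported concentration: mixture B at 44% solid loading.
ETHANOL_B_AT_44 = 94.20


def relative_gain(value, baseline):
    """Percentage increase of value over baseline."""
    return (value - baseline) / baseline * 100.0


baseline = ethanol_at_32["pure SCB"]
gains = {
    name: relative_gain(conc, baseline)
    for name, conc in ethanol_at_32.items()
    if name != "pure SCB"
}
high_solid_gain = relative_gain(ETHANOL_B_AT_44, ethanol_at_32["mixture B (1:1)"])

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}% more ethanol than pure SCB at 32% loading")
print(f"mixture B, 44% vs 32% loading: {high_solid_gain:.1f}% more ethanol")
```

On these figures, adding molasses raises the ethanol concentration by roughly 16 to 18% at the same solid loading, and raising the loading of mixture B from 32% to 44% adds roughly a further 25%.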

    Effects of Metal Chloride Salt Pretreatment and Additives on Enzymatic Hydrolysis of Poplar

    Metal chloride salt pretreatment was performed to isolate cellulose from poplar and convert it to glucose. A glucose yield of 82.0 ± 0.7% was achieved after pretreatment with 0.05 mol/L AlCl3 at 180 °C for 20 min, which was attributed to the removal of hemicellulose, changes in crystallinity and surface morphology, and the retention of most of the cellulose. The influence of different additives on glucose yield was then assessed; the highest glucose yield, 88.5 ± 0.06%, was obtained with the addition of PEG 8000. Meanwhile, a similar glucose yield of 82.8 ± 0.3% could be obtained with PEG 8000 when hydrolysis time was reduced by a quarter and enzyme dosage by three-quarters. These results indicate that AlCl3 pretreatment is a viable and efficient pretreatment method for poplar, while the addition of PEG 8000 can enhance enzymatic efficiency and reduce cellulase loading. This effect was attributed to the preservation of free enzyme and enzyme activity in the supernatant and to the reduction in surface tension, which suggests a way to improve the economics of the enzymatic conversion of poplar.