    DeepLINK-T: deep learning inference for time series data using knockoffs and LSTM

    High-dimensional longitudinal time series data is prevalent across various real-world applications. Many such applications can be modeled as regression problems with high-dimensional time series covariates. Deep learning has been a popular and powerful tool for fitting these regression models. Yet, the development of interpretable and reproducible deep-learning models is challenging and remains underexplored. This study introduces a novel method, Deep Learning Inference using Knockoffs for Time series data (DeepLINK-T), focusing on the selection of significant time series variables in regression while controlling the false discovery rate (FDR) at a predetermined level. DeepLINK-T combines deep learning with knockoff inference to control FDR in feature selection for time series models, accommodating a wide variety of feature distributions. It addresses dependencies across time and features by leveraging a time-varying latent factor structure in time series covariates. Three key ingredients for DeepLINK-T are 1) a Long Short-Term Memory (LSTM) autoencoder for generating time series knockoff variables, 2) an LSTM prediction network using both original and knockoff variables, and 3) the application of the knockoffs framework for variable selection with FDR control. Extensive simulation studies have been conducted to evaluate DeepLINK-T's performance, showing its capability to control FDR effectively while demonstrating superior feature selection power for high-dimensional longitudinal time series data compared to its non-time series counterpart. DeepLINK-T is further applied to three metagenomic data sets, validating its practical utility and effectiveness, and underscoring its potential in real-world applications
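
The third ingredient, knockoff-based selection with FDR control, can be illustrated with a minimal sketch. It assumes the standard knockoff+ threshold rule and takes as given a list of feature statistics `w`, where a large positive value means the original variable looks more important than its knockoff. The function name, the input format, and the example numbers are hypothetical; in DeepLINK-T the statistics would come from the LSTM prediction network, which is not reproduced here.

```python
def knockoff_plus_select(w, q):
    """Return indices selected by the knockoff+ rule at target FDR level q.

    w : list of knockoff statistics, one per feature (hypothetical input).
    q : target false discovery rate, e.g. 0.2.
    """
    # Candidate thresholds are the distinct non-zero magnitudes of w.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        # Estimated false discovery proportion at threshold t.
        if (1 + negatives) / max(1, positives) <= q:
            return [j for j, x in enumerate(w) if x >= t]
    # No threshold meets the target level: select nothing.
    return []


if __name__ == "__main__":
    stats = [5, 4, 3, 2.5, 2, -1, 0.5, -0.3]  # illustrative values only
    print(knockoff_plus_select(stats, 0.25))
```

The smallest threshold that satisfies the bound is used, so the rule selects as many features as the target level allows.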

    Hybrid hierarchical clustering: piecewise aggregate approximation, with applications

    Piecewise Aggregate Approximation (PAA) provides a powerful yet computationally efficient tool for dimensionality reduction and feature extraction on large datasets compared to previously reported and well-used feature extraction techniques, such as Principal Component Analysis (PCA). Nevertheless, performance can degrade as a result of either regional information insufficiency or over-segmentation, and because of this, additional relatively complex modifications have subsequently been reported, for instance, Adaptive Piecewise Constant Approximation (APCA). To recover some of the simplicity of the original PAA, whilst addressing the known problems, a distance-based Hierarchical Clustering (HC) technique is now proposed to adjust PAA segment frame sizes to focus segment density on information rich data regions. The efficacy of the resulting hybrid HC-PAA methodology is demonstrated using two application case studies on non-time-series data viz. fault detection on industrial gas turbines, and ultrasonic biometric face identification. Pattern recognition results show that the extracted features from the hybrid HC-PAA provide additional benefits with regard to both cluster separation and classification performance, compared to traditional PAA and the APCA alternative. The method is therefore demonstrated to provide a robust readily implemented algorithm for rapid feature extraction and identification for datasets
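
For orientation, plain PAA with equal-width frames can be sketched as follows. This is only the baseline technique the abstract starts from, not the hybrid HC-PAA method, whose clustering-based adjustment of frame sizes is not described in enough detail here to reproduce. The function name and the boundary rule for lengths that do not divide evenly are assumptions.

```python
def paa(series, n_segments):
    """Reduce a sequence to n_segments values by averaging equal-width frames."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    means = []
    for i in range(n_segments):
        # Integer boundaries keep every frame non-empty when n_segments <= n.
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        frame = series[start:end]
        means.append(sum(frame) / len(frame))
    return means


if __name__ == "__main__":
    print(paa([1, 2, 3, 4, 5, 6], 3))
```

Because every frame has the same width, regions with little information get as many segments as information-rich ones, which is the limitation the hybrid method is said to address.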

    Inhibiting Phase Transfer of Protein Nanoparticles by Surface Camouflage-A Versatile and Efficient Protein Encapsulation Strategy

    Engineering a system with a high mass fraction of active ingredients, especially water-soluble proteins, is still an ongoing challenge. In this work, we developed a versatile surface camouflage strategy that can engineer systems with an ultrahigh mass fraction of proteins. By formulating protein molecules into nanoparticles, the demand of molecular modification was transformed into a surface camouflage of protein nanoparticles. Thanks to electrostatic attractions and van der Waals interactions, we camouflaged the surface of protein nanoparticles through the adsorption of carrier materials. The adsorption of carrier materials successfully inhibited the phase transfer of insulin, albumin, β-lactoglobulin, and ovalbumin nanoparticles. As a result, the obtained microcomposites featured with a record of protein encapsulation efficiencies near 100% and a record of protein mass fraction of 77%. After the encapsulation in microcomposites, the insulin revealed a hypoglycemic effect for at least 14 d with one single injection, while that of insulin solution was only ∼4 h.

    Deep Learning-Based H-Score Quantification of Immunohistochemistry-Stained Images

    Immunohistochemistry (IHC) is a well-established and commonly used staining method for clinical diagnosis and biomedical research. In most IHC images, the target protein is conjugated with a specific antibody and stained using diaminobenzidine (DAB), resulting in a brown coloration, whereas hematoxylin serves as a blue counterstain for cell nuclei. The protein expression level is quantified through the H-score, calculated from DAB staining intensity within the target cell region. Traditionally, this process requires evaluation by 2 expert pathologists, which is both time consuming and subjective. To enhance the efficiency and accuracy of this process, we have developed an automatic algorithm for quantifying the H-score of IHC images. To characterize protein expression in specific cell regions, a deep learning model for region recognition was trained based on hematoxylin staining only, achieving pixel accuracy for each class ranging from 0.92 to 0.99. Within the desired area, the algorithm categorizes DAB intensity of each pixel as negative, weak, moderate, or strong staining and calculates the final H-score based on the percentage of each intensity category. Overall, this algorithm takes an IHC image as input and directly outputs the H-score within a few seconds, significantly enhancing the speed of IHC image analysis. This automated tool provides H-score quantification with precision and consistency comparable to experienced pathologists but at a significantly reduced cost during IHC diagnostic workups. It holds significant potential to advance biomedical research reliant on IHC staining for protein expression quantification
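
The final scoring step can be sketched as below, assuming the conventional H-score definition (1 × % weak + 2 × % moderate + 3 × % strong, giving a range of 0 to 300). The abstract does not state the weights it uses, so this formula, the function name, and the dictionary input format are assumptions rather than the authors' implementation.

```python
# Conventional H-score weights per intensity category (assumed, not from the abstract).
WEIGHTS = {"negative": 0, "weak": 1, "moderate": 2, "strong": 3}


def h_score(pixel_counts):
    """Compute an H-score (0 to 300) from per-category pixel counts.

    pixel_counts : dict mapping 'negative', 'weak', 'moderate', 'strong'
                   to the number of pixels in the target region.
    """
    total = sum(pixel_counts.get(name, 0) for name in WEIGHTS)
    if total == 0:
        raise ValueError("no pixels in the target region")
    # Weighted sum of the percentage of pixels in each category.
    return sum(
        weight * 100.0 * pixel_counts.get(name, 0) / total
        for name, weight in WEIGHTS.items()
    )


if __name__ == "__main__":
    counts = {"negative": 50, "weak": 20, "moderate": 20, "strong": 10}
    print(h_score(counts))
```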

    Interface chemistry of contact metals and ferromagnets on the topological insulator Bi2Se3

    The interface between the topological insulator Bi2Se3 and deposited metal films is investigated using x-ray photoelectron spectroscopy including conventional contact metals (Au, Pd, Cr, and Ir) and magnetic materials (Co, Fe, Ni, Co0.8Fe0.2, and Ni0.8Fe0.2). Au is the only metal to show little or no interaction with the Bi2Se3, with no interfacial layer between the metal and the surface of the TI. The other metals show a range of reaction behaviors with the relative strength of reaction (obtained from the amount of Bi2Se3 consumed during reaction) ordered as: Au < Pd < Ir < Co ≤ CoFe < Ni < Cr < NiFe < Fe, in approximate agreement with the behavior expected from the Gibbs free energies of formation for the alloys formed. Post metallization anneals at 300°C in vacuum were also performed for each interface. Several of the metal films were not stable upon anneal and desorbed from the surface (Au, Pd, Ni, and Ni0.8Fe0.2), while Cr, Fe, Co, and Co0.8Fe0.2 showed accelerated reactions with the underlying Bi2Se3, including inter-diffusion between the metal and Se. Ir was the only metal to remain stable following anneal, showing no significant increase in reaction with the Bi2Se3. This study reveals the nature of the metal-Bi2Se3 interface for a range of metals. The reactions observed must be considered when designing Bi2Se3 based devices

    JWST-TST DREAMS: Sulfur dioxide in the atmosphere of the Neptune-mass planet HAT-P-26 b from NIRSpec G395H transmission spectroscopy

    Funding: L.A. is supported by Cornell University College of Arts & Sciences Klarman Fellowship. H.R.W. was funded by UK Research and Innovation (UKRI) framework under the UK government’s Horizon Europe funding guarantee for an ERC Starter Grant (grant No. EP/Y006313/1). R.J.M. is supported by NASA through the NASA Hubble Fellowship grant HST-HF2-51513.001, awarded by the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., for NASA, under contract NAS 5-26555. D.R.L. acknowledges support from NASA under award No. 80GSFC24M0006. C.I.C. acknowledges support by NASA Headquarters through an appointment to the NASA Postdoctoral Program at the Goddard Space Flight Center, administered by ORAU through a contract with NASA.
    We present the James Webb Space Telescope (JWST) transmission spectrum of the exoplanet HAT-P-26 b (18.6 M⊕, 6.33 R⊕), based on a single transit observed with the JWST NIRSpec G395H grating. We detect water vapor (ln = 4.1), carbon dioxide (ln = 85.6), and sulfur dioxide (ln = 13.5) with high confidence, along with marginal indications for hydrogen sulfide and carbon monoxide (ln < 0.5). The detection of SO2 in a warm super-Neptune-sized exoplanet (RP ∼ 6 R⊕) bridges the gap between previous detections in hot Jupiters and sub-Neptunes, highlighting the role of disequilibrium photochemistry across a broad range of exoplanet atmospheres, including those cooler than 1000 K. Our precise measurements of carbon, oxygen, and sulfur indicate an atmospheric metallicity of ∼10× solar and a subsolar C/O ratio. Retrieved molecular abundances are consistent within 2σ with predictions from self-consistent models including photochemistry. The elevated CO2 abundance and possible H2S signal may also reflect sensitivities to the thermal structure, cloud properties, or additional disequilibrium processes such as vertical mixing.
We compare the SO2 abundance in HAT-P-26 b with that of 10 other JWST-observed giant exoplanets, and find a correlation with atmospheric metallicity. The trend is consistent with the prediction from I. J. M. Crossfield, showing a steep rise in SO2 abundance at low metallicities, and a more gradual increase beyond 30× solar. This work is part of a series of studies by our JWST Telescope Scientist Team (JWST-TST), in which we use Guaranteed Time Observations to perform Deep Reconnaissance of Exoplanet Atmospheres through Multi-instrument Spectroscopy (DREAMS).

    The London Classification: Improving Characterization and Classification of Anorectal Function with Anorectal Manometry.

    PURPOSE OF REVIEW: Objective measurement of anorectal sensorimotor function is a requisite component in the clinical evaluation of patients with intractable symptoms of anorectal dysfunction. Regrettably, the utility of the most established and widely employed investigations for such measurement (anorectal manometry (ARM), rectal sensory testing and the balloon expulsion test) has been limited by wide variations in clinical practice. RECENT FINDINGS: This article summarizes the recently published International Anorectal Physiology Working Group (IAPWG) consensus and London Classification of anorectal disorders, together with relevant allied literature, to provide guidance on the indications for, equipment, protocol, measurement definitions and results interpretation for ARM, rectal sensory testing and the balloon expulsion test. The London Classification is a standardized method and nomenclature for description of alterations in anorectal motor and sensory function using office-based investigations, adoption of which should bring much needed harmonization of practice

    Utilizing a digital cohort to understand the health burden and lifestyle characteristics across the life course in individuals with polycystic ovary syndrome and possible PCOS

    Introduction: Polycystic ovary syndrome (PCOS) is an ovulation disorder associated with multiple health conditions. This study analyzed health and lifestyle characteristics of those with diagnosed and possible PCOS in a large, digital cohort. Methods: We analyzed data from female participants who enrolled in the Apple Women’s Health Study, a mobile-application-based cohort in the United States, and provided informed consent from 11/14/2019–12/14/2024. Specific analyses were further restricted to those who responded to relevant survey questions. Self-reported sociodemographic, health (conditions and age at diagnosis), and lifestyle characteristics were evaluated, stratified by PCOS status: PCOS (self-reported physician diagnosed PCOS), possible PCOS (self-reported irregular menses and androgen excess), and no PCOS. Among those with PCOS/possible PCOS, we further evaluated potential predictors of not reporting a PCOS diagnosis using multivariable logistic regression. Results: Of participants providing medical history at enrollment, 12.6% (n=11,022) reported PCOS, and among the subset without a PCOS diagnosis and with relevant survey data, 17.4% (n=7,152) were assigned possible PCOS. The median baseline age was 35 years. Most participants self-identified as non-Hispanic White (74.2%). The possible PCOS group was slightly less educated (≤high school: possible PCOS 14.5%, PCOS 17.3%, no PCOS 14.0%). The PCOS/possible PCOS groups reported lower socioeconomic status (SES) than the no PCOS group (low SES: PCOS 32.7%, possible PCOS 31.6%, no PCOS 23.5%). The PCOS and possible PCOS groups displayed a high burden of disease (cardiometabolic, endometrial hyperplasia/cancer, pregnancy complications, mental health conditions). Compared to those without PCOS, those with PCOS reported less healthy lifestyle behaviors relevant to physical activity/sleep/stress/smoking and more healthy lifestyle behaviors relevant to alcohol intake/diet.
The age at diagnosis for multiple health conditions was earlier for participants with PCOS compared to those without PCOS. Young/old age (18–29/40–50 years), lower educational attainment, lower SES, and lower BMI were positive predictors of not reporting a PCOS diagnosis. Conclusions: This study demonstrated significant differences in health and lifestyle characteristics across PCOS status (PCOS, possible PCOS, no PCOS), identifying populations that could benefit from early risk reduction counseling. Our results may inform discussions around clinical care models through improving awareness of health predictors and lifestyle interventions.