MergeMaid: R Tools for Merging and Cross-Study Validation of Gene Expression Data
Cross-study validation of gene expression investigations is critical in genomic analysis. We developed an R package and associated object definitions to merge and visualize multiple gene expression datasets. Our merging functions use arbitrary character IDs and generate objects that can efficiently support a variety of joint analyses. Visualization tools support exploration and cross-study validation of the data, without requiring normalization across platforms. Tools include “integrative correlation” plots, that is, scatterplots of all pairwise correlations in one study against the corresponding pairwise correlations of another, both for individual genes and for all genes combined. Gene-specific plots can be used to identify genes whose changes are reliably measured across studies. Visualizations also include scatterplots of gene-specific statistics quantifying relationships between expression and phenotypes of interest, using linear, logistic and Cox regression. Availability: Free and open source from http://www.bioconductor.org. Contact: Xiaogang Zhong. Supplementary information: Documentation available with the package.
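The integrative correlation idea described above can be sketched in a few lines. This is a minimal Python illustration, not the MergeMaid implementation (which is an R package); the function name `integrative_correlation` and the genes-by-samples array layout are assumptions made for the example. For each gene, its correlations with every other gene are computed within each study, and the two resulting profiles are then correlated across studies.

```python
import numpy as np

def integrative_correlation(x, y):
    """Per-gene agreement of co-expression patterns between two studies.

    x, y: arrays of shape (genes, samples); rows must refer to the same
    genes in the same order, while the samples may differ between studies.
    (Hypothetical helper for illustration only.)
    """
    cx = np.corrcoef(x)  # gene-by-gene correlation matrix, study 1
    cy = np.corrcoef(y)  # gene-by-gene correlation matrix, study 2
    n_genes = cx.shape[0]
    scores = np.empty(n_genes)
    for g in range(n_genes):
        others = np.arange(n_genes) != g  # drop the gene's self-correlation
        scores[g] = np.corrcoef(cx[g, others], cy[g, others])[0, 1]
    return scores
```

Because only within-study correlations enter the score, rescaling or shifting each gene separately in one study leaves the result unchanged, which is consistent with the abstract's remark that no cross-platform normalization is required.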
Sparse Estimation of Cox Proportional Hazards Models via Approximated Information Criteria
We propose a new sparse estimation method for Cox (1972) proportional hazards models by optimizing an approximated information criterion. The main idea involves approximation of the ℓ0 norm with a continuous or smooth unit dent function. The proposed method bridges the best subset selection and regularization by borrowing strength from both. It mimics the best subset selection using a penalized likelihood approach yet with no need of a tuning parameter. We further reformulate the problem with a reparameterization step so that it reduces to one unconstrained nonconvex yet smooth programming problem, which can be solved efficiently as in computing the maximum partial likelihood estimator (MPLE). Furthermore, the reparameterization tactic yields an additional advantage in terms of circumventing postselection inference. The oracle property of the proposed method is established. Both simulated experiments and empirical examples are provided for assessment and illustration.
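The structure of such an approximated criterion can be written out as a rough sketch. The abstract does not give the formulas, so everything specific here is an assumption for illustration: ℓ(β) denotes the log partial likelihood, λ0 is the constant fixed by the chosen information criterion (for example ln n for a BIC-type criterion, which is why no tuning parameter is needed), the hyperbolic tangent is one possible smooth surrogate that equals 0 at the origin and approaches 1 away from it, and the reparameterization shown is one way to obtain a smooth unconstrained problem.

```latex
\[
\begin{aligned}
\text{best subset (information criterion):}\quad
  & \min_{\beta}\; -2\,\ell(\beta) + \lambda_0 \sum_{j=1}^{p} \mathbf{1}\{\beta_j \neq 0\}, \\
\text{smooth surrogate (assumed form):}\quad
  & \mathbf{1}\{\beta_j \neq 0\} \approx w(\beta_j), \qquad
    w(\beta_j) = \tanh\!\bigl(a\,\beta_j^{2}\bigr),\quad a > 0, \\
\text{reparameterized problem (assumed form):}\quad
  & \min_{\gamma}\; -2\,\ell\bigl(\gamma \odot w(\gamma)\bigr)
    + \lambda_0 \sum_{j=1}^{p} w(\gamma_j), \qquad
    \beta_j = \gamma_j\, w(\gamma_j).
\end{aligned}
\]
```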
OPTIMIZED CROSS-STUDY ANALYSIS OF MICROARRAY-BASED PREDICTORS
Background: Microarray-based gene expression analysis is widely used in cancer research to discover molecular signatures for cancer classification and prediction. In addition to numerous independent profiling projects, a number of investigators have analyzed multiple published data sets for purposes of cross-study validation. However, the diverse microarray platforms and technical approaches make direct comparisons across studies difficult, and without means to identify aberrant data patterns, less than optimal. To address this issue, we previously developed an integrative correlation approach to systematically address agreement of gene expression measurements across studies, providing a basis for cross-study validation analysis. Here we generalize this methodology to provide a metric for evaluating the overall efficacy of preprocessing and cross-referencing, and explore optimal combinations of filtering and cross-referencing strategies. We operate in the context of validating prognostic breast cancer gene expression signatures on data reported by three different groups, each using a different platform.
Results: To evaluate overall cross-platform reproducibility in the context of a specific prediction problem, we suggest integrative association, that is, the cross-study correlation of gene-specific measures of association with the predicted phenotype. Specifically, in this paper we use the correlation among the Cox proportional hazards coefficients for association of gene expression with relapse-free survival (RFS). Gene filtering by integrative correlation to select reproducible genes emerged as the key factor to increase the integrative association, while alternative methods of gene cross-referencing and gene filtering improved the overall reproducibility only modestly. Patient selection was another major factor affecting the validation process. In particular, in one of the studies considered, gene expression association with RFS varied across subsets of patients that differ by their ascertainment criteria. One of the subsets proved to be highly consistent with other studies, while others showed significantly lower consistency. Third, as expected, use of cluster-specific mean expression profiles in the Cox model yielded more generalizable results than expression data from individual genes. Finally, by using our approach we were able to validate the association between the breast cancer molecular classes proposed by Sorlie et al. and RFS.
Conclusions: This paper provides a simple, practical and comprehensive technique for measuring consistency of molecular classification results across microarray platforms, without requiring subjective judgments about membership of samples in putative clusters. This methodology will be of value in consistently typing breast and other cancers across different studies and platforms in the future. Although the tumor subtypes considered here have been previously validated by their proponents, this is the first independent validation, and the first to include the Affymetrix platform
Evolutions of helical edge states in disordered HgTe/CdTe quantum wells
We study the evolutions of the nonmagnetic disorder-induced edge states with
the disorder strength in the HgTe/CdTe quantum wells. From the supercell band
structures and wave-functions, it is clearly shown that the conducting helical
edge states, which are responsible for the reported quantized conductance
plateau, appear above a critical disorder strength after a gap-closing phase
transition. These edge states are then found to decline with the increase of
disorder strength in a stepwise pattern due to the finite-width effect, where
the opposite edges couple with each other through the localized states in the
bulk. This is in sharp contrast with the localization of the edge states
themselves when magnetic disorder, which breaks time-reversal symmetry, is
introduced. The size-independent boundary of the topological phase is obtained
by scaling analysis, and an Anderson transition to an Anderson insulator at
even stronger disorder is identified; in between, a metallic phase is found to
separate the two topologically distinct phases. Comment: 7 pages, 5 figures
Modeling and analysis of energy distribution networks using switched differential systems
It is a pleasure to dedicate this contribution to Prof. Arjan van der Schaft on the occasion of his 60th birthday. We study the dynamics of energy distribution networks consisting of switching power converters and multiple (dis-)connectable modules. We use parsimonious models that deal effectively with the variant complexity of the network and the inherent switching phenomena induced by power converters. We also present the solution to instability problems caused by devices with negative impedance characteristics such as constant power loads. Elements of the behavioral system theory such as linear differential behaviors and quadratic differential forms are crucial in our analysis
Evaluation of ocean color remote sensing algorithms for diffuse attenuation coefficients and optical depths with data collected on BGC-Argo floats
The vertical distribution of irradiance in the ocean is a key input to quantify processes ranging from radiative warming and photosynthesis to photo-oxidation. Here we use a novel dataset of thousands of local-noon downwelling irradiance at 490 nm (Ed(490)) and photosynthetically available radiation (PAR) profiles captured by 103 BGC-Argo floats spanning three years (from October 2012 to January 2016) in the world's ocean, to evaluate several published algorithms and satellite products related to the diffuse attenuation coefficient (Kd). Our results show: (1) MODIS-Aqua Kd(490) products derived from a blue-to-green algorithm and two semi-analytical algorithms show good consistency with the float-observed values, but the Chla-based one overestimates Kd(490) in oligotrophic waters; (2) The Kd(PAR) model based on the Inherent Optical Properties (IOPs) performs well not only at the sea surface but also at depth, except for the oligotrophic waters where Kd(PAR) is underestimated below two penetration depths (2zpd), due to the model's assumption of a homogeneous distribution of IOPs in the water column, which is not true in most oligotrophic waters with deep chlorophyll-a maxima; (3) In addition, published algorithms for the 1% euphotic-layer depth and the depth of the 0.415 mol photons m^-2 d^-1 isolume are evaluated. Algorithms based on Chla generally work well while IOPs-based ones exhibit an overestimation issue in stratified and oligotrophic waters, due to the underestimation of Kd(PAR) at depth.
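The quantities named in this abstract are linked by the standard exponential attenuation relation E(z) = E(0) exp(-Kd z). The sketch below is a minimal Python illustration under the assumption of a vertically constant Kd, which the abstract itself notes does not hold in oligotrophic waters with deep chlorophyll-a maxima. The function names are hypothetical and this is not the processing used in the study.

```python
import numpy as np

def estimate_kd(depth_m, irradiance):
    """Estimate a depth-averaged Kd (m^-1) from one irradiance profile.

    Fits a straight line to ln(E) against depth; under
    E(z) = E(0) * exp(-Kd * z) the slope of that line is -Kd.
    """
    depth_m = np.asarray(depth_m, dtype=float)
    irradiance = np.asarray(irradiance, dtype=float)
    slope, _intercept = np.polyfit(depth_m, np.log(irradiance), 1)
    return -slope

def penetration_depth(kd):
    """First penetration depth zpd = 1 / Kd, where E falls to 1/e of E(0)."""
    return 1.0 / kd

def euphotic_depth(kd_par, fraction=0.01):
    """Depth at which PAR falls to `fraction` of its surface value."""
    return -np.log(fraction) / kd_par
```

With Kd = 0.04 m^-1, for instance, the penetration depth is 25 m and the 1% euphotic depth is ln(100) / 0.04, roughly 115 m. A depth-resolved Kd would need a layer-by-layer fit rather than the single slope used here.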
Attributing variations of temporal and spatial groundwater recharge: a statistical analysis of climatic and non-climatic factors
This paper demonstrated the benefits of statistical methods when investigating the climatic and non-climatic drivers responsible for variations in groundwater recharge with a series of up to 43 years of annual recharge for 426 bores in South-East South Australia. We identified the factors influencing groundwater recharge based on 71 climatic metrics and 13 non-climatic metrics (including groundwater abstraction). The results showed: 1) Rainfall during April to October was the most important variable influencing recharge temporal variation, with its decline identified as the most significant factor related to recharge reduction; 2) In contrast, a negative correlation between rainfall during December to February (DJF) and annual groundwater recharge was found. This suggests that a seasonal shift in rainfall (such as decreasing rainfall during April to October and an increase during DJF) can result in a decline in recharge even when the annual rainfall remains unchanged; 3) The length of wet spells (consecutive rain days) and increasing PET were additional significant predictors for recharge temporal variation. It demonstrated that a simple empirical relationship (such as recharge as a fixed percentage of rainfall) is not a reliable estimation of renewable groundwater resources under changing climatic conditions; 4) There is a statistically significant spatial correlation between mean groundwater depth and recharge, and this implies that a reduction in rainfall can lead to a positive feedback loop of declining recharge and water level; 5) Spatially the most statistically significant factors influencing groundwater recharge were soil types and land attributes. The findings of this study can identify which stressors should be included when investigating the impact of climate change on groundwater recharge
Editorial: Electromyography (EMG) Techniques for the Assessment and Rehabilitation of Motor Impairment Following Stroke
