205 research outputs found
A multiple testing procedure for multi-dimensional pairwise comparisons with application to gene expression studies
Figure S1. A graphical display of various hypotheses of interest. (TIFF 149 kb)
A Survival-Adjusted Quantal-Response Test for Analysis of Tumor Incidence Rates in Animal Carcinogenicity Studies
In rodent cancer bioassays, groups of animals are exposed to different doses of a chemical of interest and followed for tumor occurrence. The resulting tumor rates are commonly analyzed using a survival-adjusted Cochran-Armitage (CA) trend test. The CA trend test has reasonable power when the tumor-response curve is linear in dose, but it may be underpowered for a nonlinear response. An alternative survival-adjusted test procedure based on isotonic regression methodology has previously been proposed. Although this alternative procedure performs well when the tumor response is nonlinear in dose, it has less power than the CA trend test when the response is linear in dose. Here, we introduce a new survival-adjusted test procedure that makes use of both the CA trend test and the isotonic regression-based trend test. Using a broad range of experimental conditions typical of National Toxicology Program (NTP) bioassays, we conducted extensive computer simulations to compare the false-positive error rate and power of the proposed procedure with the survival-adjusted CA trend test. The new procedure competes well with the survival-adjusted CA trend test when observed tumor rates are linear in dose and performs substantially better when observed tumor rates are nonlinear in dose. Further, the proposed trend test almost always has a smaller false-positive rate than does the survival-adjusted CA trend test. We also developed an order-restricted inference-based procedure for performing multiple pairwise comparisons between each of the dose groups and the control group. The trend test and the multiple pairwise comparisons test are demonstrated using an example from a study conducted by the NTP.
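As a rough illustration of the baseline mentioned in this abstract, the following is a minimal sketch of the plain (not survival-adjusted) Cochran-Armitage trend test under the usual normal approximation. It is not the procedure proposed in the paper, and it omits the survival adjustment and the isotonic regression component entirely. The function name and the example counts are hypothetical.

```python
import math


def cochran_armitage_trend(doses, tumors, sizes):
    """Unadjusted Cochran-Armitage trend test (normal approximation).

    doses  : dose score for each group
    tumors : number of tumor-bearing animals in each group
    sizes  : number of animals in each group
    Returns (z, one_sided_p) for an increasing trend in tumor rate.
    """
    total = sum(sizes)
    pooled = sum(tumors) / total  # pooled tumor rate under the null
    dbar = sum(n * d for n, d in zip(sizes, doses)) / total  # weighted mean dose
    # Score statistic: tumor counts weighted by centered dose scores.
    t = sum(x * (d - dbar) for x, d in zip(tumors, doses))
    var = pooled * (1 - pooled) * sum(
        n * (d - dbar) ** 2 for n, d in zip(sizes, doses)
    )
    if var == 0:
        raise ValueError("variance is zero: no tumors, all tumors, or a single dose")
    z = t / math.sqrt(var)
    one_sided_p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, one_sided_p


# Hypothetical data: four dose groups of 50 animals each.
z, p = cochran_armitage_trend([0, 1, 2, 3], [1, 3, 6, 10], [50, 50, 50, 50])
```

Because the statistic weights tumor counts by centered dose scores, it is most sensitive to a response that is linear in dose, which is consistent with the power behavior the abstract describes.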
CLME: An R Package for Linear Mixed Effects Models under Inequality Constraints
In many applications, researchers are interested in testing for inequality constraints in the context of linear fixed effects and mixed effects models. Although there exists a large body of literature on statistical inference under inequality constraints, user-friendly statistical software implementing such methods is lacking, especially in the context of linear fixed and mixed effects models. In this article we introduce CLME, a package in the R language that can be used for testing a broad collection of inequality constraints. It uses residual bootstrap-based methodology, which is reasonably robust to non-normality as well as heteroscedasticity. The package is illustrated using two data sets. The package also contains a graphical user interface built using the shiny package.
Identification of a core set of signature cell cycle genes whose relative order of time to peak expression is conserved across species
A cell division cycle is a well-coordinated process in eukaryotes with cell cycle genes exhibiting a periodic expression over time. There is considerable interest among cell biologists in determining which genes are periodic in multiple organisms and whether such genes are also evolutionarily conserved in their relative order of time to peak expression. Interestingly, periodicity is not well-conserved evolutionarily. A conservative estimate of the number of periodic genes common to fission yeast (Schizosaccharomyces pombe) and budding yeast (Saccharomyces cerevisiae) (‘core set FB’) is 35, while the number common to fission yeast and humans (Homo sapiens) (‘core set FH’) is 24. Using a novel statistical methodology, we discover that the relative order of peak expression is conserved in ∼80% of FB genes and in ∼40% of FH genes. We also discover that the order is evolutionarily conserved in six genes which are potentially the core set of signature cell cycle genes. These include ace2 (a transcription factor) and polo-kinase plo1, which are well-known hubs of early M-phase clusters; cdc18, a key component of pre-replication complexes; mik1, which is critical for the establishment and maintenance of the DNA damage checkpoint; and histones hhf1 and hta2.
A Minimal Set of Tissue-Specific Hypomethylated CpGs Constitute Epigenetic Signatures of Developmental Programming
Background: Cell-specific states of the chromatin are programmed during mammalian development. Dynamic DNA methylation across the developing embryo guides a program of repression, switching off genes in most cell types. Thus, the majority of the tissue-specific differentially methylated sites (TS-DMS) must be un-methylated CpGs. Methodology and Principal Findings: Comparison of expanded Methyl Sensitive Cut Counting data (eMSCC) among four tissues (liver, testes, brain and kidney) from three C57BL/6J mice identified 138,052 differentially methylated sites, of which 23,270 contain CpGs un-methylated in only one tissue (TS-DMS). Most of these CpGs were located in intergenic regions, outside of promoters, CpG islands or their shores, and up to 20% of them overlapped reported active enhancers. Indeed, tissue-specific enhancers were up to 30-fold enriched in TS-DMS. Testis showed the highest number of TS-DMS, but paradoxically their associated genes do not appear to be specific to germ cell functions, but rather are involved in organism development. In the other tissues the differentially methylated genes are associated with tissue-specific physiological or anatomical functions. The identified sets of TS-DMS quantify epigenetic distances between tissues, generated during development. We applied this concept to measure the extent of reprogramming in the liver of mice exposed to in utero or early postnatal nutritional stress. Different protocols of food restriction reprogrammed the liver methylome in different but reproducible ways. Conclusion and Significance: Thus, each identified set of differentially methylated sites constituted an epigenetic signature that traced the developmental programming or the early nutritional reprogramming of each exposed mouse. We propose that our approach has the potential to outline a number of disease-associated epigenetic states.
The composition of differentially methylated CpGs may vary with each situation, behaving as a composite variable, which can be used as a pre-symptomatic marker for disease.
Statistical inference under order restrictions on both rows and columns of a matrix, with an application in toxicology
We present a general methodology for performing statistical inference on the
components of a real-valued matrix parameter for which rows and columns are
subject to order restrictions. The proposed estimation procedure is based on an
iterative algorithm developed by Dykstra and Robertson (1982) for simple order
restriction on rows and columns of a matrix. For any order restrictions on rows
and columns of a matrix, sufficient conditions are derived for the algorithm to
converge in a single application of row and column operations. The new
algorithm is applicable to a broad collection of order restrictions. In
practice, it is easy to design a study such that the sufficient conditions
derived in this paper are satisfied. For instance, the sufficient conditions
are satisfied in a balanced design. Using the estimation procedure developed in
this article, a bootstrap test for order restrictions on rows and columns of a
matrix is proposed. Computer simulations for ordinal data were performed to
compare the proposed test with some existing test procedures in terms of size
and power. The new methodology is illustrated by applying it to a set of
ordinal data obtained from a toxicological study. Comment: Published at
http://dx.doi.org/10.1214/193940307000000059 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org)
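For readers unfamiliar with estimation under a simple order restriction, the following is a minimal sketch of the one-dimensional pool-adjacent-violators algorithm (PAVA), which computes the weighted least-squares fit subject to a nondecreasing constraint. It is offered only as background on the kind of building block used in order-restricted estimation. It is not the matrix procedure of the paper, nor the Dykstra and Robertson (1982) algorithm itself, and the function name is hypothetical.

```python
def pava(values, weights=None):
    """Weighted least-squares fit of a nondecreasing sequence to `values`.

    Adjacent blocks that violate the ordering are pooled into their
    weighted mean until the block means are nondecreasing.
    """
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block is [mean, total_weight, number_of_points]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Pool backwards while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, weight_b, count_b = blocks.pop()
            mean_a, weight_a, count_a = blocks.pop()
            pooled_weight = weight_a + weight_b
            pooled_mean = (mean_a * weight_a + mean_b * weight_b) / pooled_weight
            blocks.append([pooled_mean, pooled_weight, count_a + count_b])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical group means with one order violation (3 > 2).
fit = pava([1, 3, 2, 4])
```

In a balanced layout all weights are equal, so pooling reduces to taking simple averages of the violating neighbors.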
Phase analysis of circadian-related genes in two tissues
BACKGROUND: Recent circadian clock studies using gene expression microarrays in two different tissues of mouse have revealed that not all circadian-related genes are synchronized in phase or peak expression times across tissues in vivo. Instead, some circadian-related genes may be delayed by 4–8 hrs in peak expression in one tissue relative to the other. These interesting biological observations prompt a statistical question regarding how to distinguish the synchronized genes from genes that are systematically lagged in phase/peak expression time across two tissues. RESULTS: We propose a set of techniques from circular statistics to analyze phase angles of circadian-related genes in two tissues. We first estimate the phases of a cycling gene separately in each tissue, which are then used to estimate the paired angular difference of the phase angles of the gene in the two tissues. These differences are modeled as a mixture of two von Mises distributions, which enables us to cluster genes into two groups: one group having synchronized transcripts with the same phase in the two tissues, the other containing transcripts with a discrepancy in phase between the two tissues. For each cluster of genes we assess the association of phases across the tissue types using circular-circular regression. We also develop a bootstrap methodology based on a circular-circular regression model to evaluate the improvement in fit provided by allowing two components versus a one-component von Mises model. CONCLUSION: We applied our proposed methodologies to the circadian-related genes common to heart and liver tissues in Storch et al. [2], and found that an estimated 80% of circadian-related transcripts common to heart and liver tissues were synchronized in phase, and the other 20% of transcripts were lagged about 8 hours in liver relative to heart. The bootstrap p-value for being one cluster is 0.063, which suggests the possibility of two clusters.
Our methodologies can be extended to analyze peak expression times of circadian-related genes across more than two tissues, for example, kidney, heart, liver, and the suprachiasmatic nuclei (SCN) of the hypothalamus.
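The paired phase difference described in this abstract has to be computed on a circle, since a gene peaking at hour 23 in one tissue and hour 1 in the other is 2 hours apart, not 22. The following is a minimal sketch of that single step plus a circular mean, assuming phases are expressed in hours on a 24-hour clock. It does not implement the von Mises mixture, the circular-circular regression, or the bootstrap from the paper, and the function names and example phases are hypothetical.

```python
import math


def phase_lag_hours(phase_a, phase_b, period=24.0):
    """Signed lag of phase_b relative to phase_a, wrapped to (-period/2, period/2]."""
    lag = (phase_b - phase_a) % period
    if lag > period / 2:
        lag -= period
    return lag


def circular_mean_hours(hours, period=24.0):
    """Circular mean of clock times, returned in hours on the same clock."""
    angles = [2 * math.pi * h / period for h in hours]
    sin_sum = sum(math.sin(a) for a in angles)
    cos_sum = sum(math.cos(a) for a in angles)
    # atan2 gives the direction of the resultant vector of the unit phasors.
    return (math.atan2(sin_sum, cos_sum) * period / (2 * math.pi)) % period


# Hypothetical peak times (hours) for three genes in two tissues.
heart = [22.0, 3.0, 10.0]
liver = [6.0, 3.5, 18.0]
lags = [phase_lag_hours(h, l) for h, l in zip(heart, liver)]
```

Lags near zero would correspond to the synchronized group, while lags clustered around a common nonzero value would correspond to the lagged group that the mixture model is meant to separate.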
A response to information criterion-based clustering with order-restricted candidate profiles in short time-course microarray experiments
- …
