Subset Quantile Normalization using Negative Control Features

Author: Wu Zhijin
Publication venue: Collection of Biostatistics Research Archive
Publication date: 26/06/2009

Collection Of Biostatistics Research Archive

Prey capture and meat-eating by the wild colobus monkey _Rhinopithecus bieti_ in Yunnan, China

Author: Baoping Ren
Dayong Li
Hua Wu
Ming Li
Zhijin Liu
Publication date: 03/04/2009

If it is true that extant primates evolved from an insectivorous ancestor, then primate entomophagy would be a primitive trait. Many taxa, however, have undergone a dietary shift from entomophagy to phytophagy, evolving a specialised gut and dentition and becoming exclusive herbivores. The exclusively herbivorous taxa are the Malagasy families Indriidae and Lepilemuridae, and the Old World Monkey subfamily Colobinae, and among these meat-eating has not been observed except as an anomaly, with the sole exception of the Hanuman langur (_Semnopithecus entellus_), which feeds on insects seasonally, and a single observation of a nestling bird predated by wild Sichuan snub-nosed monkeys (_Rhinopithecus roxellana_). Here, we describe the regular capture of warm-blooded animals and the eating of meat by a colobine, the critically endangered Yunnan snub-nosed monkey (_Rhinopithecus bieti_). This monkey engages in scavenge hunting as a male-biased activity that may, in fact, be related to group structure and spatial spread. In this context, meat-eating can be regarded as an energy/nutrient maximization feeding strategy rather than as a consequence of any special characteristic of meat itself. The finding of meat-eating in forest-dwelling primates might provide new insights into the evolution of dietary habits in early humans

Nature Precedings

Stochastic Models Based on Molecular Hybridization Theory for Short Oligonucleotide Microarrays

Author: Irizarry Rafael A
LeBlanc Richard
Wu Zhijin
Publication venue: Collection of Biostatistics Research Archive
Publication date: 29/09/2003

High density oligonucleotide expression arrays are a widely used tool for the measurement of gene expression on a large scale. Affymetrix GeneChip arrays appear to dominate this market. These arrays use short oligonucleotides to probe for genes in an RNA sample. Due to optical noise, non-specific hybridization, probe-specific effects, and measurement error, ad-hoc measures of expression that summarize probe intensities can lead to imprecise and inaccurate results. Various researchers have demonstrated that expression measures based on simple statistical models can provide great improvements over the ad-hoc procedure offered by Affymetrix. Recently, physical models based on molecular hybridization theory have been proposed as useful tools for prediction of, for example, non-specific hybridization. These physical models show great potential in terms of improving existing expression measures. In this paper we demonstrate that the system producing the measured intensities is too complex to be fully described with these relatively simple physical models, and we propose empirically motivated stochastic models that complement the above-mentioned molecular hybridization theory to provide a comprehensive description of the data. We discuss how the proposed model can be used to obtain improved measures of expression useful for data analysts

Collection Of Biostatistics Research Archive

Feature-Level Exploration of the Choe et al. Affymetrix GeneChip Control Dataset

Author: Cope Leslie
Irizarry Rafael A
Wu Zhijin
Publication venue: Collection of Biostatistics Research Archive
Publication date: 17/03/2006

We describe why the Choe et al. control dataset should not be used to assess GeneChip expression measures

Collection Of Biostatistics Research Archive

A Statistical Framework for the Analysis of Microarray Probe-Level Data

Author: Irizarry Rafael A
Wu Zhijin
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/03/2005

Microarrays are an example of the powerful high-throughput genomics tools that are revolutionizing the measurement of biological systems. In this and other technologies, a number of critical steps are required to convert the raw measures into the data relied upon by biologists and clinicians. These data manipulations, referred to as preprocessing, have enormous influence on the quality of the ultimate measurements and studies that rely upon them. Many researchers have previously demonstrated that the use of modern statistical methodology can substantially improve accuracy and precision of gene expression measurements, relative to ad-hoc procedures introduced by designers and manufacturers of the technology. However, further substantial improvements are possible. Microarrays are now being used to measure diverse genomic endpoints including yeast mutant representations, the presence of SNPs, presence of deletions/insertions, and protein binding sites by chromatin immunoprecipitation (known as ChIP-chip). In each case, the genomic units of measurement are relatively short DNA molecules referred to as probes. Without appropriate understanding of the bias and variance of these measurements, biological inferences based upon probe analysis will be compromised. Standard operating procedure for microarray researchers is to use preprocessed data as the starting point for the statistical analyses that produce reported results. This has prevented many researchers from carefully considering their choice of preprocessing methodology. Furthermore, the fact that the preprocessing step greatly affects the stochastic properties of the final statistical summaries is ignored. In this paper we propose a statistical framework that permits the integration of preprocessing into the standard statistical analysis flow of microarray data. We demonstrate its usefulness by applying the idea in three different applications of the technology

Collection Of Biostatistics Research Archive

Comparison of Affymetrix GeneChip Expression Measures

Author: Irizarry Rafael A
Jaffee Harris A.
Wu Zhijin
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/09/2005

Affymetrix GeneChip expression array technology has become a standard tool in medical science and basic biology research. In this system, preprocessing occurs before one obtains expression level measurements. Because the number of competing preprocessing methods was large and growing, in the summer of 2003 we developed a benchmark to help users of the technology identify the best method for their application. In conjunction with the release of a Bioconductor R package (affycomp), a webtool was made available for developers of preprocessing methods to submit them to a benchmark for comparison. There have now been over 30 methods compared via the webtool. Results: Background correction, one of the main steps in preprocessing, has the largest effect on performance. In particular, background correction appears to improve accuracy but, in general, worsen precision. The benchmark results put this balance in perspective. Furthermore, we have improved some of the original benchmark metrics to provide more detailed information regarding accuracy and precision. A handful of methods stand out as maintaining a useful balance. The affycomp package, now version 1.5.2, continues to be available as part of the Bioconductor project (http://www.bioconductor.org). The webtool continues to be available at http://affycomp.biostat.jhsph.edu

Collection Of Biostatistics Research Archive

A Model Based Background Adjustment for Oligonucleotide Expression Arrays

Author: Gentleman Robert
Irizarry Rafael A
Murillo Francisco Martinez
Spencer Forrest
Wu Zhijin
Publication venue: Collection of Biostatistics Research Archive
Publication date: 28/05/2004

High density oligonucleotide expression arrays are widely used in many areas of biomedical research. Affymetrix GeneChip arrays are the most popular. In the Affymetrix system, a fair amount of further pre-processing and data reduction occurs following the image processing step. Statistical procedures developed by academic groups have been successful at improving the default algorithms provided by the Affymetrix system. In this paper we present a solution to one of the pre-processing steps, background adjustment, based on a formal statistical framework. Our solution greatly improves the performance of the technology in various practical applications. Affymetrix GeneChip arrays use short oligonucleotides to probe for genes in an RNA sample. Typically each gene will be represented by 11-20 pairs of oligonucleotide probes. The first component of these pairs is referred to as a perfect match probe and is designed to hybridize only with transcripts from the intended gene (specific hybridization). However, hybridization by other sequences (non-specific hybridization) is unavoidable. Furthermore, hybridization strengths are measured by a scanner that introduces optical noise. Therefore, the observed intensities need to be adjusted to give accurate measurements of specific hybridization. One approach to adjusting is to pair each perfect match probe with a mismatch probe that is designed with the intention of measuring non-specific hybridization. The default adjustment, provided as part of the Affymetrix system, is based on the difference between perfect match and mismatch probe intensities. We have found that this approach can be improved via the use of estimators derived from a statistical model that use probe sequence information. The model is based on simple hybridization theory from molecular biology and experiments specifically designed to help develop it. 
A final step in the pre-processing of these arrays is to combine the 11-20 probe pair intensities, after background adjustment and normalization, for a given gene to define a measure of expression that represents the amount of the corresponding mRNA species. In this paper we illustrate the practical consequences of not adjusting appropriately for the presence of nonspecific hybridization and provide a solution based on our background adjustment procedure. Software that computes our adjustment is available as part of the Bioconductor project (http://www.bioconductor
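
The abstract above describes the default adjustment as the difference between paired perfect match (PM) and mismatch (MM) probe intensities. A minimal Python sketch of that subtraction may help make the idea concrete. It is not the authors' model-based method or the Affymetrix software; the function names and the intensity values are made up for illustration. It also shows one arithmetic consequence of plain subtraction: whenever MM exceeds PM the result is negative, so a log-scale value cannot be computed for that probe pair.

```python
import math


def default_adjust(pm, mm):
    """Subtract each MM intensity from its paired PM intensity.

    Hypothetical helper: illustrates the PM - MM idea only.
    """
    if len(pm) != len(mm):
        raise ValueError("PM and MM must have the same number of probes")
    return [p - m for p, m in zip(pm, mm)]


def log2_or_none(values):
    """Return log2 of each value, or None where the value is not positive."""
    return [math.log2(v) if v > 0 else None for v in values]


# Made-up intensities for one probe set (illustrative only, not real data).
pm = [120.0, 340.0, 95.0, 410.0]
mm = [80.0, 360.0, 60.0, 150.0]

adjusted = default_adjust(pm, mm)
print(adjusted)                # [40.0, -20.0, 35.0, 260.0]
print(log2_or_none(adjusted))  # second entry is None because MM > PM
```

In this toy probe set the second pair has MM greater than PM, so the adjusted value is negative and is reported as None on the log scale.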

Collection Of Biostatistics Research Archive

A statistical framework for the analysis of microarray probe-level data

Author: Irizarry Rafael A.
Wu Zhijin
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2007

In microarray technology, a number of critical steps are required to convert the raw measurements into the data relied upon by biologists and clinicians. These data manipulations, referred to as preprocessing, influence the quality of the ultimate measurements and studies that rely upon them. Standard operating procedure for microarray researchers is to use preprocessed data as the starting point for the statistical analyses that produce reported results. This has prevented many researchers from carefully considering their choice of preprocessing methodology. Furthermore, the fact that the preprocessing step affects the stochastic properties of the final statistical summaries is often ignored. In this paper we propose a statistical framework that permits the integration of preprocessing into the standard statistical analysis flow of microarray data. This general framework is relevant in many microarray platforms and motivates targeted analysis methods for specific applications. We demonstrate its usefulness by applying the idea in three different applications of the technology. Comment: Published at http://dx.doi.org/10.1214/07-AOAS116 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref