
1,496 research outputs found

The examination of baseline noise and the impact on the interpretation of low-template DNA samples

Author: Wellner Genevieve A.
Publication date: 22/01/2016

It is common practice for DNA STR profiles to be analyzed using an analytical threshold (AT), but as more low-template DNA (LT-DNA) samples are tested it has become evident that these thresholds do not adequately separate signal from noise. To examine LT-DNA samples confidently, the behavior and characteristics of the background noise of STR profiles must be better understood. Thus, the background noise of single-source LT-DNA STR profiles was examined to characterize the noise distribution and determine how it changes with DNA template mass and injection time. Current noise models typically assume the noise is independent of fragment size but, given the tendency of the baseline noise to increase with template amount, it is important to establish whether the baseline noise is randomly found throughout the capillary electrophoresis (CE) run or whether it is situated in specific regions of the electropherogram. While it has been shown that the baseline noise of negative samples does not behave similarly to the baseline noise of profiles generated using optimal levels of DNA, the ATs determined using negative samples have been shown to be similar to those developed with near-zero, low-template-mass samples. The distinction between low-template samples, where the noise is consistent regardless of target mass, and standard samples could be made at approximately 0.063 ng for samples amplified using the Identifiler™ Plus amplification kit (29-cycle protocol) and injected for 5 and 10 seconds. At amplification target masses greater than 0.063 ng, the average noise peak height increased and began to plateau between 0.5 and 1.0 ng for samples injected for 5 and 10 seconds. To examine the time-dependent nature of the baseline noise, the baselines of over 400 profiles were combined onto one axis for each target mass and each injection time. Areas of reproducibly higher noise peak heights were identified as areas of potential non-specific amplified product.
When the samples were injected for five seconds, the baseline noise did not appear to be time-dependent. However, when the samples were injected for either 10 or 20 seconds, there were three areas that exhibited an increase in noise; these areas were identified at 118 bases in green, 231 bases in yellow, and 106 bases in red. If a probabilistic analysis or AT is to be employed for DNA interpretation, consideration must be given to how the validation or calibration samples are prepared. Ideally, the validation data should include all the variation seen within typical samples. To this end, a study was performed to examine possible sources of variation in the baseline noise within the electropherogram. Specifically, three samples were prepared at seven target masses using four different kit lots, four capillary lots, in four amplification batches or four injection batches. The distributions of the noise peak heights in the blue and green channels for samples with variable capillary lots, amplifications, and injections were similar, but the distribution of the noise heights for samples with variable kit lots was shifted. This shift in the distribution of the samples with variable kit lots was due to the average peak height of the individual kit lots varying by approximately two. The yellow and red channels showed general agreement between the distributions of the samples run with variable kit lots, amplifications, and injections, but the samples run with various capillary lots had a distribution shifted to the left. When the distribution of the noise height for each capillary was examined, the average peak height variation was less than two RFU between capillary lots. Use of a probabilistic method requires an accurate description of the distribution of the baseline noise. Three distributions were tested: Gaussian, log-normal, and Poisson. The Poisson distribution did not approximate the noise distributions well.
The log-normal distribution was a better approximation than the Gaussian, resulting in a smaller sum of squared residuals. It was also shown that the choice of distribution impacted the probability that a peak was noise, though how significant an impact this difference makes on the final probability of an entire STR profile was not determined and may be of interest for future studies.
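
The comparison of candidate noise distributions by sum of squared residuals can be sketched roughly as follows. This is only an illustration under stated assumptions: the peak heights are synthetic draws (not data from the thesis), the fits use simple moment estimates, and the helper names `histogram_density` and `sum_squared_residuals` are hypothetical.

```python
import math
import random
import statistics


def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def lognormal_pdf(x, mu, sigma):
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))


def histogram_density(values, bin_width):
    """Return (bin_center, density) pairs for the occupied fixed-width bins."""
    counts = {}
    for v in values:
        idx = int(v // bin_width)
        counts[idx] = counts.get(idx, 0) + 1
    n = len(values)
    return [((i + 0.5) * bin_width, c / (n * bin_width)) for i, c in sorted(counts.items())]


def sum_squared_residuals(hist, pdf):
    """Sum of squared differences between histogram density and a fitted pdf."""
    return sum((density - pdf(center)) ** 2 for center, density in hist)


# Synthetic stand-in for noise peak heights (RFU); an assumption for illustration only.
rng = random.Random(0)
heights = [rng.lognormvariate(1.0, 0.5) for _ in range(5000)]

# Moment-based fits: Gaussian on the raw heights, log-normal on their logarithms.
g_mu, g_sigma = statistics.mean(heights), statistics.stdev(heights)
logs = [math.log(h) for h in heights]
l_mu, l_sigma = statistics.mean(logs), statistics.stdev(logs)

hist = histogram_density(heights, bin_width=0.5)
ssr_gauss = sum_squared_residuals(hist, lambda x: gaussian_pdf(x, g_mu, g_sigma))
ssr_lognorm = sum_squared_residuals(hist, lambda x: lognormal_pdf(x, l_mu, l_sigma))
print(f"SSR Gaussian:   {ssr_gauss:.5f}")
print(f"SSR log-normal: {ssr_lognorm:.5f}")
```

Because the synthetic heights are drawn from a log-normal distribution, the log-normal fit is expected to give the smaller residual sum here; with real baseline data the ranking would have to be checked empirically.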

Boston University Institutional Repository (OpenBU)

Log-concavity and strong log-concavity: a review

Author: Saumard Adrien
Wellner Jon A.
Publication date: 01/01/2014

We review and formulate results concerning log-concavity and strong log-concavity in both discrete and continuous settings. We show how preservation of log-concavity and strong log-concavity on $\mathbb{R}$ under convolution follows from a fundamental monotonicity result of Efron (1969). We provide a new proof of Efron's theorem using the recent asymmetric Brascamp-Lieb inequality due to Otto and Menz (2013). Along the way we review connections between log-concavity and other areas of mathematics and statistics, including concentration of measure, log-Sobolev inequalities, convex geometry, MCMC algorithms, Laplace approximations, and machine learning. Comment: 67 pages, 1 figure
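
As a small numerical illustration of preservation under convolution in the discrete setting, the sketch below checks the inequality $a_k^2 \ge a_{k-1}a_{k+1}$ for two positive sequences and for their convolution. The particular sequences (a binomial pmf and a triangular pmf) and the helper names are assumptions chosen for the example; the check illustrates the statement and is not a proof.

```python
from math import comb


def is_log_concave(seq, tol=1e-12):
    """Check a_k^2 >= a_{k-1} * a_{k+1} at every interior index of a positive sequence."""
    return all(seq[k] ** 2 + tol >= seq[k - 1] * seq[k + 1] for k in range(1, len(seq) - 1))


def convolve(a, b):
    """Plain discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


# Binomial(6, 0.3) pmf and a triangular pmf: both are log-concave sequences.
binom = [comb(6, k) * 0.3**k * 0.7 ** (6 - k) for k in range(7)]
tri = [w / 9 for w in (1, 2, 3, 2, 1)]
conv = convolve(binom, tri)

print("binomial log-concave:   ", is_log_concave(binom))
print("triangular log-concave: ", is_log_concave(tri))
print("convolution log-concave:", is_log_concave(conv))

# A bimodal sequence fails the inequality at its middle index.
print("bimodal log-concave:    ", is_log_concave([1.0, 0.1, 1.0]))
```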

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central

Nonparametric estimation of multivariate convex-transformed densities

Author: Seregin Arseni
Wellner Jon A.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 14/11/2012

We study estimation of multivariate densities $p$ of the form $p(x)=h(g(x))$ for $x\in\mathbb{R}^d$ and for a fixed monotone function $h$ and an unknown convex function $g$. The canonical example is $h(y)=e^{-y}$ for $y\in\mathbb{R}$; in this case, the resulting class of densities $\mathcal{P}(e^{-y})=\{p=\exp(-g): g \text{ is convex}\}$ is well known as the class of log-concave densities. Other functions $h$ allow for classes of densities with heavier tails than the log-concave class. We first investigate when the maximum likelihood estimator $\hat{p}$ exists for the class $\mathcal{P}(h)$ for various choices of monotone transformations $h$, including decreasing and increasing functions $h$. The resulting models for increasing transformations $h$ extend the classes of log-convex densities studied previously in the econometrics literature, corresponding to $h(y)=\exp(y)$. We then establish consistency of the maximum likelihood estimator for fairly general functions $h$, including the log-concave class $\mathcal{P}(e^{-y})$ and many others. In a final section, we provide asymptotic minimax lower bounds for the estimation of $p$ and its vector of derivatives at a fixed point $x_0$ under natural smoothness hypotheses on $h$ and $g$. The proofs rely heavily on results from convex analysis. Comment: Published at http://dx.doi.org/10.1214/10-AOS840 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
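
To make the form $p(x)=h(g(x))$ concrete, the sketch below uses a one-dimensional Student-t shape as an assumed example (it is not taken from the abstract): $g(x)=1+x^2/\nu$ is convex and $h(y)=y^{-(\nu+1)/2}$ is decreasing, yet $-\log p$ is not convex, so the density has heavier tails than any log-concave density. Convexity is probed numerically with a central second difference; the function names are hypothetical.

```python
import math


def second_difference(f, x, step=1e-3):
    """Central second difference, a numerical proxy for f''(x)."""
    return (f(x + step) - 2 * f(x) + f(x - step)) / step**2


NU = 3.0  # degrees of freedom, chosen only for illustration


def g(x):
    """Convex inner function."""
    return 1.0 + x * x / NU


def h(y):
    """Fixed decreasing transformation on y > 0."""
    return y ** (-(NU + 1.0) / 2.0)


def neg_log_p(x):
    """-log of the unnormalized density p(x) = h(g(x))."""
    return -math.log(h(g(x)))


for x in (0.0, 1.0, 3.0):
    print(
        f"x={x:.1f}  g''~{second_difference(g, x):+.4f}  "
        f"(-log p)''~{second_difference(neg_log_p, x):+.4f}"
    )
```

A positive second difference of `g` at every point is consistent with convexity of $g$, while a negative second difference of `neg_log_p` in the tails shows that this $p$ falls outside the log-concave class $\mathcal{P}(e^{-y})$.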

arXiv.org e-Print Archive

Crossref

Estimation of a $k$-monotone density: limit distribution theory and the spline connection

Author: Balabdaoui Fadoua
Wellner Jon A.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2007

We study the asymptotic behavior of the Maximum Likelihood and Least Squares Estimators of a $k$-monotone density $g_0$ at a fixed point $x_0$ when $k>2$. We find that the $j$th derivative of the estimators at $x_0$ converges at the rate $n^{-(k-j)/(2k+1)}$ for $j=0,\ldots,k-1$. The limiting distribution depends on an almost surely uniquely defined stochastic process $H_k$ that stays above (below) the $k$-fold integral of Brownian motion plus a deterministic drift when $k$ is even (odd). Both the MLE and LSE are known to be splines of degree $k-1$ with simple knots. Establishing the order of the random gap $\tau_n^+-\tau_n^-$, where $\tau_n^{\pm}$ denote two successive knots, is a key ingredient of the proof of the main results. We show that this "gap problem" can be solved if a conjecture about the upper bound on the error in a particular Hermite interpolation via odd-degree splines holds. Comment: Published at http://dx.doi.org/10.1214/009053607000000262 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
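
The rate $n^{-(k-j)/(2k+1)}$ is easy to tabulate. The following sketch simply evaluates the exponent from the formula in the abstract as an exact fraction; the function name `rate_exponent` and the choice of $k=3$ and $k=4$ are assumptions made for illustration.

```python
from fractions import Fraction


def rate_exponent(k, j):
    """Exponent of n in the rate n^{-(k-j)/(2k+1)} for the j-th derivative."""
    if not 0 <= j <= k - 1:
        raise ValueError("j must satisfy 0 <= j <= k - 1")
    return Fraction(-(k - j), 2 * k + 1)


for k in (3, 4):
    for j in range(k):
        print(f"k={k}, j={j}: n^({rate_exponent(k, j)})")
```

For fixed $k$ the exponent moves toward zero as $j$ grows, so higher derivatives of the estimators converge more slowly.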

arXiv.org e-Print Archive

Base de publications de l'université Paris-Dauphine

Crossref

GRO.publications (Univ. Göttingen)

Chernoff's density is log-concave

Author: Balabdaoui Fadoua
Wellner Jon A.
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 01/01/2014

We show that the density of $Z=\operatorname{argmax}\{W(t)-t^2\}$, sometimes known as Chernoff's density, is log-concave. We conjecture that Chernoff's density is strongly log-concave or "super-Gaussian", and provide evidence in support of the conjecture. Comment: Published at http://dx.doi.org/10.3150/12-BEJ483 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
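
The random variable $Z$ can be approximated by simulation. The sketch below is a crude Monte Carlo illustration, not the method of the paper: it assumes $W$ is two-sided Brownian motion started at zero, replaces it with Gaussian random walks on a grid, and truncates the search to $[-3, 3]$. The grid step, truncation width, and seed are arbitrary choices.

```python
import math
import random
import statistics


def simulate_argmax(rng, half_width=3.0, step=0.01):
    """Approximate argmax_t {W(t) - t^2} on a grid over [-half_width, half_width].

    W is approximated by two independent Gaussian random walks started
    at W(0) = 0, one for t > 0 and one for t < 0.
    """
    n_steps = int(round(half_width / step))
    sd = math.sqrt(step)
    best_t, best_val = 0.0, 0.0  # the value at t = 0 is W(0) - 0 = 0
    for sign in (1.0, -1.0):
        w = 0.0
        for i in range(1, n_steps + 1):
            w += rng.gauss(0.0, sd)
            t = sign * i * step
            val = w - t * t
            if val > best_val:
                best_t, best_val = t, val
    return best_t


rng = random.Random(2014)
draws = [simulate_argmax(rng) for _ in range(1000)]
print("sample mean:", statistics.mean(draws))
print("sample sd:  ", statistics.stdev(draws))
```

By the symmetry of the construction, the draws should look roughly symmetric about zero; a histogram of `draws` gives a rough picture of the unimodal shape that log-concavity implies.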

arXiv.org e-Print Archive

CiteSeerX

Base de publications de l'université Paris-Dauphine

Crossref