109 research outputs found
Quantifying nuisance parameter effects via decompositions of asymptotic refinements for likelihood-based statistics
Sit still and pay attention: Using the Wii Balance-Board to detect lapses in concentration in children during psychophysical testing.
During psychophysical testing, a loss of concentration can cause observers to answer incorrectly, even when the stimulus is clearly perceptible. Such lapses limit the accuracy and speed of many psychophysical measurements. This study evaluates an automated technique for detecting lapses based on body movement (postural instability). Thirty-five children (8-11 years of age) and 34 adults performed a typical psychophysical task (orientation discrimination) while seated on a Wii Fit Balance Board: a gaming device that measures center of pressure (CoP). Incorrect responses on suprathreshold catch trials provided the "reference standard" measure of when lapses in concentration occurred. Children exhibited significantly greater variability in CoP on lapse trials, indicating that postural instability provides a feasible, real-time index of concentration. Limitations and potential applications of this method are discussed.
Reanalysis of cancer mortality in Japanese A-bomb survivors exposed to low doses of radiation: bootstrap and simulation methods
Background: The International Commission on Radiological Protection (ICRP) recommended annual occupational dose limit is 20 mSv. Cancer mortality in Japanese A-bomb survivors exposed to less than 20 mSv external radiation in 1945 was analysed previously, using a latency model with non-linear dose response. Questions were raised regarding statistical inference with this model.
Methods: Cancers with over 100 deaths in the 0 - 20 mSv subcohort of the 1950-1990 Life Span Study are analysed with Poisson regression models incorporating latency, allowing linear and non-linear dose response. Bootstrap percentile and Bias-corrected accelerated (BCa) methods and simulation of the Likelihood Ratio Test lead to Confidence Intervals for Excess Relative Risk (ERR) and tests against the linear model.
Results: The linear model shows significant large, positive values of ERR for liver and urinary cancers at latencies from 37 - 43 years. Dose response below 20 mSv is strongly non-linear at the optimal latencies for the stomach (11.89 years), liver (36.9), lung (13.6), leukaemia (23.66), and pancreas (11.86) and across broad latency ranges. Confidence Intervals for ERR are comparable using Bootstrap and Likelihood Ratio Test methods and BCa 95% Confidence Intervals are strictly positive across latency ranges for all 5 cancers. Similar risk estimates for 10 mSv (lagged dose) are obtained from the 0 - 20 mSv and 5 - 500 mSv data for the stomach, liver, lung and leukaemia. Dose response for the latter 3 cancers is significantly non-linear in the 5 - 500 mSv range.
Conclusion: Liver and urinary cancer mortality risk is significantly raised using a latency model with linear dose response. A non-linear model is strongly superior for the stomach, liver, lung, pancreas and leukaemia. Bootstrap and Likelihood-based confidence intervals are broadly comparable and ERR is strictly positive by bootstrap methods for all 5 cancers. Except for the pancreas, similar estimates of latency and risk from 10 mSv are obtained from the 0 - 20 mSv and 5 - 500 mSv subcohorts. Large and significant cancer risks for Japanese survivors exposed to less than 20 mSv external radiation from the atomic bombs in 1945 cast doubt on the ICRP recommended annual occupational dose limit.
Retroviral Integration Process in the Human Genome: Is It Really Non-Random? A New Statistical Approach
Retroviral vectors are widely used in gene therapy to introduce therapeutic genes into patients' cells, since, once delivered to the nucleus, the genes of interest are stably inserted (integrated) into the target cell genome. There is now compelling evidence that integration of retroviral vectors follows non-random patterns in mammalian genome, with a preference for active genes and regulatory regions. In particular, Moloney Leukemia Virus (MLV)–derived vectors show a tendency to integrate in the proximity of the transcription start site (TSS) of genes, occasionally resulting in the deregulation of gene expression and, where proto-oncogenes are targeted, in tumor initiation. This has drawn the attention of the scientific community to the molecular determinants of the retroviral integration process as well as to statistical methods to evaluate the genome-wide distribution of integration sites. In recent approaches, the observed distribution of MLV integration distances (IDs) from the TSS of the nearest gene is assumed to be non-random by empirical comparison with a random distribution generated by computational simulation procedures. To provide a statistical procedure to test the randomness of the retroviral insertion pattern, we propose a probability model (Beta distribution) based on IDs between two consecutive genes. We apply the procedure to a set of 595 unique MLV insertion sites retrieved from human hematopoietic stem/progenitor cells. The statistical goodness of fit test shows the suitability of this distribution to the observed data. Our statistical analysis confirms the preference of MLV-based vectors to integrate in promoter-proximal regions
A bootstrap approach for assessing the uncertainty of outcome probabilities when using a scoring system
Background: Scoring systems are a very attractive family of clinical predictive models, because the patient score can be calculated without using any data processing system. Their weakness lies in the difficulty of associating a reliable prognostic probability with each score. In this study a bootstrap approach for estimating confidence intervals of outcome probabilities is described and applied to design and optimize the performance of a scoring system for morbidity in intensive care units after heart surgery.
Methods: The bias-corrected and accelerated bootstrap method was used to estimate the 95% confidence intervals of outcome probabilities associated with a scoring system. These confidence intervals were calculated for each score and each step of the scoring-system design by means of one thousand bootstrapped samples. 1090 consecutive adult patients who underwent coronary artery bypass graft were assigned at random to two groups of equal size, so as to define random training and testing sets with equal percentage morbidities. A collection of 78 preoperative, intraoperative and postoperative variables were considered as likely morbidity predictors.
Results: Several competing scoring systems were compared on the basis of discrimination, generalization and uncertainty associated with the prognostic probabilities. The results showed that confidence intervals corresponding to different scores often overlapped, making it convenient to unite and thus reduce the score classes. After uniting two adjacent classes, a model with six score groups not only gave a satisfactory trade-off between discrimination and generalization, but also enabled patients to be allocated to classes, most of which were characterized by well separated confidence intervals of prognostic probabilities.
Conclusions: Scoring systems are often designed solely on the basis of discrimination and generalization characteristics, to the detriment of prediction of a trustworthy outcome probability. The present example demonstrates that using a bootstrap method for the estimation of outcome-probability confidence intervals provides useful additional information about score-class statistics, guiding physicians towards the most convenient model for predicting morbidity outcomes in their clinical context.
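As a rough illustration of the idea in this abstract, the sketch below estimates a bootstrap confidence interval for the outcome probability of a single score class. It is not the authors' procedure: it uses the simpler percentile bootstrap rather than the bias-corrected and accelerated (BCa) method named in the Methods, and the function name `bootstrap_percentile_ci` and the patient counts are hypothetical values chosen only for the example.

```python
import random


def bootstrap_percentile_ci(outcomes, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the proportion of 1s in `outcomes`.

    Note: this is the plain percentile method, not BCa.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    estimates = []
    for _ in range(n_boot):
        # Resample patients with replacement and recompute the proportion.
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        estimates.append(sum(resample) / n)
    estimates.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical score class: 12 of 80 patients experienced morbidity.
outcomes = [1] * 12 + [0] * 68
low, high = bootstrap_percentile_ci(outcomes)
print(f"observed proportion: {sum(outcomes) / len(outcomes):.3f}")
print(f"95% percentile bootstrap CI: ({low:.3f}, {high:.3f})")
```

Running the same function for each score class and comparing the resulting intervals is one way to see whether adjacent classes overlap, which is the kind of evidence the abstract describes using to merge classes.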
Homoplasy corrected estimation of genetic similarity from AFLP bands, and the effect of the number of bands on the precision of estimation
AFLP is a DNA fingerprinting technique, resulting in binary band presence–absence patterns, called profiles, with known or unknown band positions. We model AFLP as a sampling procedure of fragments, with lengths sampled from a distribution. Bands represent fragments of specific lengths. We focus on estimation of pairwise genetic similarity, defined as average fraction of common fragments, by AFLP. Usual estimators are Dice (D) or Jaccard coefficients. D overestimates genetic similarity, since identical bands in profile pairs may correspond to different fragments (homoplasy). Another complicating factor is the occurrence of different fragments of equal length within a profile, appearing as a single band, which we call collision. The bias of D increases with larger numbers of bands, and lower genetic similarity. We propose two homoplasy- and collision-corrected estimators of genetic similarity. The first is a modification of D, replacing band counts by estimated fragment counts. The second is a maximum likelihood estimator, only applicable if band positions are available. Properties of the estimators are studied by simulation. Standard errors and confidence intervals for the first are obtained by bootstrapping, and for the second by likelihood theory. The estimators are nearly unbiased, and have for most practical cases smaller standard error than D. The likelihood-based estimator generally gives the highest precision. The relationship between fragment counts and precision is studied using simulation. The usual range of band counts (50–100) appears nearly optimal. The methodology is illustrated using data from a phylogenetic study on lettuce
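For orientation, the two usual estimators named in this abstract can be computed from a pair of binary band profiles as in the sketch below. It shows only the standard, uncorrected Dice and Jaccard coefficients; the homoplasy- and collision-corrected estimators proposed in the paper are not reproduced here, and the two example profiles are made up for illustration.

```python
def dice(profile_a, profile_b):
    """Dice coefficient: 2 * shared bands / (bands in A + bands in B)."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    total = sum(profile_a) + sum(profile_b)
    return 2 * shared / total if total else 0.0


def jaccard(profile_a, profile_b):
    """Jaccard coefficient: shared bands / bands present in either profile."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    union = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return shared / union if union else 0.0


# Hypothetical presence (1) / absence (0) patterns at six band positions.
profile_a = [1, 1, 0, 1, 0, 1]
profile_b = [1, 0, 0, 1, 1, 1]
print("Dice:", dice(profile_a, profile_b))
print("Jaccard:", jaccard(profile_a, profile_b))
```

Both functions count a band as shared whenever it is present in both profiles, which is exactly the step where homoplasy (identical bands from different fragments) can inflate the estimate.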
Using the bootstrap to establish statistical significance for relative validity comparisons among patient-reported outcome measures
Challenges in Survey Research
While being an important and often used research method, survey research has
been less often discussed on a methodological level in empirical software
engineering than other types of research. This chapter compiles a set of
important and challenging issues in survey research based on experiences with
several large-scale international surveys. The chapter covers theory building,
sampling, invitation and follow-up, statistical as well as qualitative analysis
of survey data and the usage of psychometrics in software engineering surveys.
Comment: Accepted version of a chapter in the upcoming book on Contemporary
Empirical Methods in Software Engineering. Update includes revision of typos
and additional figures. Last update includes fixing two small issues and
a typo.
Sex-Specific Genetic Structure and Social Organization in Central Asia: Insights from a Multi-Locus Study
In the last two decades, mitochondrial DNA (mtDNA) and the non-recombining portion of the Y chromosome (NRY) have been extensively used in order to measure the maternally and paternally inherited genetic structure of human populations, and to infer sex-specific demography and history. Most studies converge towards the notion that among populations, women are genetically less structured than men. This has been mainly explained by a higher migration rate of women, due to patrilocality, a tendency for men to stay in their birthplace while women move to their husband's house. Yet, since population differentiation depends upon the product of the effective number of individuals within each deme and the migration rate among demes, differences in male and female effective numbers and sex-biased dispersal have confounding effects on the comparison of genetic structure as measured by uniparentally inherited markers. In this study, we develop a new multi-locus approach to analyze jointly autosomal and X-linked markers in order to aid the understanding of sex-specific contributions to population differentiation. We show that in patrilineal herder groups of Central Asia, in contrast to bilineal agriculturalists, the effective number of women is higher than that of men. We interpret this result, which could not be obtained by the analysis of mtDNA and NRY alone, as the consequence of the social organization of patrilineal populations, in which genetically related men (but not women) tend to cluster together. This study suggests that differences in sex-specific migration rates may not be the only cause of contrasting male and female differentiation in humans, and that differences in effective numbers do matter
The Brain Matures with Stronger Functional Connectivity and Decreased Randomness of Its Network
We investigated the development of the brain's functional connectivity throughout the life span (ages 5 through 71 years) by measuring EEG activity in a large population-based sample. Connectivity was established with Synchronization Likelihood. Relative randomness of the connectivity patterns was established with Watts and Strogatz' (1998) graph parameters C (local clustering) and L (global path length) for alpha (∼10 Hz), beta (∼20 Hz), and theta (∼4 Hz) oscillation networks. From childhood to adolescence large increases in connectivity in alpha, theta and beta frequency bands were found that continued at a slower pace into adulthood (peaking at ∼50 yrs). Connectivity changes were accompanied by increases in L and C reflecting decreases in network randomness or increased order (peak levels reached at ∼18 yrs). Older age (55+) was associated with weakened connectivity. Semi-automatically segmented T1-weighted MRI images of 104 young adults revealed that connectivity was significantly correlated to cerebral white matter volume (alpha oscillations: r = .33, p<.01; theta: r = .22, p<.05), while path length was related to both white matter (alpha: max. r = .38, p<.001) and gray matter (alpha: max. r = .36, p<.001; theta: max. r = .36, p<.001) volumes. In conclusion, EEG connectivity and graph theoretical network analysis may be used to trace structural and functional development of the brain.
