
189 research outputs found

Choosing summary statistics by least angle regression for approximate Bayesian computation

Author: Andreas Futschik
Beaumont M.A.
Blum M.G.B.
Breiman L.
Hudson R.R.
Ijaz Hussain
Joyce P.
Marjoram P.
Mitwali Abd-el.Moemen
Muhammad Faisal
Nunes M.A.
Publication venue: 'Informa UK Limited'
Publication date: 16/12/2015

Bayesian statistical inference relies on the posterior distribution. Depending on the model, the posterior can be more or less difficult to derive. In recent years, there has been a lot of interest in complex settings where the likelihood is analytically intractable. In such situations, approximate Bayesian computation (ABC) provides an attractive way of carrying out Bayesian inference. To obtain reliable posterior estimates, however, it is important to keep the approximation errors small in ABC. The choice of an appropriate set of summary statistics plays a crucial role in this effort. Here, we report the development of a new algorithm that is based on least angle regression for choosing summary statistics. In two population genetic examples, the performance of the new algorithm is better than that of a previously proposed approach that uses partial least squares. Higher Education Commission (HEC), College Deanship of Scientific Research, King Saud University, Riyadh Saudi Arabia - research group project RGP-VPP-280

Crossref

Bradford Scholars

Global parameter identification of stochastic reaction networks from single trajectories

Author: A Auger
AL Barabási
B Munsky
C Andrieu
CL Müller
DT Gillespie
DT Gillespie
E Cinquemani
ET Jaynes
G Kjellström
G Kjellström
G Kjellström
G Stock
H Kitano
H Kitano
H Koeppl
H Qian
JA Helmuth
JK Pritchard
JR Lakowicz
K Koutroumpas
L Ljung
M Hafner
O Mason
O Wolkenhauer
P Marjoram
R Albert
R Grima
R Ramaswamy
R Ramaswamy
R Ramaswamy
R Ramaswamy
R Ramaswamy
RJ Boys
S Reinker
SH Strogatz
SK Poovathingal
T Toni
TG Kurtz
Publication date: 01/01/2011

We consider the problem of inferring the unknown parameters of a stochastic biochemical network model from a single measured time-course of the concentration of some of the involved species. Such measurements are available, e.g., from live-cell fluorescence microscopy in image-based systems biology. In addition, fluctuation time-courses from, e.g., fluorescence correlation spectroscopy provide additional information about the system dynamics that can be used to more robustly infer parameters than when considering only mean concentrations. Estimating model parameters from a single experimental trajectory enables single-cell measurements and quantification of cell-cell variability. We propose a novel combination of an adaptive Monte Carlo sampler, called Gaussian Adaptation, and efficient exact stochastic simulation algorithms that allows parameter identification from single stochastic trajectories. We benchmark the proposed method on a linear and a non-linear reaction network at steady state and during transient phases. In addition, we demonstrate that the present method also provides an ellipsoidal volume estimate of the viable part of parameter space and is able to estimate the physical volume of the compartment in which the observed reactions take place. Comment: Article in print as a book chapter in Springer's "Advances in Systems Biology"

arXiv.org e-Print Archive

Crossref

ZORA

Bayesian Parameter Estimation for Latent Markov Random Fields and Social Networks

Author: Andrieu C.
Andrieu C.
Beaumont M. A.
Besag J.
Besag J.
Besag J.
Caimo A.
Carter C.
Del Moral P.
Frank O.
Friel N.
Geyer C. J.
Geyer C. J.
Green P. J.
Grelaud A.
Hamze F.
Higdon D. M.
Koskinen J. H.
Marjoram P.
Murray I.
Murray I.
Møller J.
Neal R.
Pritchard J. K.
Propp J. G.
Richard G. Everitt
Robert C. P.
Sisson S. A.
Snijders T. A. B.
Tierney L.
Wasserman S.
Publication date: 01/01/2012

Undirected graphical models are widely used in statistics, physics and machine vision. However, Bayesian parameter estimation for undirected models is extremely challenging, since evaluation of the posterior typically involves the calculation of an intractable normalising constant. This problem has received much attention, but very little of this has focussed on the important practical case where the data consists of noisy or incomplete observations of the underlying hidden structure. This paper specifically addresses this problem, comparing two alternative methodologies. In the first of these approaches, particle Markov chain Monte Carlo (Andrieu et al., 2010) is used to efficiently explore the parameter space, combined with the exchange algorithm (Murray et al., 2006) for avoiding the calculation of the intractable normalising constant (a proof showing that this combination targets the correct distribution is found in a supplementary appendix online). This approach is compared with approximate Bayesian computation (Pritchard et al., 1999). Applications to estimating the parameters of Ising models and exponential random graphs from noisy data are presented. Each algorithm used in the paper targets an approximation to the true posterior due to the use of MCMC to simulate from the latent graphical model, in lieu of being able to do this exactly in general. The supplementary appendix also describes the nature of the resulting approximation. Comment: 26 pages, 2 figures, accepted in Journal of Computational and Graphical Statistics (http://www.amstat.org/publications/jcgs.cfm)

arXiv.org e-Print Archive

Central Archive at the University of Reading

CiteSeerX

Crossref

Warwick Research Archives Portal Repository

MSMC and MSMC2: the multiple sequentially markovian coalescent

Author: 1001 Genomes Consortium
AS Malaspinas
CM Hung
GAT McVean
L Pagani
L Pagani
LAF Frantz
M Malinsky
M. Raghavan
P Marjoram
S Mallick
S Schiffels
TM Beissinger
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

The Multiple Sequentially Markovian Coalescent (MSMC) is a population genetic method and software for inferring demographic history and population structure through time from genome sequences. Here we describe the main program MSMC and its successor MSMC2. We go through all the necessary steps of processing genomic data from BAM files all the way to generating plots of inferred population size and separation histories. Some background on the methodology itself is provided, as well as bash scripts and Python source code to run the necessary programs. The reader is also referred to community resources such as a mailing list and GitHub repositories for further advice.

Crossref

MPG.PuRe

A minimal descriptor of an ancestral recombinations graph

Author: Asif Javed
B Padhukasahasram
C Wiuf
GAT McVean
GK Chen
J Hein
L L Liang
L Parida
L Parida
Laxmi Parida
M Arenas
M Jobling
P Marjoram
Pier Francesco Palamara
R Bürger
RC Griffiths
RR Hudson
RR Hudson
S Schaffner
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: An Ancestral Recombinations Graph (ARG) is a phylogenetic structure that encodes both duplication events, such as mutations, as well as genetic exchange events, such as recombinations: this captures the (genetic) dynamics of a population evolving over generations. Results: In this paper, we identify a structure-preserving and samples-preserving core of an ARG G and call it the minimal descriptor ARG of G. Its structure-preserving characteristic ensures that all the branch lengths of the marginal trees of the minimal descriptor ARG are identical to those of G, and the samples-preserving property asserts that the patterns of genetic variation in the samples of the minimal descriptor ARG are exactly the same as those of G. We also prove that even an unbounded G has a finite minimal descriptor that continues to preserve certain (graph-theoretic) properties of G, and for an appropriate class of ARGs, our estimate (Eqn 8) as well as empirical observation is that the expected reduction in the number of vertices is exponential. Conclusions: Based on the definition of this lossless and bounded structure, we derive local properties of the vertices of a minimal descriptor ARG, which lend themselves very naturally to the design of efficient sampling algorithms. We further show that a class of minimal descriptors, that of binary ARGs, models the standard coalescent exactly (Thm 6).

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Probabilistic machine learning and artificial intelligence.

Author: A Doucet
A Gelman
A Korattikara
A Krizhevsky
A O'Hagan
A Pfeffer
A Pfeffer
A Pfeffer
B Bakker
B De Finetti
B Fischer
B Milch
B Paige
C Freer
C Kemp
C Lu
C Shannon
C Thornton
CE Rasmussen
CE Rasmussen
CE Rasmussen
CM Bishop
CM Bishop
D Koller
D Koller
D Wingate
DE Wolstenholme
DJ Hand
DJ Lunn
DJC MacKay
DM Wolpert
DR Jones
ET Jaynes
F Wood
F Wood
G Hinton
GE Hinton
GF Marcus
H Kushner
H Robbins
I Sutskever
J Bergstra
J Hensman
J Snoek
JB Tenenbaum
JM Hernández-Lobato
JR Lloyd
K Doya
K Miller
KP Murphy
KS Van Horn
L Li
LR Rabiner
M Girolami
M Hoffman
M Jordan
M Medvedovic
M Schmidt
M Welling
MI Jordan
MP Deisenroth
N Goodman
N Hjort
N Houlsby
ND Goodman
ND Goodman
P Diaconis
P Hennig
P Marjoram
P Orbanz
P Poupart
P Sermanet
RB Grosse
RD King
RM Neal
RM Neal
RM Neal
RM Neal
RP Adams
RT Cox
S Deneve
S Russell
S Thrun
SJ Russell
TL Griffiths
TL Griffiths
TP Minka
TP Minka
TS Ferguson
V Mansinghka
WH Jefferys
Y Bengio
YW Teh
Z Ghahramani
Publication venue: 'The Nature Conservancy'
Publication date: 01/05/2015

How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery. The author acknowledges an EPSRC grant EP/I036575/1, the DARPA PPAML programme, a Google Focused Research Award for the Automatic Statistician and support from Microsoft Research. This is the author accepted manuscript. The final version is available from NPG at http://www.nature.com/nature/journal/v521/n7553/full/nature14541.html#abstract

Crossref

Apollo (Cambridge)

ABCtoolbox: a versatile toolkit for approximate Bayesian computations

Author: C Leuenberger
CCK Boyce
Christoph Leuenberger
D Wegmann
Daniel Wegmann
G Hamilton
G Heckel
G Laval
G Weiss
JK Pritchard
JM Cornuet
JS Lopes
K Thornton
L Excoffier
Laurent Excoffier
M Beaumont
M Chadeau-Hyam
M Currat
M Schweizer
M Tenehaus
MA Beaumont
MA Beaumont
O Ratmann
P Bortot
P Marjoram
P Marjoram
RR Hudson
S Braaker
S Fink
S Tavaré
SA Sisson
SA Sisson
Samuel Neuenschwander
T Toni
Publication venue: BioMed Central
Publication date: 01/01/2010

BACKGROUND: The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms allowing one to obtain parameter posterior distributions based on simulations not requiring likelihood computations. RESULTS: Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computations (ABC). It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a Particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of the ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females. CONCLUSION: ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from parameter sampling from prior distributions, data simulations, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results
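The abstract above names rejection sampling as one of the ABC algorithms. As a rough illustration of that general idea only, and not of ABCtoolbox's own implementation or interface, the following minimal Python sketch draws parameters from a prior, simulates data, and keeps the draws whose summary statistic lies within a tolerance of the observed one. The function `abc_rejection`, its parameters, the Gaussian toy model, the uniform prior, and the tolerance value are all assumptions made for this example.

```python
import random
import statistics


def abc_rejection(observed, prior_sample, simulate, summary, tolerance, n_draws, rng):
    """Generic ABC rejection sampler (illustrative; all names are hypothetical)."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)               # draw a candidate from the prior
        s_sim = summary(simulate(theta, rng))   # simulate data, reduce to a summary
        if abs(s_sim - s_obs) <= tolerance:     # keep candidates that match the data
            accepted.append(theta)
    return accepted


# Toy problem (assumed): infer the mean of a unit-variance Gaussian.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]

posterior = abc_rejection(
    observed,
    prior_sample=lambda r: r.uniform(-5.0, 5.0),                   # flat prior on the mean
    simulate=lambda mu, r: [r.gauss(mu, 1.0) for _ in range(50)],  # forward simulator
    summary=statistics.fmean,                                      # sample mean as summary
    tolerance=0.1,
    n_draws=5000,
    rng=rng,
)

print(len(posterior), "accepted draws; posterior mean ~", statistics.fmean(posterior))
```

No likelihood is evaluated anywhere in the sketch; the accepted draws approximate the posterior, and shrinking `tolerance` trades acceptance rate for accuracy.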

Crossref

Springer - Publisher Connector

UNIL IRIS | Institutional Research Information System

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Bern Open Repository and Information System (BORIS)

A Simulated Annealing Approach to Approximate Bayes Computations

Author: Andreas Scheidegger
B Andresen
C Leuenberger
Carlo Albert
G Ruppeiner
G Weiss
Hans R. Künsch
JM Marin
L Onsager
MA Beaumont
MH Rubin
MM Tanaka
N Metropolis
P Fearnhead
P Marjoram
P Salamon
RD Wilkinson
S Tavaré
T Toni
U Seifert
W Spirkl
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Approximate Bayes Computations (ABC) are used for parameter inference when the likelihood function of the model is expensive to evaluate but relatively cheap to sample from. In particle ABC, an ensemble of particles in the product space of model outputs and parameters is propagated in such a way that its output marginal approaches a delta function at the data and its parameter marginal approaches the posterior distribution. Inspired by Simulated Annealing, we present a new class of particle algorithms for ABC, based on a sequence of Metropolis kernels, associated with a decreasing sequence of tolerances w.r.t. the data. Unlike other algorithms, our class of algorithms is not based on importance sampling. Hence, it does not suffer from a loss of effective sample size due to re-sampling. We prove convergence under a condition on the speed at which the tolerance is decreased. Furthermore, we present a scheme that adapts the tolerance and the jump distribution in parameter space according to some mean-fields of the ensemble, which preserves the statistical independence of the particles, in the limit of infinite sample size. This adaptive scheme aims at converging as close as possible to the correct result with as few system updates as possible via minimizing the entropy production in the system. The performance of this new class of algorithms is compared against two other recent algorithms on two toy examples. Comment: 20 pages, 2 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

Evaluation of methods for detecting conversion events in gene clusters

Author: A Siepel
A Siepel
C Hsu
C Spencer
C Strope
Cathy Riemer
Chih-Hao Hsu
D Husmeier
D Martin
D Martin
D Posada
E Holmes
G Hellenthal
Giltae Song
J Archer
J Archibald
J Chen
J Hein
J Huelsenbeck
J Kim
J Smith
J Stoye
K Lole
L Excoffier
L Liang
M Arenas
M Arenas
M Boni
M Gibbs
M Hasegawa
M Rosenberg
M Suchard
N Grassly
O Westesson
P Marjoram
R Cartwright
R Harris
R Hudson
S Pond
S Sawyer
S Schaffner
T Mailund
V Minin
W Miller
Webb Miller
Y Zhang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Gene clusters are genetically important, but their analysis poses significant computational challenges. One of the major reasons for these difficulties is gene conversion among the duplicated regions of the cluster, which can obscure their true relationships. Many computational methods for detecting gene conversion events have been released, but their performance has not been assessed for wide deployment in evolutionary history studies due to a lack of accurate evaluation methods. Results: We designed a new method that simulates gene cluster evolution, including large-scale events of duplication, deletion, and conversion as well as small mutations. We used this simulation data to evaluate several different programs for detecting gene conversion events. Conclusions: Our evaluation identifies strengths and weaknesses of several methods for detecting gene conversion, which can contribute to more accurate analysis of gene cluster evolution

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

High DNA Methylation Pattern Intratumoral Diversity Implies Weak Selection in Many Human Colorectal Cancers

Author: A Marusyk
A Umar
B Charlesworth
B Drewinko
Chris T. L. Chan
D Shibata
D Shibata
Darryl Shibata
E Quintana
ED Pleasance
FA Sinicrope
GK Dy
I Bozic
J Felsenstein
JH Hoeijmakers
KD Siegmund
Kimberly D. Siegmund
KM Drescher
L Ding
LA Norton
M Kimura
N Navin
NP Bhattacharyya
Paul Marjoram
PC Nowell
PJ Campbell
R Bernards
RP Hill
S Jones
S Yachida
Simon Tavaré
T Sjöblom
TL Wang
Y Hüsemann
YJ Hong
Publication venue: Public Library of Science

It is possible to infer the past of populations by comparing genomes between individuals. In general, older populations have more genomic diversity than younger populations. The force of selection can also be inferred from population diversity. If selection is strong and frequently eliminates less fit variants, diversity will be limited because new, initially homogeneous populations constantly emerge. Here we translate a population genetics approach to human somatic cancer cell populations by measuring genomic diversity within and between small colorectal cancer (CRC) glands. Control tissue culture and xenograft experiments demonstrate that the population diversity of certain passenger DNA methylation patterns is reduced after cloning but subsequently increases with time. When measured in CRC gland populations, passenger methylation diversity from different parts of nine CRCs was relatively high and uniform, consistent with older, stable lineages rather than mixtures of younger homogeneous populations arising from frequent cycles of selection. The diversity of six metastases was also high, suggesting dissemination early after transformation. Diversity was lower in DNA mismatch repair deficient CRC glands, possibly suggesting more selection and the elimination of less fit variants when mutation rates are elevated. The many hitchhiking passenger variants observed in primary and metastatic CRC cell populations are consistent with relatively old populations, suggesting that clonal evolution leading to selective sweeps may be rare after transformation. Selection in human cancers appears to be a weaker force than presumed after transformation, consistent with the observed rarity of driver mutations in cancer genomes. Phenotypic plasticity rather than the stepwise acquisition of new driver mutations may better account for the many different phenotypes within human tumors.

Crossref

Directory of Open Access Journals

PubMed Central