War of the sexes in an ant: clonal reproduction of males and queens
Reliable ABC model choice via random forests
Approximate Bayesian computation (ABC) methods provide an elaborate approach
to Bayesian inference on complex models, including model choice. Both
theoretical arguments and simulation experiments indicate, however, that model
posterior probabilities may be poorly evaluated by standard ABC techniques. We
propose a novel approach based on a machine learning tool named random forests
to conduct selection among the highly complex models covered by ABC algorithms.
We thus modify the way Bayesian model selection is both understood and
operated, in that we rephrase the inferential goal as a classification problem,
first predicting the model that best fits the data with random forests and
postponing the approximation of the posterior probability of the predicted MAP
for a second stage also relying on random forests. Compared with earlier
implementations of ABC model choice, the ABC random forest approach offers
several potential improvements: (i) it often has a larger discriminative power
among the competing models, (ii) it is more robust against the number and
choice of statistics summarizing the data, (iii) the computing effort is
drastically reduced (with a gain in computational efficiency of at least a factor of fifty),
and (iv) it includes an approximation of the posterior probability of the
selected model. The call to random forests will undoubtedly extend the range of
size of datasets and complexity of models that ABC can handle. We illustrate
the power of this novel methodology by analyzing controlled experiments as well
as genuine population genetics datasets. The proposed methodologies are
implemented in the R package abcrf available on CRAN.
Comment: 39 pages, 15 figures, 6 tables
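To make the "standard ABC techniques" baseline concrete, here is a minimal sketch of rejection-based ABC model choice in plain Python. It is not the random forest method of the abstract and does not use abcrf. The two toy models (Normal data with standard deviation 1 or 2), the uniform prior on the mean, the two summary statistics, and the function names are all assumptions made for illustration. Model posterior probabilities are approximated by model frequencies among the simulations closest to the observed summaries.

```python
import math
import random
import statistics


def simulate(model, n, rng):
    """Draw a dataset from one of two hypothetical toy models."""
    mu = rng.uniform(-2.0, 2.0)          # assumed prior on the mean
    sigma = 1.0 if model == 0 else 2.0   # the models differ only in spread
    return [rng.gauss(mu, sigma) for _ in range(n)]


def summaries(data):
    """Summary statistics: sample mean and sample standard deviation."""
    return (statistics.fmean(data), statistics.stdev(data))


def abc_model_choice(observed, n_sims=4000, keep=200, seed=0):
    """Rejection ABC: keep the `keep` simulations nearest to the data."""
    rng = random.Random(seed)
    s_obs = summaries(observed)
    table = []
    for _ in range(n_sims):
        model = rng.randrange(2)         # uniform prior over the two models
        s_sim = summaries(simulate(model, len(observed), rng))
        table.append((math.dist(s_sim, s_obs), model))
    table.sort(key=lambda row: row[0])
    accepted = [model for _, model in table[:keep]]
    # Frequencies among accepted simulations approximate posterior probabilities.
    return {m: accepted.count(m) / keep for m in (0, 1)}
```

The result depends on the chosen summaries and on `keep`, which plays the role of the tolerance. These are the tuning choices that the abstract reports the random forest approach to be more robust against.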
Bayesian computation via empirical likelihood
Approximate Bayesian computation (ABC) has become an essential tool for the
analysis of complex stochastic models when the likelihood function is
numerically unavailable. However, the well-established statistical method of
empirical likelihood provides another route to such settings that bypasses
simulations from the model and the choices of the ABC parameters (summary
statistics, distance, tolerance), while being convergent in the number of
observations. Furthermore, bypassing model simulations may lead to significant
time savings in complex models, for instance those found in population
genetics. The BCel algorithm we develop in this paper also provides an
evaluation of its own performance through an associated effective sample size.
The method is illustrated using several examples, including estimation of
standard distributions, time series, and population genetics models.
Comment: 21 pages, 12 figures, revised version of the previous version with a
new title
ABC random forests for Bayesian parameter inference
This preprint has been reviewed and recommended by Peer Community In
Evolutionary Biology (http://dx.doi.org/10.24072/pci.evolbiol.100036).
Approximate Bayesian computation (ABC) has grown into a standard methodology
that manages Bayesian inference for models associated with intractable
likelihood functions. Most ABC implementations require the preliminary
selection of a vector of informative statistics summarizing raw data.
Furthermore, in almost all existing implementations, the tolerance level that
separates acceptance from rejection of simulated parameter values needs to be
calibrated. We propose to conduct likelihood-free Bayesian inferences about
parameters with no prior selection of the relevant components of the summary
statistics and bypassing the derivation of the associated tolerance level. The
approach relies on the random forest methodology of Breiman (2001) applied in a
(nonparametric) regression setting. We advocate the derivation of a new random
forest for each component of the parameter vector of interest. When compared
with earlier ABC solutions, this method offers significant gains in terms of
robustness to the choice of the summary statistics, does not depend on any type
of tolerance level, and is a good trade-off in terms of the quality of point
estimates and credible intervals for a given computing
time. We illustrate the performance of our methodological proposal and compare
it with earlier ABC methods on a Normal toy example and a population genetics
example dealing with human population evolution. All methods designed here have
been incorporated in the R package abcrf (version 1.7) available on CRAN.
Comment: Main text: 24 pages, 6 figures; Supplementary Information: 14 pages, 5
figures
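The two tuning choices mentioned in this abstract, the summary statistic and the tolerance level, can be seen in a minimal sketch of the classical rejection sampler. This is the baseline that the abstract compares against, not the random forest method and not the abcrf API. The Normal model with known unit variance, the uniform prior, the use of the sample mean as the summary, and all names are assumptions for illustration.

```python
import random
import statistics


def abc_rejection(observed, n_sims=5000, accept_fraction=0.02, seed=0):
    """Rejection ABC for the mean of a Normal(theta, 1) toy model.

    Returns the accepted parameter draws and the implied tolerance.
    """
    rng = random.Random(seed)
    n = len(observed)
    s_obs = statistics.fmean(observed)       # chosen summary statistic
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(-10.0, 10.0)     # assumed prior on theta
        sim = [rng.gauss(theta, 1.0) for _ in range(n)]
        draws.append((abs(statistics.fmean(sim) - s_obs), theta))
    draws.sort(key=lambda row: row[0])
    n_keep = max(1, int(accept_fraction * n_sims))
    tolerance = draws[n_keep - 1][0]         # distance of the last accepted draw
    accepted = [theta for _, theta in draws[:n_keep]]
    return accepted, tolerance
```

The mean of `accepted` gives a point estimate and its empirical quantiles give an approximate credible interval. Both change with `accept_fraction`, which is the calibration step the abstract proposes to bypass.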
Anarchy in the UK: Detailed genetic analysis of worker reproduction in a naturally occurring British anarchistic honeybee, Apis mellifera, colony using DNA microsatellites
Anarchistic behaviour is a very rare phenotype of honeybee colonies. In an anarchistic colony,
many workers’ sons are reared in the presence of the queen. Anarchy has previously
been described in only two Australian colonies. Here we report on a first detailed genetic
analysis of a British anarchistic colony. Male pupae were present in great abundance above
the queen excluder, which was clearly indicative of extensive worker reproduction and is the
hallmark of anarchy. Seventeen microsatellite loci were used to analyse these male pupae,
allowing us to address whether all the males were indeed workers’ sons, and how many
worker patrilines and individual workers produced them. In the sample, 95 of 96 of the
males were definitely workers’ sons. Given that ≈1% of workers’ sons were genetically
indistinguishable from queen’s sons, this suggests that workers do not move any
queen-laid eggs between the part of the colony where the queen is present to the area above
the queen excluder which the queen cannot enter. The colony had 16 patrilines, with an
effective number of patrilines of 9.85. The 75 males that could be assigned with certainty to
a patriline came from 7 patrilines, with an effective number of 4.21. They were the offspring of at least 19 workers. This is in contrast to the two previously studied Australian naturally occurring anarchistic colonies, in which most of the workers’ sons were offspring of one patriline. The high number of patrilines producing males leads to a low mean relatedness between laying workers and males of the colony. We discuss the importance of studying such colonies in the understanding of worker policing and its evolution.
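An "effective number" of patrilines is lower than the raw count whenever patrilines contribute unequally. One common definition is the inverse of the sum of squared patriline proportions, sketched below. The abstract does not say which estimator was used (sample-size-corrected variants exist), so the formula choice, the function name, and the counts in the usage note are assumptions for illustration and are not the study's data.

```python
def effective_number(counts):
    """Inverse of the sum of squared proportions (uncorrected estimator)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    return 1.0 / sum((c / total) ** 2 for c in counts)
```

With hypothetical counts, four equally represented patrilines give `effective_number([10, 10, 10, 10])` equal to 4.0, while a skewed sample such as `[3, 1]` gives 1.6 even though two patrilines are present.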
Chloroplast microsatellites: measures of genetic diversity and the effect of homoplasy
Chloroplast microsatellites have been widely used in population genetic
studies of conifers in recent years. However, their haplotype configurations
suggest that they could have high levels of homoplasy, thus limiting the power
of these molecular markers. A coalescent-based computer simulation was used to
explore the influence of homoplasy on measures of genetic diversity based on
chloroplast microsatellites. The conditions of the simulation were defined to
fit isolated populations originating from the colonization of one single
haplotype into an area left available after a glacial retreat. Simulated data
were compared with empirical data available from the literature for a species
of Pinus that has expanded north after the Last Glacial Maximum. In the
evaluation of genetic diversity, homoplasy was found to have little influence
on Nei's unbiased haplotype diversity (H(E)) while Goldstein's genetic distance
estimates (D2sh) were much more affected. The effect of the number of
chloroplast microsatellite loci for evaluation of genetic diversity is also
discussed.
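For reference, Nei's unbiased haplotype diversity is computed from haplotype frequencies as H_E = n/(n-1) * (1 - sum of p_i^2), where n is the sample size and p_i the frequency of haplotype i. A minimal sketch follows. The function name and the input format (a list of per-haplotype counts) are assumptions for illustration, and the counts in the usage note are hypothetical.

```python
def nei_unbiased_haplotype_diversity(counts):
    """Nei's unbiased diversity from a list of per-haplotype counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sampled individuals are required")
    sum_sq = sum((c / n) ** 2 for c in counts)
    # The n / (n - 1) factor corrects the small-sample bias of 1 - sum(p_i^2).
    return n / (n - 1) * (1.0 - sum_sq)
```

Because this statistic only records whether haplotypes are identical or different, two haplotypes that became identical in state through homoplasy are simply merged into one class. A distance based on allele-size differences uses more of the information that homoplasy distorts.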
Inferring introduction routes of invasive species using approximate Bayesian computation on microsatellite data
Determining the routes of introduction provides not only information about the history of an invasion process, but also information about the origin and construction of the genetic composition of the invading population. It remains difficult, however, to infer introduction routes from molecular data because of a lack of appropriate methods. We evaluate here the use of an approximate Bayesian computation (ABC) method for estimating the probabilities of introduction routes of invasive populations based on microsatellite data. We considered the crucial case of a single source population from which two invasive populations originated either serially from a single introduction event or from two independent introduction events. Using simulated datasets, we found that the method gave correct inferences and was robust to many erroneous beliefs. The method was also more efficient than traditional methods based on raw values of statistics such as assignment likelihood or pairwise F_ST. We illustrate some of the features of our ABC method, using real microsatellite datasets obtained for invasive populations of the western corn rootworm, Diabrotica virgifera virgifera. Most computations were performed with the DIYABC program (http://www1.montpellier.inra.fr/CBGP/diyabc/).
Likelihood-free model choice
Fan, and Beaumont (2017). Beyond exposing the potential pitfalls of ABC approximations to posterior probabilities, the review emphasizes mostly the solution proposed by [25] on the use of random forests for aggregating summary statistics and for estimating the posterior probability of the most likely model via a secondary random forest
