The site frequency spectrum of dispensable genes
The differences between DNA sequences within a population are the basis for
inferring the ancestral relationship of the individuals. Within the classical
infinitely many sites model, it is possible to estimate the mutation rate based
on the site frequency spectrum, which consists of the numbers
$C_1, \ldots, C_{n-1}$, where $n$ is the sample size and $C_s$ is the number of
site mutations (Single Nucleotide Polymorphisms, SNPs) which are seen in $s$
genomes. Classical results can be used to compare the observed site frequency
spectrum with its neutral expectation, $E[C_s] = \theta_2/s$, where $\theta_2$
is the scaled site mutation rate. In this paper, we relax the assumption
of the infinitely many sites model that all individuals only carry homologous
genetic material. In particular, it is now well known that bacterial genomes
have the ability to gain and lose genes, such that every single genome is a
mosaic of genes, and genes are present and absent in a random fashion, giving
rise to the dispensable genome. While this presence and absence has been
modeled under neutral evolution within the infinitely many genes model in
previous papers, we link presence and absence of genes with the numbers of site
mutations seen within each gene. In this work we derive a formula for the
expectation of the joint gene and site frequency spectrum, where $G_{k,s}$
denotes the number of mutated sites occurring in exactly $s$ gene sequences,
while the corresponding gene is present in exactly $k$ individuals. We show that
standard estimators of $\theta_2$ for dispensable genes are biased and that the
site frequency spectrum for dispensable genes differs from the classical result.
Comment: 24 pages, 8 figures
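The classical comparison described in this abstract can be illustrated with a minimal sketch. It assumes the SNP data are given as a 0/1 matrix (rows are genomes, columns are sites, 1 marks the derived allele) and uses the standard neutral expectation theta/s together with Watterson's estimator. The function names and the toy matrix are hypothetical and are not taken from the paper; the sketch covers only the classical case in which every genome carries every site, not the dispensable-gene setting the paper studies.

```python
import numpy as np


def site_frequency_spectrum(snps: np.ndarray) -> np.ndarray:
    """Return c with c[s-1] = number of sites whose derived allele
    is seen in exactly s of the n genomes, for s = 1, ..., n-1."""
    n = snps.shape[0]
    derived_counts = snps.sum(axis=0)            # derived-allele count per site
    counts = np.bincount(derived_counts, minlength=n + 1)
    return counts[1:n]                           # drop s = 0 and s = n (not segregating)


def watterson_theta(spectrum: np.ndarray) -> float:
    """Watterson's estimator: segregating sites divided by the harmonic number H_{n-1}."""
    n = len(spectrum) + 1
    harmonic = sum(1.0 / s for s in range(1, n))
    return float(spectrum.sum() / harmonic)


def neutral_expectation(theta: float, n: int) -> np.ndarray:
    """Expected spectrum theta / s under the standard neutral model."""
    return np.array([theta / s for s in range(1, n)])


# Toy data (illustrative only): 4 genomes, 5 sites.
snps = np.array([
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

sfs = site_frequency_spectrum(snps)
theta_hat = watterson_theta(sfs)
expected = neutral_expectation(theta_hat, snps.shape[0])
print("observed:", sfs)
print("expected:", expected)
```

Comparing `sfs` with `expected` entry by entry is the kind of check the abstract refers to. For dispensable genes the abstract states that such standard estimators are biased, so this sketch should not be applied to them unchanged.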
The infinitely many genes model with horizontal gene transfer
The genome of bacterial species is much more flexible than that of
eukaryotes. Moreover, the distributed genome hypothesis for bacteria states
that the total number of genes present in a bacterial population is greater
than the genome of every single individual. The pangenome, i.e. the set of all
genes of a bacterial species (or a sample), comprises the core genes which are
present in all living individuals, and accessory genes, which are carried only
by some individuals. In order to use accessory genes for adaptation to
environmental forces, genes can be transferred horizontally between
individuals. Here, we extend the infinitely many genes model from Baumdicker,
Hess and Pfaffelhuber (2010) for horizontal gene transfer. We take a
genealogical view and give a construction -- called the Ancestral Gene Transfer
Graph -- of the joint genealogy of all genes in the pangenome. As an application,
we compute moments of several statistics (e.g. the number of differences
between two individuals and the gene frequency spectrum) under the infinitely
many genes model with horizontal gene transfer.
Comment: 31 pages, 3 figures
The diversity of a distributed genome in bacterial populations
The distributed genome hypothesis states that the set of genes in a
population of bacteria is distributed over all individuals that belong to the
specific taxon. It implies that certain genes can be gained and lost from
generation to generation. We use the random genealogy given by a Kingman
coalescent in order to superimpose events of gene gain and loss along ancestral
lines. Gene gains occur at a constant rate along ancestral lines. We assume
that gained genes have never been present in the population before. Gene losses
occur at a rate proportional to the number of genes present along the ancestral
line. In this infinitely many genes model we derive moments for several
statistics within a sample: the average number of genes per individual, the
average number of genes differing between individuals, the number of
incongruent pairs of genes, the total number of different genes in the sample
and the gene frequency spectrum. We demonstrate that the model gives a
reasonable fit with gene frequency data from marine cyanobacteria.
Comment: Published at http://dx.doi.org/10.1214/09-AAP657 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
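The sample statistics named in this abstract can be computed empirically from a gene presence/absence matrix. The following is a minimal sketch, assuming a 0/1 matrix with one row per individual and one column per gene. The function names and the toy matrix are hypothetical, and the pairwise difference is counted here as the symmetric difference of two gene sets, which may differ from the paper's exact definition. The sketch does not compute the model's theoretical moments.

```python
from itertools import combinations

import numpy as np


def gene_frequency_spectrum(presence: np.ndarray) -> np.ndarray:
    """Return g with g[k-1] = number of genes present in exactly k individuals."""
    n = presence.shape[0]
    carriers = presence.sum(axis=0)              # number of carriers per gene
    return np.bincount(carriers, minlength=n + 1)[1:]


def average_genes_per_individual(presence: np.ndarray) -> float:
    """Mean number of genes carried by one individual."""
    return float(presence.sum(axis=1).mean())


def average_pairwise_differences(presence: np.ndarray) -> float:
    """Mean number of genes present in exactly one of two individuals."""
    n = presence.shape[0]
    diffs = [int(np.sum(presence[i] != presence[j]))
             for i, j in combinations(range(n), 2)]
    return float(np.mean(diffs))


def pangenome_size(presence: np.ndarray) -> int:
    """Total number of different genes seen in the sample."""
    return int(np.count_nonzero(presence.sum(axis=0)))


# Toy data (illustrative only): 3 individuals, 4 genes.
presence = np.array([
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
])

gfs = gene_frequency_spectrum(presence)
print("gene frequency spectrum:", gfs)           # last entry counts core genes
print("genes per individual:", average_genes_per_individual(presence))
print("pairwise differences:", average_pairwise_differences(presence))
print("pangenome size:", pangenome_size(presence))
```

The last entry of `gfs` counts genes present in every sampled individual (the core genes of the sample), while the remaining entries describe the accessory genes.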
A note on Willmore minimizing Klein bottles in Euclidean space
We show that Lawson's bipolar surface $\tilde\tau_{3,1}$ is, after
stereographic projection, the unique minimizer among immersed Klein bottles in
its conformal class. We conjecture that it actually is the unique minimizer
among immersed Klein bottles into $\mathbb{R}^n$, $n \geq 4$, whose existence
the authors and P. Breuning proved in a previous paper.
Comment: 8 pages
Existence of minimizing Willmore Klein bottles in Euclidean four-space
Let $K$ be a Klein bottle. We show that the infimum of the Willmore energy
among all immersed Klein bottles in Euclidean $n$-space is attained by a smooth
embedded Klein bottle, where $n \geq 4$. There are three distinct regular
homotopy classes of immersed Klein bottles in Euclidean four-space, each one
containing an embedding. One is characterized by the property that it contains
the minimizer just mentioned. For the other two regular homotopy classes we
show that the Willmore energy is bounded from below by $8\pi$. We give a
classification of the minimizers of these two classes. In particular, we prove
the existence of infinitely many distinct embedded Klein bottles in Euclidean
four-space that have Euler normal number $-4$ or $+4$ and Willmore energy
$8\pi$. The surfaces are distinct even when we allow conformal transformations
of the ambient space. As they are all minimizers in their homotopy class, they
are Willmore surfaces.
Comment: final version, to appear in Geometry & Topology
panX: pan-genome analysis and exploration
Horizontal transfer, gene loss, and duplication result in dynamic bacterial genomes shaped by a complex mixture of different modes of evolution. Closely related strains can differ in the presence or absence of many genes, and the total number of distinct genes found in a set of related isolates (the pan-genome) is often many times larger than the genome of individual isolates. We have developed a pipeline that efficiently identifies orthologous gene clusters in the pan-genome. This pipeline is coupled to a powerful yet easy-to-use web-based visualization for interactive exploration of the pan-genome. The visualization consists of connected components that allow rapid filtering and searching of genes and inspection of their evolutionary history. For each gene cluster, panX displays an alignment and a phylogenetic tree, maps mutations within that cluster to the branches of the tree, and infers gain and loss of genes on the core-genome phylogeny. PanX is available at pangenome.de. Custom pan-genomes can be visualized either using a web server or by serving panX locally as a browser-based application.
