1,318 research outputs found

PERSONAL GENOMES

Author: Church G.
Flicek P.
Mardis E.
Ribbs R.
Publication date: 01/09/2010

Cold Spring Harbor Laboratory Institutional Repository

ncRNA orthologies in the vertebrate lineage.

Author: Flicek P
Gordon L
Herrero J
Muffato M
Pignatelli M
Vilella AJ
White S
Publication date: 15/03/2016

Annotation of orthologous and paralogous genes is necessary for many aspects of evolutionary analysis. Methods to infer these homology relationships have traditionally focused on protein-coding genes and evolutionary models used by these methods normally assume the positions in the protein evolve independently. However, as our appreciation for the roles of non-coding RNA genes has increased, consistently annotated sets of orthologous and paralogous ncRNA genes are increasingly needed. At the same time, methods such as PHASE or RAxML have implemented substitution models that consider pairs of sites to enable proper modelling of the loops and other features of RNA secondary structure. Here, we present a comprehensive analysis pipeline for the automatic detection of orthologues and paralogues for ncRNA genes. We focus on gene families represented in Rfam and for which a specific covariance model is provided. For each family, ncRNA genes found in all Ensembl species are aligned using Infernal, and several trees are built using different substitution models. In parallel, a genomic alignment that includes the ncRNA genes and their flanking sequence regions is built with PRANK. This alignment is used to create two additional phylogenetic trees using the neighbour-joining (NJ) and maximum-likelihood (ML) methods. The trees arising from both the ncRNA and genomic alignments are merged using TreeBeST, which reconciles them with the species tree in order to identify speciation and duplication events. The final tree is used to infer the orthologues and paralogues following Fitch's definition. We also determine gene gain and loss events for each family using CAFE. All data are accessible through the Ensembl Comparative Genomics ('Compara') API, on our FTP site and are fully integrated in the Ensembl genome browser, where they can be accessed in a user-friendly manner. Database URL: http://www.ensembl.org

UCL Discovery

PubMed Central

Variant calling on the GRCh38 assembly with the data from phase three of the 1000 Genomes Project

Author: 1000 Genomes Project C.
Clarke L.
Fairley S.
Flicek P.
Lowy-Gallego E.
Ruffier M.
Timmermann B.
Zheng-Bradley X.
Publication venue: 'F1000 Research Ltd'
Publication date: 11/03/2019

We present biallelic SNVs called from 2,548 samples across 26 populations from the 1000 Genomes Project, called directly on GRCh38. We believe this will be a useful reference resource for those using GRCh38, representing an improvement over the “lift-overs” of the 1000 Genomes Project data that have been available to date and providing a resource necessary for the full adoption of GRCh38 by the community. Here, we describe how the call set was created and provide benchmarking data describing how our call set compares to that produced by the final phase of the 1000 Genomes Project on GRCh37.

MPG.PuRe

How and why DNA barcodes underestimate the diversity of microbial eukaryotes

Author: Adam Eyre-Walker
AR Boyko
AZ Worden
B Charlesworth
B Palenik
DT Jones
F Not
G Piganeau
Gwenael Piganeau
Hervé Moreau
J Coyne
J Crow
JJ Welch
K Romari
M Viprey
ML Cuvelier
Nigel Grimsley
P Flicek
P Lopez-Garcia
PD Keightley
Purification Lopez-Garcia
S Gourbiere
S Jancek
S Proost
SB Needleman
SJ Williamson
SL Baldauf
SY Moon-van der Staay
Z Yang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/02/2011

Background: Because many picoplanktonic eukaryotic species cannot currently be maintained in culture, direct sequencing of PCR-amplified 18S ribosomal gene DNA fragments from filtered sea-water has been successfully used to investigate the astounding diversity of these organisms. The recognition of many novel planktonic organisms is thus based solely on their 18S rDNA sequence. However, a species delimited by its 18S rDNA sequence might contain many cryptic species, which are highly differentiated in their protein coding sequences. Principal Findings: Here, we investigate the issue of species identification from one gene to the whole genome sequence. Using 52 whole genome DNA sequences, we estimated the global genetic divergence in protein coding genes between organisms from different lineages and compared this to their ribosomal gene sequence divergences. We show that this relationship between proteome divergence and 18S divergence is lineage dependent. Unicellular lineages have especially low 18S divergences relative to their protein sequence divergences, suggesting that 18S ribosomal genes are too conservative to assess planktonic eukaryotic diversity. We provide an explanation for this lineage dependency, which suggests that most species with large effective population sizes will show far less divergence in 18S than protein coding sequences. Conclusions: There is therefore a trade-off between using genes that are easy to amplify in all species, but which by their nature are highly conserved and underestimate the true number of species, and using genes that give a better description of the number of species, but which are more difficult to amplify. We have shown that this trade-off differs between unicellular and multicellular organisms as a likely consequence of differences in effective population sizes.
We anticipate that biodiversity of microbial eukaryotic species is underestimated and that numerous "cryptic species" will become discernible with the future acquisition of genomic and metagenomic sequences.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Sussex Research Online

Genomics of Divergence along a Continuum of Parapatric Population Differentiation

MM received funding from the Max Planck innovation funds for this project. PGDF was supported by a Marie Curie European Reintegration Grant (proposal nr 270891). CE was supported by German Science Foundation grants (DFG, EI 841/4-1 and EI 841/6-1)

OceanRep

Crossref

Directory of Open Access Journals

PubMed Central

Queen Mary Research Online

Bern Open Repository and Information System (BORIS)

MPG.PuRe

The Francis Crick Institute

A Model-Based Analysis of GC-Biased Gene Conversion in the Human and Chimpanzee Genomes

Author: A Auton
A Kong
A Navarro
A Necşulea
A Ratnakumar
A Siepel
Adam Siepel
AJ Jeffreys
AJ Webb
AP Boyle
BC Lamb
C Kosiol
CC Spencer
CF Mugal
D Karolchik
D Kostka
Dennis Kostka
E Mancera
G Marais
Graham Coop
J Berglund
J Harrow
J Romiguier
JA Capra
JM Chen
John A. Capra
JW IJdo
K Lindblad-Toh
K Pollard
Katherine S. Pollard
L Arbiza
L Duret
LR Meyer
M Blanchette
M Hasegawa
Melissa J. Hubisz
MJ Hubisz
N Galtier
N Lartillot
P Flicek
P Stenson
RD George
S Glémin
S Katzman
S Myers
SE Ptak
ST Sherry
T Nagylaki
TC Brown
TR Dreszer
W Winckler
Y Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

GC-biased gene conversion (gBGC) is a recombination-associated process that favors the fixation of G/C alleles over A/T alleles. In mammals, gBGC is hypothesized to contribute to variation in GC content, rapidly evolving sequences, and the fixation of deleterious mutations, but its prevalence and general functional consequences remain poorly understood. gBGC is difficult to incorporate into models of molecular evolution and so far has primarily been studied using summary statistics from genomic comparisons. Here, we introduce a new probabilistic model that captures the joint effects of natural selection and gBGC on nucleotide substitution patterns, while allowing for correlations along the genome in these effects. We implemented our model in a computer program, called phastBias, that can accurately detect gBGC tracts about 1 kilobase or longer in simulated sequence alignments. When applied to real primate genome sequences, phastBias predicts gBGC tracts that cover roughly 0.3% of the human and chimpanzee genomes and account for 1.2% of human-chimpanzee nucleotide differences. These tracts fall in clusters, particularly in subtelomeric regions; they are enriched for recombination hotspots and fast-evolving sequences; and they display an ongoing fixation preference for G and C alleles. They are also significantly enriched for disease-associated polymorphisms, suggesting that they contribute to the fixation of deleterious alleles. The gBGC tracts provide a unique window into historical recombination processes along the human and chimpanzee lineages. They supply additional evidence of long-term conservation of megabase-scale recombination rates accompanied by rapid turnover of hotspots. Together, these findings shed new light on the evolutionary, functional, and disease implications of gBGC. The phastBias program and our predicted tracts are freely available. © 2013 Capra et al

arXiv.org e-Print Archive

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

D-Scholarship@Pitt

The Francis Crick Institute

Quantitative analysis of chromatin interaction changes upon a 4.3 Mb deletion at mouse 4E2

Author: de Laat W.
de Wit E.
Eckersley-Maslin M. A.
Eils R.
Flicek P.
Harder N.
Mills A.
Mukhopadhyay S.
Ried T.
Rohr K.
Sengupta A. M.
Spector D. L.
Splinter E.
Wong E. S.
Zepeda-Mendoza C. J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

BACKGROUND: Circular chromosome conformation capture (4C) has provided important insights into three dimensional (3D) genome organization and its critical impact on the regulation of gene expression. We developed a new quantitative framework based on polymer physics for the analysis of paired-end sequencing 4C (PE-4Cseq) data. We applied this strategy to the study of chromatin interaction changes upon a 4.3 Mb DNA deletion in mouse region 4E2. RESULTS: A significant number of differentially interacting regions (DIRs) and chromatin compaction changes were detected in the deletion chromosome compared to a wild-type (WT) control. Selected DIRs were validated by 3D DNA FISH experiments, demonstrating the robustness of our pipeline. Interestingly, significant overlaps of DIRs with CTCF/Smc1 binding sites and differentially expressed genes were observed. CONCLUSIONS: Altogether, our PE-4Cseq analysis pipeline provides a comprehensive characterization of DNA deletion effects on chromatin structure and function

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

Utrecht University Repository

University of Melbourne Institutional Repository

UQ eSpace (University of Queensland)

De novo mutations in SMCHD1 cause Bosma arhinia microphthalmia syndrome and abrogate nasal development

Author: A Javed
A McKenna
A-V Gendrel
Abdelaziz Sefiani
Alex C Magee
Alexandra D Gurzau
Asif Javed
Audrey S M Teo
AW Mould
Axel M Hillmer
B Brasseur
Bernd Wollnik
Bruno Reversade
C Bock
Camille Dion
Carine Bonnard
Chalermpong Chatdokmaiprai
Christine Bole-Feysot
Christopher T Gordon
Denise Williams
Dieter Meschede
Duangrurdee Wattanasirichaigoon
F Magdinier
Frédérique Magdinier
GA Van der Auwera
Gökhan Tunçbilek
Gökhan Yigit
H Coker
H Li
H Mishima
Hallvard Reigstad
Hicham Filali
HL Szabo-Rogers
Holger Thiele
Hülya Kayserili
IA Adzhubei
Ilham Ratbi
J Harrow
James M Murphy
Janine Altmüller
JB Tryggestad
JC de Greef
JD Thompson
JE Hewitt
Jeanne Amiel
JM Graham Jr.
K Chen
K Wang
Kelan Chen
Koh-ichiro Yoshiura
L-C Li
LA Kelley
M Tang
M-C Gaillard
MA DePristo
Marnie E Blewitt
MD Nickell
ME Blewitt
Meriem Fikri
Michael L Cunningham
Mung Kei Kong
Myriam Oufadem
N Rosin
Nadine Rosin
Nawfal Fejjal
Nicola Ragge
Nicolas Lévy
NJ Brideau
Nobuhiko Okamoto
NS Verkaik
P Flicek
P Kumar
Patrick Nitschké
PE Forni
Peter Nürnberg
R Dutta
R-S Nozawa
Rachel Irving
RJLF Lemmers
Ruth McGowan
S Faisal Ahmed
S Sacconi
Sabine Sigaudy
Shifeng Xue
Siham Chafai Elalaoui
ST Sherry
Stanislas Lyonnet
Tamara J Beck
Vinod Varghese
Wolfgang Mühlbauer
X Liu
X Robert
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Bosma arhinia microphthalmia syndrome (BAMS) is an extremely rare and striking condition characterized by complete absence of the nose with or without ocular defects. We report here that missense mutations in the epigenetic regulator SMCHD1 mapping to the extended ATPase domain of the encoded protein cause BAMS in all 14 cases studied. All mutations were de novo where parental DNA was available. Biochemical tests and in vivo assays in Xenopus laevis embryos suggest that these mutations may behave as gain-of-function alleles. This finding is in contrast to the loss-of-function mutations in SMCHD1 that have been associated with facioscapulohumeral muscular dystrophy (FSHD) type 2. Our results establish SMCHD1 as a key player in nasal development and provide biochemical insight into its enzymatic function that may be exploited for development of therapeutics for FSHD

Hacettepe University Institutional Repository

Crossref

Kölner UniversitätsPublikationsServer

HAL AMU

GRO.publications

HAL-Inserm

GRO.publications (Univ. Göttingen)

Enlighten

HAL: Hyper Article en Ligne

Hacettepe University Research Information System
