482 research outputs found

DnaSP v5: A software for comprehensive analysis of DNA polymorphism data

Author: Excoffier
Hutter
J. Rozas
Nielsen
P. Librado
Rosenberg
Rozas
Scheet
Shendure
Stephens
Tajima
Vingron
Wang
Young
Publication venue: 'Oxford University Press (OUP)'
Publication date: 08/04/2014

The software is available at: http://hdl.handle.net/2445/53451. DnaSP is a software package for comprehensive analysis of DNA polymorphism data. Version 5 implements a number of new features and analytical methods that allow extensive DNA polymorphism analyses on large datasets. Among other features, the newly implemented methods allow for: (i) analyses on multiple data files; (ii) haplotype phasing; (iii) analyses on insertion/deletion polymorphism data; (iv) visualizing sliding window results integrated with available genome annotations in the UCSC browser.
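As a rough illustration of what a sliding window analysis of polymorphism data involves, the sketch below computes nucleotide diversity (average pairwise differences per site) in fixed windows along a small alignment. It is not DnaSP's implementation: the function names, the toy sequences and the handling of sites (no gaps or missing data) are all assumptions made for this example.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Average number of pairwise differences per site for aligned sequences.

    Assumes equal-length sequences with no gaps or missing data.
    """
    n_pairs = 0
    diffs = 0
    for a, b in combinations(seqs, 2):
        diffs += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return diffs / (n_pairs * len(seqs[0]))


def sliding_pi(seqs, window, step):
    """Return (window_start, diversity) for each full window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, pairwise_diversity(chunk)))
    return results


# Hypothetical toy alignment: three sequences, four sites.
toy_alignment = ["AAAA", "AAAT", "AATT"]
windows = sliding_pi(toy_alignment, window=2, step=2)
```

With this toy alignment the first window is monomorphic and the second carries all the variation, which is the kind of contrast a sliding window plot is meant to expose.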

Crossref

Secretaría de Estado de Cultura

Diposit Digital de la Universitat de Barcelona

Use of partial least squares regression to impute SNP genotypes in Italian Cattle breeds

Author: AJ Chamberlain
APW de Roos
BJ Hayes
BL Browning
C Dimauro
C Hagger
Corrado Dimauro
D Boichard
D Segelke
DP Berry
G Li
G Moser
Gabriele Marras
GCB Schopen
Giustino Gaspa
H Abdi
HA Mulder
HD Daetwyler
I Medugorac
J Chen
JE Pryce
JM Hickey
K Kizilkaya
KA Weigel
Massimo Cellesi
Nicolò PP Macciotta
P Ajmone-Marsan
P Scheet
Paolo Ajmone-Marsan
PM VanRaden
R Dassonneville
Roberto Steri
T Druet
TH Meuwissen
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Background: The objective of the present study was to test the ability of the partial least squares regression technique to impute genotypes from low-density single nucleotide polymorphism (SNP) panels (i.e., 3K or 7K) to a high-density panel with 50K SNP. No pedigree information was used. Methods: Data consisted of 2093 Holstein, 749 Brown Swiss and 479 Simmental bulls genotyped with the Illumina 50K Beadchip. First, a single-breed approach was applied by using only data from Holstein animals. Then, to enlarge the training population, data from the three breeds were combined and a multi-breed analysis was performed. Accuracies of genotypes imputed using the partial least squares regression method were compared with those obtained by using the Beagle software. The impact of genotype imputation on breeding value prediction was evaluated for milk yield, fat content and protein content. Results: In the single-breed approach, the accuracy of imputation using partial least squares regression was around 90% and 94% for the 3K and 7K platforms, respectively; corresponding accuracies obtained with Beagle were around 85% and 90%. Moreover, the computing time required by the partial least squares regression method was on average around 10 times lower than that required by Beagle. Using the partial least squares regression method in the multi-breed analysis resulted in lower imputation accuracies than using single-breed data. The impact of SNP-genotype imputation on the accuracy of direct genomic breeding values was small. The correlation between estimates of genetic merit obtained by using imputed versus actual genotypes was around 0.96 for the 7K chip. Conclusions: The results of the present work suggest that the partial least squares regression imputation method could be useful to impute SNP genotypes when pedigree information is not available.
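The two kinds of figures quoted in this abstract (an imputation accuracy in percent and a correlation between breeding value estimates) can be illustrated with a short sketch. The abstract does not define how accuracy was computed, so treating it as the fraction of genotypes imputed correctly is an assumption here, as are the function names and the toy values.

```python
import math


def concordance(imputed, actual):
    """Fraction of genotypes (coded 0, 1 or 2) that were imputed correctly."""
    if len(imputed) != len(actual) or not actual:
        raise ValueError("genotype lists must be non-empty and of equal length")
    return sum(1 for i, a in zip(imputed, actual) if i == a) / len(actual)


def pearson(x, y):
    """Pearson correlation, e.g. between breeding values from imputed vs actual genotypes."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical genotypes at four masked SNPs for one animal.
imputed_genotypes = [0, 1, 2, 1]
actual_genotypes = [0, 1, 2, 2]
accuracy = concordance(imputed_genotypes, actual_genotypes)
```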

CiteSeerX

Crossref

PubliCatt

Springer - Publisher Connector

PubMed Central

CINECA IRIS Institutial research information system UNISS

UnissResearch

HapTree: A Novel Bayesian Framework for Single Individual Polyplotyping Using NGS Data

Author: A Efros
A Williams
BL Browning
Bonnie Berger
D Aguiar
D He
Deniz Yorukoglu
E Berger
Emily Berger
F Geraci
G Abecasis
Isidore Rigoutsos
Jian Peng
K Zhang
M Stephens
O Delaneau
P Scheet
R Lippert
SR Browning
V Bansal
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/10/2013

As the more recent next-generation sequencing (NGS) technologies provide longer read sequences, the use of sequencing datasets for complete haplotype phasing is fast becoming a reality, allowing haplotype reconstruction of a single sequenced genome. Nearly all previous haplotype reconstruction studies have focused on diploid genomes and are rarely scalable to genomes with higher ploidy. Yet computational investigations into polyploid genomes carry great importance, impacting plant, yeast and fish genomics, as well as the studies of the evolution of modern-day eukaryotes and (epi)genetic interactions between copies of genes. In this paper, we describe a novel maximum-likelihood estimation framework, HapTree, for polyploid haplotype assembly of an individual genome using NGS read datasets. We evaluate the performance of HapTree on simulated polyploid sequencing read data modeled after Illumina sequencing technologies. For triploid and higher ploidy genomes, we demonstrate that HapTree substantially improves haplotype assembly accuracy and efficiency over the state-of-the-art; moreover, HapTree is the first scalable polyplotyping method for higher ploidy. As a proof of concept, we also test our method on real sequencing data from NA12878 (1000 Genomes Project) and evaluate the quality of assembled haplotypes with respect to trio-based diplotype annotation as the ground truth. The results indicate that HapTree significantly improves the switch accuracy within phased haplotype blocks as compared to existing haplotype assembly methods, while producing comparable minimum error correction (MEC) values. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2–5. National Science Foundation (U.S.) (NSF/NIH BIGDATA Grant R01GM108348-01); National Science Foundation (U.S.) (Graduate Research Fellowship); Simons Foundatio
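The two evaluation measures named in this abstract, minimum error correction (MEC) and switch accuracy, can be sketched for the simpler diploid case. This is not HapTree's code and does not cover polyploid genomes: it assumes every site is heterozygous with alleles coded 0/1, reads given as position-to-allele mappings, and hypothetical function names and toy data.

```python
def mec_score(reads, haplotype):
    """Minimum error correction score for a diploid phasing.

    Each read is a dict {site_index: allele}. A read is charged the number of
    mismatches against whichever of the two complementary haplotypes fits best.
    """
    complement = [1 - allele for allele in haplotype]
    total = 0
    for read in reads:
        d_hap = sum(1 for pos, allele in read.items() if haplotype[pos] != allele)
        d_comp = sum(1 for pos, allele in read.items() if complement[pos] != allele)
        total += min(d_hap, d_comp)
    return total


def switch_errors(inferred, truth):
    """Count positions where the inferred phase flips relative to the truth."""
    on_same_strand = [i == t for i, t in zip(inferred, truth)]
    return sum(1 for a, b in zip(on_same_strand, on_same_strand[1:]) if a != b)


# Hypothetical toy data: three reads covering four heterozygous sites.
toy_reads = [{0: 0, 1: 1, 2: 0}, {1: 0, 2: 1, 3: 1}, {2: 0, 3: 1}]
toy_haplotype = [0, 1, 0, 0]
toy_mec = mec_score(toy_reads, toy_haplotype)
```

A lower MEC means fewer read alleles need to be corrected for the reads to agree with the phasing, while the switch count measures long-range phase consistency against a reference such as trio-based phasing.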

Public Library of Science (PLOS)

DSpace@MIT

Crossref

Directory of Open Access Journals

PubMed Central

Scans for signatures of selection in Russian cattle breed genomes reveal new candidate genes for environmental adaptation and acclimation

Author: A Talenti
A Yurchenko
A Zrhidri
AGT Pereira
AK Lindholm-Perry
AR Boyko
AS Wilkins
B Cannon
B Dorshorst
B Grisart
B Haase
B Loureiro
BG Oliver
BS Weir
CB Kaelin
D Boruszewska
D Wright
D Yang
DR Schrider
EA Ostrander
EM Ibeagha-Awemu
F Li
F Schlamp
F Tajima
FB Axelrod
G Valverde
H Li
H Mannen
H Pausch
H Yamada
H Zhang
HD Daetwyler
HP Jedema
I Kurth
I Mathieson
I Naka
I Urbinati
J Kim
J Martin-Tereso
J Queiros
JD Jensen
JD Storey
JE Decker
JJ Simoni Gouveia de
JK Pickrell
K Kim
K Konczol
K Soini
K Wimmers
KC Wollenberg Valero
KE Lotterhos
L Ma
LA Raven
M Cohen-Zinder
M Knoll
M Nei
M Nizon
M Saatchi
MI Fariello
MJ Emmett
MN Weedon
MR Upadhyay
MRS Fortes
NA Mandal
O Delaneau
O Tange
P Danecek
P Scheet
Q Qiu
QL Meng
R Verity
R Weikard
R Xiang
RL Minster
RR Mota
S Boitard
S Bolormaa
S Bongiorni
S Fan
S Makvandi-Nejad
S Moon
S Purcell
S Roth
S Roy
S Sasaki
S Wu
SD Berry
SH Carroll
SJ Yue
SR Grossman
T Iso-Touru
T Nishimaki
TY Yeh
W Barendse
X Zheng
XL Wang
Y Gao
Y Liu
Y Ma
Y Qin
Y Wang
YT Utsunomiya
Z Gu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2018

Domestication and selective breeding have resulted in over 1000 extant cattle breeds. Many of these breeds do not excel in important traits but are adapted to local environments. These adaptations are a valuable source of genetic material for efforts to improve commercial breeds. As a step toward this goal, we identified candidate regions under selection in the genomes of nine Russian native cattle breeds adapted to survive in harsh climates. After comparing our data to other breeds of European and Asian origins, we found known and novel candidate genes that could potentially be related to domestication, economically important traits and environmental adaptations in cattle. The Russian cattle breed genomes contained regions under putative selection with genes that may be related to adaptations to harsh environments (e.g., AQP5, RAD50, and RETREG1). We found genomic signatures of selective sweeps near key genes related to economically important traits, such as milk production (e.g., DGAT1, ABCG2), growth (e.g., XKR4), and reproduction (e.g., CSF2). Our data point to candidate genes which should be included in future studies attempting to identify genes to improve the extant breeds and facilitate generation of commercial breeds that fit better into the environments of Russia and other countries with similar climates.

Crossref

ZENODO

Directory of Open Access Journals

Dryad Digital Repository

RVC Repository (Royal Veterinary College)

Electronic Archiving System

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Enlighten

Haplotype frequencies in a sub-region of chromosome 19q13.3, related to risk and prognosis of cancer, differ dramatically between ethnic groups

Author: A Helgason
A Vangsted
Anne Tjønneland
BA Nexo
Bjørn A Nexø
CD Bustamante
CF Skjelbred
DJ Park
E Rockenbauer
G Gibson
GV Kryukov
Heng Li
J Novembre
J Yin
JC Barrett
Jun Wang
KA Frazer
KA Olaussen
Lars Bolund
M Dybdahl
MI McCarthy
Mikkel H Schierup
MJ Laska
P Scheet
P Sulem
PL Balaresque
R Blekhman
SB Gabriel
T Mailund
Thomas Mailund
U Vogel
Ulla Vogel
V Moreno
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Background: A small region of about 70 kb on human chromosome 19q13.3 encompasses 4 genes, of which 3 (ERCC1, ERCC2, and PPP1R13L, aka RAI) are related to DNA repair and cell survival, and one (CD3EAP, aka ASE1) may be related to cell proliferation. The whole region seems related to the cellular response to external damaging agents, and markers in it are associated with risk of several cancers. Methods: We downloaded the genotypes of all markers typed in the 19q13.3 region in the HapMap populations of European, Asian and African descent and inferred haplotypes. We combined the European HapMap individuals with a Danish breast cancer case-control data set and inferred the association between HapMap haplotypes and disease risk. Results: We found that the susceptibility haplotype in our European sample had increased from 2 to 50 percent very recently in the European population, and to almost the same extent in the Asian population. The cause of this increase is unknown. The maximal proportion of overall genetic variation due to differences between groups for Europeans versus Africans and Europeans versus Asians (the Fst value) closely matched the putative location of the susceptibility variant as judged from haplotype-based association mapping. Conclusion: The combination of a common haplotype causing an increased risk of cancer in Europeans and high differentiation between human populations is highly unusual and suggests a causal relationship with a recent increase in Europeans, caused either by genetic drift overruling selection against the susceptibility variant or by positive selection for the same haplotype. The data do not allow us to distinguish between these two scenarios. The analysis suggests that the region is not involved in cancer risk in Africans and that the susceptibility variants may be more finely mapped in Asian populations.
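The Fst value mentioned here is the proportion of overall genetic variation that is due to differences between groups. A minimal sketch for one biallelic marker and two equally weighted populations follows; the abstract does not say which estimator the authors used, so the heterozygosity-based formula, the function names and the allele frequencies below are assumptions for illustration only.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for a biallelic marker with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """Fst = (H_T - H_S) / H_T for two equally weighted populations.

    H_S is the mean within-population heterozygosity and H_T the heterozygosity
    of the pooled allele frequency. Returns 0.0 for a monomorphic marker.
    """
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    p_bar = (p1 + p2) / 2.0
    h_t = heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two populations.
example_fst = fst_two_populations(0.1, 0.6)
```

Scanning such a statistic marker by marker across a region is one way a peak of differentiation could be located and compared with an association signal.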

Crossref

Roskilde Universitet

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Syddansk Universitets Forskerportal

Online Research Database In Technology

A combined long-range phasing and long haplotype imputation method to impute phase for SNP genotypes

Author: A Kong
BN Howie
BP Kinghorn
Brian P Kinghorn
Bruce Tier
D Habier
GK Chen
HD Daetwyler
James F Wilson
JM Hickey
John M Hickey
Julius HJ van der Werf
KA Weigel
Neil Dunstan
P Scheet
PM VanRaden
R McQuillan
R Villa-Angulo
S MacEachern
SR Browning
Y Li
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Background: Knowing the phase of marker genotype data can be useful in genome-wide association studies, because it makes it possible to use analysis frameworks that account for identity by descent or parent of origin of alleles, and it can lead to a large increase in data quantities via genotype or sequence imputation. Long-range phasing and haplotype library imputation constitute a fast and accurate method to impute phase for SNP data. Methods: A long-range phasing and haplotype library imputation algorithm was developed. It combines information from surrogate parents and long haplotypes to resolve phase in a manner that is not dependent on the family structure of a dataset or on the presence of pedigree information. Results: The algorithm performed well in both simulated and real livestock and human datasets in terms of both phasing accuracy and computational efficiency. The percentage of alleles that could be phased in both simulated and real datasets of varying size generally exceeded 98%, while the percentage of alleles incorrectly phased in simulated data was generally less than 0.5%. The accuracy of phasing was affected by dataset size, with lower accuracy for datasets of fewer than 1000 individuals, but was not affected by effective population size, family data structure, presence or absence of pedigree information, or SNP density. The method was computationally fast. In comparison to a commonly used statistical method (fastPHASE), the current method made about 8% fewer phasing mistakes and ran about 26 times faster for a small dataset. For larger datasets, the differences in computational time are expected to be even greater. A computer program implementing these methods has been made available. Conclusions: The algorithm and software developed in this study make feasible the routine phasing of high-density SNP chips in large datasets.

Research UNE

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

The Genomic Signature of Crop-Wild Introgression in Maize

Author: A Badr
A Cornille
A Gallavotti
A Gusev
A Gutierrez
AD Twyford
AL Price
BM Baltazar
BR Grant
BS Weir
C Pardo-Diaz
CB Heiser
CJ Whipple
D Falush
D Reich
DR Piperno
E Anderson
EJ Baack
EY Durand
FO Freitas
G Coop
GL Stebbins
GN Collins
H Wang
HB Xia
HG Wilkes
J Doebley
J Goudet
J Mallet
J Molina
J Ross-Ibarra
J van Heerwaarden
J-M Chia
Jeffrey Ross-Ibarra
JF Arnaud
JF Doebley
JG Rodriguez
JK Pritchard
JL Kermicle
JP Cook
K Fujino
K Fukunaga
K Thornton
K Zhao
KD Whitney
KK Dasmahapatra
KL McNally
L Excoffier
LR Sanchez-Velasquez
M Currat
M Delplancke
M Heun
M Kim
M Kwak
M Lee
M Scascitelli
MA Saghai-Maroof
Matthew B. Hufford
MB Hufford
MC Luo
Michael T. Devengenzo
ML Arnold
ML Warburton
MMS Evans
MW Ganal
N Barthakur
N Lauter
NC Ellstrand
NH Barton
Norman C. Ellstrand
P Scheet
Pesach Lubinksy
PL Morrell
PR Grant
RE Green
RJ Kulathinal
Rodney Mauricio
S Gavrilets
S Huebner
S Myles
SA Flint-Garcia
SC Kim
SP Moose
SR Whitt
Tanja Pyhäjärvi
Y Matsuoka
Y Sun
Y Vigouroux
Z He
Z Li
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 02/01/2013

The evolutionary significance of hybridization and subsequent introgression has long been appreciated, but evaluation of the genome-wide effects of these phenomena has only recently become possible. Crop-wild study systems represent ideal opportunities to examine evolution through hybridization. For example, maize and the conspecific wild teosinte Zea mays ssp. mexicana (hereafter, mexicana) are known to hybridize in the fields of highland Mexico. Despite widespread evidence of gene flow, maize and mexicana maintain distinct morphologies and have done so in sympatry for thousands of years. Neither the genomic extent nor the evolutionary importance of introgression between these taxa is understood. In this study we assessed patterns of genome-wide introgression based on 39,029 single nucleotide polymorphisms genotyped in 189 individuals from nine sympatric maize-mexicana populations and reference allopatric populations. While portions of the maize and mexicana genomes were particularly resistant to introgression (notably near known cross-incompatibility and domestication loci), we detected widespread evidence for introgression in both directions of gene flow. Through further characterization of these regions and preliminary growth chamber experiments, we found evidence suggestive of the incorporation of adaptive mexicana alleles into maize during its expansion to the highlands of central Mexico. In contrast, very little evidence was found for adaptive introgression from maize to mexicana. The methods we have applied here can be replicated widely, and such analyses have the potential to greatly inform our understanding of evolution through introgressive hybridization. Crop species, due to their exceptional genomic resources and frequent histories of spread into sympatry with relatives, should be particularly influential in these studies.

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

The Francis Crick Institute

A comparison of approaches to account for uncertainty in analysis of imputed genotypes

Author: Abecasis G.R.
Li Y.
Scheet P.
Zheng J.
Publication date: 01/01/2011

The availability of extensively genotyped reference samples, such as "The HapMap" and 1,000 Genomes Project reference panels, together with advances in statistical methodology, has allowed for the imputation of genotypes at single nucleotide polymorphism (SNP) markers that are untyped in a cohort or case-control study. These imputation procedures facilitate the interpretation and meta-analyses of genome-wide association studies. A natural question when implementing these procedures concerns how best to take into account uncertainty in imputed genotypes. Here we compare the performance of the following three strategies: least-squares regression on the "best-guess" imputed genotype; regression on the expected genotype score or "dosage"; and mixture regression models that more fully incorporate posterior probabilities of genotypes at untyped SNPs. Using simulation, we considered a range of sample sizes, minor allele frequencies, and imputation accuracies to compare the performance of the different methods under various genetic models. The mixture models performed best in the setting of a large genetic effect and low imputation accuracies. However, for most realistic settings, we find that regressing the phenotype on the estimated allelic or genotypic dosage provides an attractive compromise between accuracy and computational tractability.
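The difference between the first two strategies compared here can be made concrete with a short sketch: the "best-guess" genotype keeps only the most probable call, whereas the dosage is the expected allele count under the posterior probabilities. The function names, the posterior values and the phenotypes below are hypothetical, and the mixture regression models are not shown.

```python
def best_guess(posterior):
    """Most probable genotype (0, 1 or 2 copies of the coded allele)."""
    return max(range(3), key=lambda g: posterior[g])


def dosage(posterior):
    """Expected allele count: 0*P(g=0) + 1*P(g=1) + 2*P(g=2)."""
    return posterior[1] + 2.0 * posterior[2]


def ols_slope(x, y):
    """Least-squares slope of phenotype y regressed on genotype score x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical posterior genotype probabilities for four individuals.
posteriors = [(0.9, 0.1, 0.0), (0.1, 0.3, 0.6), (0.2, 0.7, 0.1), (0.05, 0.15, 0.8)]
phenotypes = [1.0, 2.4, 1.9, 2.8]

slope_best_guess = ols_slope([best_guess(p) for p in posteriors], phenotypes)
slope_dosage = ols_slope([dosage(p) for p in posteriors], phenotypes)
```

The dosage retains the uncertainty of each call as a fractional value, which is why it can be fed to an ordinary regression at no extra computational cost.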

PubMed Central

Carolina Digital Repository

Genome-wide analysis of BMI in adolescents and young adults reveals additional insight into the effects of genetic loci over the life course

Author: Abecasis Goncalo R.
Amin Najaf
Atwood Larry D.
Berndt Sonja I.
Chanock Stephen J.
Choh Audrey C.
Crout Richard J.
Cupples L. Adrienne
Czerwinski Stefan A.
Demerath Ellen W.
Dyer Thomas D.
Edward Lakatta
Esko Tõnu
Fox Caroline S.
Francesco Cucca
Gordon-Larsen Penny
Graff Mariaelisa
Hayes Richard B.
Heard-Costa Nancy L.
Homuth Georg
Hu Frank
Jacobs Kevin B.
Marazita Mary L.
Metspalu Andres
Monda Keri
Nelis Mari
Ngwa Julius S.
Nikopensius Tit
North Kari E.
Oostra Ben A.
Qi Lu
Sanna Serena
Scheet Paul
Schipf Sabine
Schlessinger David
Shaffer John R.
Sidore Carlo
Strachan David P.
Teumer Alexander
Towne Bradford
Van Dam Rob M.
Van Duijn Cornelia M.
Völzke Henry
Wallaschofski Henri
Wang Zhaoming
White Charles
Workalemahu Tsegaselassie
Xiao Xiangjun
Zillikens M. Carola
Publication date: 02/08/2017

Genetic loci for body mass index (BMI) in adolescence and young adulthood, a period of high risk for weight gain, are understudied, yet may yield important insight into the etiology of obesity and early intervention. To identify novel genetic loci and examine the influence of known loci on BMI during this critical time period in late adolescence and early adulthood, we performed a two-stage meta-analysis using 14 genome-wide association studies in populations of European ancestry with data on BMI between ages 16 and 25 in up to 29 880 individuals. We identified seven independent loci (P < 5.0 × 10⁻⁸) near FTO (P = 3.72 × 10⁻²³), TMEM18 (P = 3.24 × 10⁻¹⁷), MC4R (P = 4.41 × 10⁻¹⁷), TNNI3K (P = 4.32 × 10⁻¹¹), SEC16B (P = 6.24 × 10⁻⁹), GNPDA2 (P = 1.11 × 10⁻⁸) and POMC (P = 4.94 × 10⁻⁸) as well as a potential secondary signal at the POMC locus (rs2118404, P = 2.4 × 10⁻⁵ after conditioning on the established single-nucleotide polymorphism at this locus) in adolescents and young adults. To evaluate the impact of the established genetic loci on BMI at these young ages, we examined differences between the effect sizes of 32 published BMI loci in European adult populations (aged 18-90) and those observed in our adolescent and young adult meta-analysis. Four loci (near PRKD1, TNNI3K, SEC16B and CADM2) had larger effects and one locus (near SH2B1) had a smaller effect on BMI during adolescence and young adulthood compared with older adults (P < 0.05). These results suggest that genetic loci for BMI can vary in their effects across the life course, underscoring the importance of evaluating BMI at different age

RERO DOC Digital Library

Rapid haplotype inference for nuclear families

Author: A Kong
AL Williams
AM Andrés
Amy L Williams
BL Browning
BN Howie
David E Housman
David K Gifford
DF Gudbjartsson
ES Lander
G Coop
G Gao
GR Abecasis
J Gayán
J Li
J Marchini
JE Wigginton
JR O'Connell
K Doi
K Markianos
L Kruglyak
M Fishelson
M Fujita
M Stephens
Martin C Rinard
P Scheet
PC Sabeti
S Lin
SR Browning
T Niu
Y Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Hapi is a new dynamic programming algorithm that ignores uninformative states and state transitions in order to efficiently compute minimum-recombinant and maximum likelihood haplotypes. When applied to a dataset containing 103 families, Hapi performs 3.8 and 320 times faster than state-of-the-art algorithms. Because Hapi infers both minimum-recombinant and maximum likelihood haplotypes and applies to related individuals, the haplotypes it infers are highly accurate over extended genomic distances. National Institutes of Health (U.S.) (NIH grant 5-T90-DK070069); National Institutes of Health (U.S.) (Grant 5-P01-NS055923); National Science Foundation (U.S.) (Graduate Research Fellowship

CiteSeerX

DSpace@MIT

Crossref

Springer - Publisher Connector

PubMed Central