
Edinburgh Research Explorer

Sussex Research Online

Linked read technology for assembling large complex and polyploid genomes

Author: A Akintayo
A Balu
A Salman-Minkov
Alina Ott
B Nystedt
C Del Fabbro
C Feuillet
C Liu
C Rao
Chao Liu
Cheng-Ting Yeh
Clifton L. Dalgard
CS Chin
DM Altshuler
DR Bentley
E Lieberman-Aiden
E Lyons
GXY Zheng
H Li
H Tang
HB Tang
Heng-Cheng Hu
HV Hunt
James C. Schnable
JL Bennetzen
JR MacDonald
JS Seo
L Coombe
Linjiang Wu
LJ Briggs
M Freeling
M Kubesova
MA Hamoud
ME Rasekh
MW Crepeau
MW Libbrecht
N Rodic
N Spies
NI Weisenfeld
P SanMiguel
Patrick S. Schnable
PS Schnable
RK Saxena
RS Baucom
RS Li
S Goodwin
S Renny-Byfield
S Sarkar
SJ Emrich
SM Utturkar
Soumik Sarkar
TJ Treangen
Y Fu
Y Mostovoy
YN Jiao
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2018

Background: Short read DNA sequencing technologies have revolutionized genome assembly by providing high accuracy and throughput data at low cost. Yet it remains challenging to assemble short read data, particularly for large, complex and polyploid genomes. The linked read strategy has the potential to enhance the value of short reads for genome assembly because all reads originating from a single long molecule of DNA share a common barcode. However, the majority of studies to date that have employed linked reads were focused on human haplotype phasing and genome assembly. Results: Here we describe a de novo maize B73 genome assembly generated via linked read technology which contains ~172,000 scaffolds with an N50 of 89 kb that cover 50% of the genome. Based on comparisons to the B73 reference genome, 91% of linked read contigs are accurately assembled. Because it was possible to identify errors with >76% accuracy using machine learning, it may be possible to identify and potentially correct systematic errors. Complex polyploids represent one of the last grand challenges in genome assembly. Linked read technology was able to successfully resolve the two subgenomes of the recent allopolyploid, proso millet (Panicum miliaceum). Our assembly covers ~83% of the 1 Gb genome and consists of 30,819 scaffolds with an N50 of 912 kb. Conclusions: Our analysis provides a framework for future de novo genome assemblies using linked reads, and we suggest computational strategies that, if implemented, have the potential to further improve linked read assemblies, particularly for repetitive genomes.
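The abstract reports scaffold N50 values (89 kb for maize, 912 kb for proso millet). As a minimal sketch of how N50 is conventionally computed, assuming the total is taken over the assembly itself rather than an estimated genome size (the abstract does not say which), the function below returns the length at which the cumulative sum of scaffolds, sorted longest first, reaches half of the total. The function name `n50` and the toy lengths are illustrative and are not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, in kb); the total is 32, so half is 16.
toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_scaffolds))  # 8
```

Replacing the half-of-assembly threshold with half of an estimated genome size would give the related NG50 statistic instead.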

DigitalCommons@University of Nebraska

DEP and AFO Regulate Reproductive Habit in Rice

Author: AS Vega
C Ferrandiz
Ding Tang
DM Moore
E Ballesteros
EM Meyerowitz
F Tooke
FF Coelho
G Ditta
GK Agrawal
J Kyozuka
J Lim
JD Thompson
Jian Huang
JL Bowman
JS Jeon
K Ikeda
K Prasad
KD Gordon-Gray
KE Goebel
Kejian Wang
Lilan Hong
Ming Li
Minghong Gu
MM Kater
N Shitsukawa
NH Battey
P Bommert
Patrick S. Schnable
S Kumar
S Pelaz
S Pierce
SJ Milton
ST Malcomber
T Elmqvist
T Yamaguchi
Wenying Xu
Yongbiao Xue
Zhukuan Cheng
ZX Chen
Publication venue: Public Library of Science
Publication date: 01/01/2010

Sexual reproduction is essential for the life cycle of most angiosperms. However, pseudovivipary is an important reproductive strategy in some grasses. In this mode of reproduction, asexual propagules are produced in place of sexual reproductive structures. The molecular mechanism of pseudovivipary, however, remains a mystery. In this work, we found three naturally occurring mutants in rice, namely, phoenix (pho), degenerative palea (dep), and abnormal floral organs (afo). Genetic analysis indicated that the stable pseudovivipary mutant pho was a double mutant containing both a Mendelian mutation in DEP and a non-Mendelian mutation in AFO. Further map-based cloning and microarray analysis revealed that the dep mutant was caused by a genetic alteration in OsMADS15 while afo was caused by an epigenetic mutation in OsMADS1. Thus, OsMADS1 and OsMADS15 are both required to ensure sexual reproduction in rice, and mutations in them switch the reproductive habit from sexual to asexual. For the first time, our results reveal two regulators for sexual and asexual reproduction modes in flowering plants. In addition, our findings make it possible to manipulate the reproductive strategy of plants, at least in rice.

CiteSeerX

A genetically anchored physical framework for Theobroma cacao cv. Matina 1-6

Background: The fermented dried seeds of Theobroma cacao (cacao tree) are the main ingredient in chocolate. World cocoa production was estimated to be 3 million tons in 2010 with an annual estimated average growth rate of 2.2%. The cacao bean production industry is currently under threat from a rise in fungal diseases including black pod, frosty pod, and witches' broom. In order to address these issues, genome-sequencing efforts have been initiated recently to facilitate identification of genetic markers and genes that could be utilized to accelerate the release of robust T. cacao cultivars. However, problems inherent with assembly and resolution of distal regions of complex eukaryotic genomes, such as gaps, chimeric joins, and unresolvable repeat-induced compressions, have been unavoidably encountered with the sequencing strategies selected. Results: Here, we describe the construction of a BAC-based integrated genetic-physical map of the T. cacao cultivar Matina 1-6 which is designed to augment and enhance these sequencing efforts. Three BAC libraries, each comprised of 10× coverage, were constructed and fingerprinted. 230 genetic markers from a high-resolution genetic recombination map and 96 Arabidopsis-derived conserved ortholog set (COS) II markers were anchored using pooled overgo hybridization. A dense tile path consisting of 29,383 BACs was selected and end-sequenced. The physical map consists of 154 contigs and 4,268 singletons. Forty-nine contigs are genetically anchored and ordered to chromosomes for a total span of 307.2 Mbp. The unanchored contigs (105) span 67.4 Mbp and therefore the estimated genome size of T. cacao is 374.6 Mbp. A comparative analysis with A. thaliana, V. vinifera, and P. trichocarpa suggests that comparisons of the genome assemblies of these distantly related species could provide insights into genome structure, evolutionary history, conservation of functional sites, and improvements in physical map assembly. A comparison between the two T. cacao cultivars Matina 1-6 and Criollo indicates a high degree of collinearity in their genomes, yet rearrangements were also observed. Conclusions: The results presented in this study are a stand-alone resource for functional exploitation and enhancement of Theobroma cacao but are also expected to complement and augment ongoing genome-sequencing efforts. This resource will serve as a template for refinement of the T. cacao genome through gap-filling, targeted re-sequencing, and resolution of repetitive DNA arrays.
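The genome size estimate in this abstract is the sum of the anchored and unanchored contig spans. As a worked check using only the figures quoted in the abstract:

```latex
\[
\underbrace{307.2\ \text{Mbp}}_{\text{49 anchored contigs}}
+ \underbrace{67.4\ \text{Mbp}}_{\text{105 unanchored contigs}}
= 374.6\ \text{Mbp},
\qquad 49 + 105 = 154\ \text{contigs}.
\]
```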

Springer - Publisher Connector

Clemson Open (Clemson University)

Using Microsatellites to Understand the Physical Distribution of Recombination on Soybean Chromosomes

Author: A Barakat
Alina Ott
Brian Trautschold
C Saintenac
D Sandhu
Devinder Sandhu
ED Akhunov
F Wei
G Kunzel
H Hisano
IY Choi
J Drouaud
J Mudge
J Schmutz
JG Walling
JGK Williams
JS Mutti
K Arumuganathan
K Mullis
K Nagaki
LK Anderson
M Chen
M Erayman
M Morgante
MA Gore
MS Akkaya
N Carels
N Yamanaka
P Keim
P Vos
PB Cregan
PS Schnable
Pär K. Ingvarsson
Q Yu
QJ Song
R Apuyan
R Bernatzky
RE Voorrips
S Liu
TY Hwang
YL Zhu
Z Xia
Publication venue: Public Library of Science
Publication date: 01/01/2011

Soybean is a major crop that is an important source of oil and proteins. A number of genetic linkage maps have been developed in soybean. Specifically, hundreds of simple sequence repeat (SSR) markers have been developed and mapped. Recent sequencing of the soybean genome resulted in the generation of vast amounts of genetic information. The objectives of this investigation were to use SSR markers in developing a connection between genetic and physical maps and to determine the physical distribution of recombination on soybean chromosomes. A total of 2,188 SSRs were used for sequence-based physical localization on soybean chromosomes. Linkage information was used from different maps to create an integrated genetic map. Comparison of the integrated genetic linkage maps and sequence-based physical maps revealed that the distal 25% of each chromosome was the most marker-dense, containing an average of 47.4% of the SSR markers and 50.2% of the genes. The proximal 25% of each chromosome contained only 7.4% of the markers and 6.7% of the genes. At the whole genome level, the marker density and gene density showed high correlations (R²) of 0.64 and 0.83, respectively, with the physical distance from the centromere. Recombination followed a similar pattern, with comparisons indicating that recombination is high in telomeric regions, though the correlation between crossover frequency and distance from the centromeres is low (R² = 0.21). Most of the centromeric regions were low in recombination. The crossover frequency for the entire soybean genome was 7.2%, with extremes much higher and lower than average. The number of recombination hotspots varied from 1 to 12 per chromosome. A high correlation of 0.83 between the distribution of SSR markers and genes suggested close association of SSRs with genes. The knowledge of distribution of recombination on chromosomes may be applied in characterizing and targeting genes.

Maize (Zea mays L.) Genome Diversity as Revealed by RNA-Sequencing

Maize is rich in genetic and phenotypic diversity. Understanding the sequence, structural, and expression variation that contributes to phenotypic diversity would facilitate more efficient varietal improvement. RNA based sequencing (RNA-seq) is a powerful approach for transcriptional analysis, assessing sequence variation, and identifying novel transcript sequences, particularly in large, complex, repetitive genomes such as maize. In this study, we sequenced RNA from whole seedlings of 21 maize inbred lines representing diverse North American and exotic germplasm. Single nucleotide polymorphism (SNP) detection identified 351,710 polymorphic loci distributed throughout the genome covering 22,830 annotated genes. Tight clustering of two distinct heterotic groups and exotic lines was evident using these SNPs as genetic markers. Transcript abundance analysis revealed minimal variation in the total number of genes expressed across these 21 lines (57.1% to 66.0%). However, the transcribed gene set among the 21 lines varied, with 48.7% expressed in all of the lines, 27.9% expressed in one to 20 lines, and 23.4% expressed in none of the lines. De novo assembly of RNA-seq reads that did not map to the reference B73 genome sequence revealed 1,321 high confidence novel transcripts, of which 564 loci were present in all 21 lines, including B73, and 757 loci were restricted to a subset of the lines. RT-PCR validation demonstrated 87.5% concordance with the computational prediction of these expressed novel transcripts. Intriguingly, 145 of the novel de novo assembled loci were present in lines from only one of the two heterotic groups, consistent with the hypothesis that, in addition to sequence polymorphisms and transcript abundance, transcript presence/absence variation is present and, thereby, may be a mechanism contributing to the genetic basis of heterosis.

Digital Repository @ Iowa State University (ISU)

Maize Inbreds Exhibit High Levels of Copy Number Variation (CNV) and Presence/Absence Variation (PAV) in Genome Content

Author: A Ching
A Kato
A Ronchi
A. Leonardo Iniguez
AB Olshen
AJ Sharp
AP Dempster
AP Hsia
AS Lee
B McClintock
BS Everitt
C Workman
CB Della Vedova
Cheng-Ting Yeh
DA Laurie
Dan Nettleton
E Buckler
EL Walker
ES Venkatraman
F Tian
GH Perry
GK Smyth
GM Cooper
H Fu
H Yao
Heidi Rosenbaum
I Vroh Bi
J Doebley
J Lai
J Messing
J Sebat
Jacob Kitzman
JD Storey
Jeffrey A. Jeddeloh
JF Doebley
JM Kidd
Joseph R. Ecker
JS Beckmann
K Ohtsu
KA Frazer
KA Palaisa
Kai Ying
L Feuk
M Golubovsky
M Morgante
M Stam
M Yamasaki
MD Yandeau-Nelson
ME Hurles
MI Tenaillon
Nathan M. Springer
NM Springer
Patrick S. Schnable
PS Schnable
Q Wang
R Pilu
R Redon
R Song
RA Swanson-Wagner
RA Welch
RM Stupar
RR Selzer
S Brunner
S Liu
SA Flint-Garcia
SB Cannon
SI Wright
SJ Emrich
SM Adawy
SM Smith
SP Moose
SW Scherer
TA Graubert
Tieming Ji
TJ Albert
TK Wolfgruber
Todd Richmond
V Guryev
W. Brad Barbazuk
WB Barbazuk
Wei Wu
WK Chen
WL Brown
Y Fu
Yan Fu
Yi Jia
Publication venue: Public Library of Science
Publication date: 01/01/2009

Following the domestication of maize over the past ∼10,000 years, breeders have exploited the extensive genetic diversity of this species to mold its phenotype to meet human needs. The extent of structural variation, including copy number variation (CNV) and presence/absence variation (PAV), which is thought to contribute to the extraordinary phenotypic diversity and plasticity of this important crop, has not been elucidated. Whole-genome, array-based, comparative genomic hybridization (CGH) revealed a level of structural diversity between the inbred lines B73 and Mo17 that is unprecedented among higher eukaryotes. A detailed analysis of altered segments of DNA conservatively estimates that there are several hundred CNV sequences between the two genotypes, as well as several thousand PAV sequences that are present in B73 but not Mo17. Haplotype-specific PAVs contain hundreds of single-copy, expressed genes that may contribute to heterosis and to the extraordinary phenotypic diversity of this important crop.

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

FIDEL—a retrovirus-like retrotransposon and its distinct evolutionary histories in the A- and B-genome components of cultivated peanut

Author: A Brandes
A Kumar
AP Fávero
B SanMiguel
B Yüksel
BD Schrire
C Cheng
C Feschotte
C Snider
C Vitte
Christopher Town
CM Vicient
D Armisén
D Bertioli
D Fonceka
D Grattapaglia
DA Wright
David Bertioli
E Fukai
EM Temsch
F Chavanne
F Sabot
Fernando Campos-Fonseca
G Kochert
G Seijo
Guillermo Seijo
H Ohtsubo
HM Laten
J Greilhuber
J Maluszynska
JD Thompson
JG Seijo
JL Bennetzen
JR Wortman
JS Hawkins
JS Heslop-Harrison
JX Ma
JY Lin
K Alix
K Kashkush
K Shirasu
K Tamura
KM Devos
KP Singh
LH Madsen
MD Bennett
MD Burow
P Fransz
P Neumann
P SanMiguel
Patricia Guimarães
PM Guimarães
PS Schnable
R Kalendar
R Staden
RO Hammons
Roberto Arrial
RW Michelmore
S Nielen
S Tabata
SF Altschul
SN Raina
Soraya Leal-Bertioli
SR Pearce
ST Yano
Stephan Nielen
T Pélissier
T Schwarzacher
TH Jukes
WL Gerlach
X Huang
XP Zhao
XY Zhang
Y Xiong
YL Orlov
ZL Liu
Publication venue: Springer Netherlands
Publication date: 01/01/2010

In this paper, we describe a Ty3-gypsy retrotransposon from allotetraploid peanut (Arachis hypogaea) and its putative diploid ancestors Arachis duranensis (A-genome) and Arachis ipaënsis (B-genome). The consensus sequence is 11,223 bp. The element, named FIDEL (Fairly long Inter-Dispersed Euchromatic LTR retrotransposon), is more frequent in the A- than in the B-genome, with copy numbers of about 3,000 (±950, A. duranensis), 820 (±480, A. ipaënsis), and 3,900 (±1,500, A. hypogaea) per haploid genome. Phylogenetic analysis of reverse transcriptase sequences showed distinct evolution of FIDEL in the ancestor species. Fluorescent in situ hybridization revealed a dispersed distribution in euchromatin and absence from centromeres, telomeric regions, and the nucleolar organizer region. Using paired sequences from bacterial artificial chromosomes, we showed that elements appear less likely to insert near conserved ancestral genes than near the fast evolving disease resistance gene homologs. Within the Ty3-gypsy elements, FIDEL is most closely related to the Athila/Calypso group of retrovirus-like retrotransposons. Putative transmembrane domains were identified, supporting the presence of a vestigial envelope gene. The results emphasize the importance of FIDEL in the evolution and divergence of different Arachis genomes and also may serve as an example of the role of retrotransposons in the evolution of legume genomes in general.

Springer - Publisher Connector

CONICET Digital

Exceptional Diversity, Non-Random Distribution, and Rapid Evolution of Retroelements in the B73 Maize Genome

Author: AFA Smit
Ansuya Jogi
AP Tikhonov
B McClintock
B Piegu
BA Kronmiller
BC Meyers
BJM Zonneveld
C Vitte
Cristian Chaparro
DA Kramerov
DC Howell
DE Berg
EM McCarthy
H Chou
HA Schmidt
Harmit S. Malik
HH Fu
J Felsenstein
James C. Estill
JC Estill
JD Thompson
Jean-Marc Deragon
Jeffrey L. Bennetzen
JL Bennetzen
JM Deragon
JS Hawkins
JX Ma
K Fengler
KJ Edwards
KM Devos
M Umeda
M Yamazaki
MA Grandbastien
MA Johns
MJ Varagona
MV Mendiola
N Galtier
Naadira Upshaw
O Jaillon
P SanMiguel
Phillip J. SanMiguel
PJ SanMiguel
PL Deininger
PRJ Leeton
PS Schnable
RC Edgar
Regina S. Baucom
Richard P. Westerman
RS Baucom
RY Liu
S Brunner
S Tsukahara
SB Hedges
SR Wessler
T Wicker
TE Bureau
VV Kapitonov
W Gilbert
W Wang
XW Gai
Y Yasui
Y Yoshioka
YK Jin
Z Yang
Publication venue: Public Library of Science
Publication date: 01/11/2009

Recent comprehensive sequence analysis of the maize genome now permits detailed discovery and description of all transposable elements (TEs) in this complex nuclear environment. Reiteratively optimized structural and homology criteria were used in the computer-assisted search for retroelements, TEs that transpose by reverse transcription of an RNA intermediate, with the final results verified by manual inspection. Retroelements were found to occupy the majority (>75%) of the nuclear genome in maize inbred B73. Unprecedented genetic diversity was discovered in the long terminal repeat (LTR) retrotransposon class of retroelements, with >400 families (>350 newly discovered) contributing >31,000 intact elements. The two other classes of retroelements, SINEs (four families) and LINEs (at least 30 families), were observed to contribute 1,991 and ∼35,000 copies, respectively, or a combined ∼1% of the B73 nuclear genome. With regard to fully intact elements, the median copy number across all retroelement families in maize was 2 because >250 LTR retrotransposon families contained only one or two intact members that could be detected in the B73 draft sequence. The majority, perhaps all, of the investigated retroelement families exhibited non-random dispersal across the maize genome, with LINEs, SINEs, and many low-copy-number LTR retrotransposons exhibiting a bias for accumulation in gene-rich regions. In contrast, most (but not all) medium- and high-copy-number LTR retrotransposons were found to preferentially accumulate in gene-poor regions like pericentromeric heterochromatin, while a few high-copy-number families exhibited the opposite bias. Regions of the genome with the highest LTR retrotransposon density contained the lowest LTR retrotransposon diversity. These results indicate that the maize genome provides a great number of different niches for the survival and procreation of a great variety of retroelements that have evolved to differentially occupy and exploit this genomic diversity.

HAL: Hyper Article en Ligne

Purdue E-Pubs

A Highly Conserved, Small LTR Retrotransposon that Preferentially Targets Genes in Grass Genomes

Author: A Girard
A Kumar
A Roulin
B Piegu
Blake C. Meyers
BS Gaut
C Feschotte
C Vitte
C Zhang
CP Witte
D Gao
DG Higgins
Dongying Gao
ER Havecker
G Yang
GG Presting
H Hirochika
H Ito
H Yan
HH Kazazian Jr
HS Malik
I Marín
J Fernandes
J Liu
J Ma
JA Bedell
JF Wendel
Jinfeng Chen
JS Ammiraju
K Naito
K Shirasu
K Tamura
KJ Livak
KM Devos
L Bai
M Charles
M Mirouze
MA German
Mark A. Batzer
Mingsheng Chen
N Jiang
N Sugawara
P Neumann
P SanMiguel
PH Maxwell
PS Schnable
R Cordaux
R Kalendar
S Kikuchi
S Ouyang
S Tsukahara
Scott Jackson
SD Ehrlich
SI Grewal
SM Berget
SR Wessler
SY Ying
T Pélissier
T Sang
T Tanaka
T Wicker
TE Bureau
VD Soleimani
W Li
X Diao
Y Ding
Y Xiong
Z Cheng
Z Lippman
Z Xu
Publication venue: Public Library of Science
Publication date: 16/02/2012

LTR retrotransposons are often the most abundant components of plant genomes and can impact gene and genome evolution. Most reported LTR retrotransposons are large elements (>4 kb) and are most often found in heterochromatic (gene poor) regions. We report the smallest LTR retrotransposon found to date, only 292 bp. The element is found in rice, maize, sorghum and other grass genomes, which indicates that it was present in the ancestor of grass species, at least 50–80 MYA. Estimated insertion times, comparisons between sequenced rice lines, and mRNA data indicate that this element may still be active in some genomes. Unlike other LTR retrotransposons, the small LTR retrotransposons (SMARTs) are distributed throughout the genomes and are often located within or near genes with insertion patterns similar to MITEs (miniature inverted repeat transposable elements). Our data suggest that insertions of SMARTs into or near genes can, in a few instances, alter both gene structures and gene expression. As further evidence for a role in regulating gene expression, SMART-specific small RNAs (sRNAs) were identified that may be involved in gene regulation. Thus, SMARTs may have played an important role in genome evolution and genic innovation and may provide a valuable tool for gene tagging systems in grass.