46 research outputs found

    Draft genome sequence of Sclerospora graminicola, the pearl millet downy mildew pathogen: Genome sequence of pearl millet downy mildew pathogen

    Sclerospora graminicola is one of the most important biotic production constraints of pearl millet worldwide. We report a de novo whole genome assembly and analysis of pathotype 1. The draft genome assembly contained 299,901,251 bp with 65,404 genes. Pearl millet [Pennisetum glaucum (L.) R. Br.] is an important crop of the semi-arid and arid regions of the world. It is capable of growing in harsh and marginal environments, with the highest degree of tolerance to drought and heat among cereals (1). Downy mildew, caused by Sclerospora graminicola (Sacc.) Schroet., is the most devastating disease of pearl millet, particularly on genetically uniform hybrids. Estimated annual grain yield loss due to downy mildew is approximately 10–80% (2-7). Pathotype 1 has been reported to be a highly virulent pathotype of Sclerospora graminicola in India (8). We report a de novo whole genome assembly and analysis of Sclerospora graminicola pathotype 1 from India. A susceptible pearl millet genotype, Tift 23D2B1P1-P5, was used for obtaining single-zoospore isolates from the original oosporic sample. The library for whole genome sequencing was prepared according to the instructions of the NEB Ultra DNA library kit for Illumina (New England Biolabs, USA). The libraries were normalised, pooled and sequenced on the Illumina HiSeq 2500 (Illumina Inc., San Diego, CA, USA) platform at 2 × 100 bp length. Mate pair (MP) libraries were prepared using the Nextera mate pair library preparation kit (Illumina Inc., USA). 1 µg of genomic DNA was subjected to tagmentation, followed by strand displacement. Size selection of the tagmented, strand-displaced DNA was carried out using AmpureXP beads. The libraries were validated on an Agilent Bioanalyser using a DNA HS chip. The libraries were normalised, pooled and sequenced on the Illumina MiSeq (Illumina Inc., USA) platform at 2 × 300 bp length.
Whole genome sequencing yielded 7.38 Gb from 73,889,924 paired-end reads from the paired-end library and 1.15 Gb from 3,851,788 reads from the mate-pair library, generated on the Illumina HiSeq 2500 and Illumina MiSeq, respectively. The sequences were assembled using several assemblers, including ABySS, MaSuRCA, Velvet, SOAPdenovo2, and ALLPATHS-LG. The assembly generated by the MaSuRCA (9) algorithm was superior to those of the other algorithms and was therefore used for scaffolding with SSPACE. The assembled draft genome sequence of S. graminicola pathotype 1 was 299,901,251 bp long with a GC content of 47.2%, consisting of 26,786 scaffolds with an N50 of 17,909 bp and a longest scaffold of 238,843 bp. The overall coverage was 40×. The draft genome sequence was used for gene prediction with AUGUSTUS. The completeness of the assembly was investigated using CEGMA, which revealed 92.74% of proteins completely present and 95.56% partially present, while the BUSCO fungal dataset indicated 64.9% complete, 12.4% fragmented, and 22.7% missing out of 290 BUSCO groups. A total of 52,285 predicted genes were annotated using BLASTX, and 38,120 genes had a significant BLASTX match. Repetitive element analysis of the assembly revealed 8,196 simple repeats, 1,058 low complexity repeats and 5,562 dinucleotide to hexanucleotide microsatellite repeats.

    Whole Genome Sequencing and Comparative Genomic Analysis Reveal Allelic Variations Unique to a Purple Colored Rice Landrace (Oryza sativa ssp. indica cv. Purpleputtu)

    Purpleputtu (Oryza sativa ssp. indica cv. Purpleputtu) is a unique rice landrace from southern India that exhibits predominantly purple color. This study reports the underlying genetic complexity of the trait and the associated domestication and de-domestication processes during its coevolution with present-day cultivars. It also reports genome-level allelic variations in the entire gene repertoire associated with the purple and red coloration of grain and other plant parts. Comparative genomic analysis using a panel of 108 rice lines revealed a total of 3,200,951 variants, including 67,774 variations unique to the Purpleputtu (PP) genome. Multiple sequence alignment uncovered a 14 bp deletion in the Rc (Red colored, a transcription factor of the bHLH class) locus of PP, a key regulatory gene of the anthocyanin biosynthetic pathway. Interestingly, this deletion in the Rc gene is a characteristic feature of present-day white-pericarped rice cultivars. Phylogenetic analysis of the Rc locus revealed a distinct clade showing proximity to the progenitor species Oryza rufipogon and O. nivara. In addition, the PP genome exhibits a well-conserved 4.5 Mbp region on chromosome 5 that harbors several loci associated with the domestication of rice. Further, PP showed 1,387 unique SNPs when compared to 3,023 lines of rice (SNP-Seek database). The results indicate that the PP genome is rich in allelic diversity and can serve as an excellent resource for rice breeding for a variety of agronomically important traits such as disease resistance, enhanced nutritional value, stress tolerance, and protection from harmful UV-B rays.

    Genome Assembly and Annotation for Cymbopogon citratus

    Lemon grass (Cymbopogon citratus L.) is a member of the Poaceae family and is famous for its culinary, cultural, cosmetic, and medicinal properties. Therefore, the present study aims to assemble the genome of lemon grass and provide a valuable resource for mining biochemical pathways. The raw genome data were retrieved from NCBI and cleaned with AdapterRemoval version 2.3.2 to obtain high-quality clean data. The genome size was estimated using Jellyfish 2.2.10 and GenomeScope version 1.0. MaSuRCA version 3.3.2 was used to generate the genome assembly. BUSCO version 4.1.2 was used to assess the completeness and quality of the genome assembly. This analysis resulted in a draft nuclear genome of 364,442,032 bp with 127,303 scaffolds. RepeatModeler version 2.0.1, AUGUSTUS version 3.3.2, and tRNAscan-SE version 2.0.6 were used to identify repeats, genes, and tRNA genes, respectively. This study identified 41.66% repeats, 41,775 genes, and 681 tRNAs. The UniProt protein database, OrthoFinder version 2.2.7, InterProScan, Plant Metabolic Network (PMN) analysis, and Gene Ontology (GO) categorization were used to annotate the genome functionally. GetOrganelle version 1.6.4 was used to generate the mitochondrial and chloroplast genome assemblies of 367,579 bp and 139,690 bp, respectively. CPGAVAS2 version 1 and AGORA version 1 were used to annotate the chloroplast and mitochondrial genomes, respectively. The genes and pathways (photosynthesis, glycolysis, pyruvate, terpenoid backbone synthesis, and TCA cycle) associated with essential oil production were identified and mapped. Thus, this study reports the draft nuclear and organelle genome assemblies, and the genes and pathways participating in essential oil biosynthesis in C. citratus L.

    Genome Assembly and Annotation for Phoenix roebelenii

    The field of ornamental plant genomics has witnessed an increase in whole genome sequencing of ornamental plants in the last ten years. Phoenix roebelenii (pygmy date palm) is a popular ornamental plant that is grown both indoors and outdoors. The pygmy date palm is a tropical and subtropical plant that belongs to the family Arecaceae. This plant is resistant to pests and tolerant of soil variation and drought. Therefore, it is of interest to report the complete nuclear genome and organelle sequences of Phoenix roebelenii. The draft nuclear genome sequence constitutes 462,152,837 bp with 7,019 scaffolds. In total, 35.11% repeats, 42,388 genes and 480 tRNA genes were predicted. The organelle genome sequences (chloroplast and mitochondrial) are also reported in the study. The chloroplast genome consists of 125,222 bp and the mitochondrial genome consists of 482,735 bp.

    Gene associated SNP discovery in fine quality Indian Gossypium barbadense cotton through whole genome resequencing

    The present study deals with the resequencing of two Gossypium barbadense cultivars, Suvin and BCS-23-18-7, which differ distinctly in fiber quality and seed cotton yield, using the Illumina HiSeq 2500 sequencer. A total of 2,604,107 single nucleotide polymorphisms (SNPs) and 592,364 insertions and deletions (INDELs) were identified when compared with the G. barbadense reference genome. Among the 14,075 preferentially expressed genes for agronomically important traits of cotton, we identified 33,637 markers in the genic regions. Comparing the variants of Suvin and BCS-23-18-7, only 1,128 of the 4,929 genes preferentially expressed in fiber carried variants (2,453 in total), of which 1,512 were non-synonymous, leading to changes at the protein level. To validate whether the presence of these markers in expressed genes could tag the expression and/or phenotypic variation brought about by alleles of Suvin/BCS-23-18-7, ten genes with high-effect SNPs, from among 51 fiber elongation genes, were used for real-time PCR, which showed an extended period of expression up to 15 days post-anthesis (DPA) in Suvin. Hence, using such markers, instead of random genomic markers, for the construction of SNP arrays and linkage maps would provide greater value to quantitative trait loci (QTL) mapping.

    Identification of heat responsive genes in pea stipules and anthers through transcriptional profiling

    Field pea (Pisum sativum L.), a cool-season legume crop, is known for poor heat tolerance. Our previous work identified PR11-2 and PR11-90 as heat-tolerant and heat-susceptible lines, respectively, in a recombinant inbred population. CDC Amarillo, a Canadian elite pea variety, was considered another heat-tolerant variety based on its field performance, which is similar to that of PR11-2. This study aimed to characterize the differential transcription. Plants of these three varieties were stressed for 3 h at 38°C prior to self-pollination, and RNAs from heat-stressed anthers and stipules on the same flowering node were extracted and sequenced on the Illumina NovaSeq platform to characterize heat-responsive genes. In silico results were further validated by qPCR assay. Differentially expressed genes (DEGs) were identified at log2 |fold change (FC)| ≥ 2 between high temperature and control temperature. In anthers, the three varieties shared 588 up-regulated and 220 down-regulated DEGs when subjected to heat treatment. In stipules, 879 DEGs (463 up-regulated, 416 down-regulated) were consistent among varieties. The above heat-induced genes of the two plant organs were related to several biological processes, i.e., response to heat, protein folding and DNA-templated transcription. Ten gene ontology (GO) terms were over-represented in the consistently down-regulated DEGs of the two organs, and these terms were mainly related to cell wall macromolecule metabolism, lipid transport, lipid localization, and lipid metabolic processes. GO enrichment analysis on the distinct DEGs of individual pea varieties suggested that heat-affected biological processes were dynamic, and variety-distinct responses provide insight into the molecular mechanisms of the heat-tolerance response.
Several biological processes, e.g., cellular response to DNA damage stimulus in stipules and the electron transport chain in anthers, were observed only in heat-induced PR11-2 and CDC Amarillo, and their relevance to field pea heat tolerance is worth further validation.

    Identification of heat responsive genes in pea stipules and anthers through transcriptional profiling

    Field pea (Pisum sativum L.), a cool-season legume crop, is known for poor heat tolerance. Our previous work identified PR11-2 and PR11-90 as heat-tolerant and heat-susceptible lines, respectively, in a recombinant inbred population. CDC Amarillo, a Canadian elite pea variety, was considered another heat-tolerant variety based on its field performance, which is similar to that of PR11-2. This study aimed to characterize the differential transcription. Plants of these three varieties were stressed for 3 h at 38°C prior to self-pollination, and RNAs from heat-stressed anthers and stipules on the same flowering node were extracted and sequenced on the Illumina NovaSeq platform to characterize heat-responsive genes. In silico results were further validated by qPCR assay. Differentially expressed genes (DEGs) were identified at log2 |fold change (FC)| ≥ 2 between high temperature and control temperature. In anthers, the three varieties shared 588 up-regulated and 220 down-regulated DEGs when subjected to heat treatment. In stipules, 879 DEGs (463 up-regulated, 416 down-regulated) were consistent among varieties. The above heat-induced genes of the two plant organs were related to several biological processes, i.e., response to heat, protein folding and DNA-templated transcription. Ten gene ontology (GO) terms were over-represented in the consistently down-regulated DEGs of the two organs, and these terms were mainly related to cell wall macromolecule metabolism, lipid transport, lipid localization, and lipid metabolic processes. GO enrichment analysis on the distinct DEGs of individual pea varieties suggested that heat-affected biological processes were dynamic, and variety-distinct responses provide insight into the molecular mechanisms of the heat-tolerance response.
Several biological processes, e.g., cellular response to DNA damage stimulus in stipules and the electron transport chain in anthers, were observed only in heat-induced PR11-2 and CDC Amarillo, and their relevance to field pea heat tolerance is worth further validation.