9 research outputs found
AltTrans: Transcript pattern variants annotated for both alternative splicing and alternative polyadenylation
BACKGROUND: The three major mechanisms that regulate transcript formation involve the selection of alternative sites for transcription start (TS), splicing, and polyadenylation. Current efforts collect data and annotation separately for each of these variant types. It is important to take an integrated view of these data sets and to derive a data set of alternate transcripts with consolidated annotation. We have previously developed computational pipelines that generate value-added, genome-scale data on individual variant types; these include AltSplice for splicing and AltPAS for polyadenylation. We now extend these pipelines and integrate the resulting data sets to provide an integrated view of the contributions of splicing and polyadenylation to the formation of transcript variants. DESCRIPTION: The AltSplice pipeline examines gene-transcript alignments and delineates alternative splice events and splice patterns. This pipeline is extended as AltTrans to delineate isoform transcript patterns, for each of which both the introns/exons and the 'terminating' polyA site are delineated; EST/mRNA sequences that qualify a transcript pattern confirm both the underlying splicing and the polyadenylation. The AltPAS pipeline examines gene-transcript alignments and delineates all potential polyA sites irrespective of the underlying splicing patterns. The resulting polyA sites from AltTrans and AltPAS are merged. The generated database reports data on alternative splicing, alternative polyadenylation, and the resulting alternate transcript patterns; the basal data are annotated for various biological features. The data (named integrated AltTrans data), generated for human and mouse, are available through the Alternate Transcript Diversity web site at http://www.ebi.ac.uk/atd/. CONCLUSION: The reported data set presents alternate transcript patterns that are annotated for both alternative splicing and alternative polyadenylation.
Results based on current transcriptome data indicate that the contribution of alternative splicing is larger than that of alternative polyadenylation.
Unravelling the genome of Holy basil: an “incomparable” “elixir of life” of traditional Indian medicine
AltTrans: Transcript pattern variants annotated for both alternative splicing and alternative polyadenylation
Background: The three major mechanisms that regulate transcript formation involve the selection of alternative sites for transcription start (TS), splicing, and polyadenylation. Current efforts collect data and annotation separately for each of these variant types. It is important to take an integrated view of these data sets and to derive a data set of alternate transcripts with consolidated annotation. We have previously developed computational pipelines that generate value-added, genome-scale data on individual variant types; these include AltSplice for splicing and AltPAS for polyadenylation. We now extend these pipelines and integrate the resulting data sets to provide an integrated view of the contributions of splicing and polyadenylation to the formation of transcript variants. Description: The AltSplice pipeline examines gene-transcript alignments and delineates alternative splice events and splice patterns. This pipeline is extended as AltTrans to delineate isoform transcript patterns, for each of which both the introns/exons and the 'terminating' polyA site are delineated; EST/mRNA sequences that qualify a transcript pattern confirm both the underlying splicing and the polyadenylation. The AltPAS pipeline examines gene-transcript alignments and delineates all potential polyA sites irrespective of the underlying splicing patterns. The resulting polyA sites from AltTrans and AltPAS are merged. The generated database reports data on alternative splicing, alternative polyadenylation, and the resulting alternate transcript patterns; the basal data are annotated for various biological features. The data (named integrated AltTrans data), generated for human and mouse, are available through the Alternate Transcript Diversity web site at http://www.ebi.ac.uk/atd/. Conclusion: The reported data set presents alternate transcript patterns that are annotated for both alternative splicing and alternative polyadenylation.
Results based on current transcriptome data indicate that the contribution of alternative splicing is larger than that of alternative polyadenylation.
Ancient Migrations - The first complete genome assembly, annotation and variants of the Zoroastrian-Parsi community of India
With the advent of Next Generation Sequencing, the population-specific whole genome sequences published thus far predominantly represent individuals of European ancestry. While sequencing of communities underrepresented in genome datasets, such as the Yoruba West-African, Han Chinese, Tibetan, South Korean, Egyptian and Japanese, has recently added to the public genomic repositories, a comprehensive understanding of human genomic diversity and the discovery of trait-associated variants require additional population-specific analysis. In this context, the genomics of the population of the Indian sub-continent, given its genetic heterogeneity, needs further elucidation. The endogamous Zoroastrian-Parsi community of India offers exceptional insight into a homogeneous population that has remained culturally, socially, and genetically intact for 13 centuries amidst the genomic, social and cultural Indian landscape, following its migration from the ancient Persian plateau. Notwithstanding longevity as a trait, this endangered community is highly susceptible to cancers and rare genetic disorders, and displays a documented high incidence of neurodegenerative and autoimmune conditions. The community abstains from smoking as a matter of cultural practice. Here, we describe the assembly and annotation of the genome of an adult female Zoroastrian-Parsi individual sequenced at a high depth of 173X using a combination of short Illumina reads (160X) and long nanopore reads (13X). Using a combination of hybrid assemblers, we created a new, population-specific human reference genome. The Zoroastrian-Parsi Genome Reference Female, AGENOME-ZPGRF, contains 2,778,216,114 nucleotides as compared to 3,096,649,726 in GRCh38, constituting 93.235% of the total genomic fraction. Annotation identified 20,833 genomic features, of which 14,996 are almost identical to their counterparts on GRCh38, while 5,837 genomic features were partially covered.
AGENOME-ZPGRF contained 5,426,310 variants, of which the majority were SNPs (4,291,601); 960,867 SNPs were AGENOME-ZPGRF-specific personal variants not listed in dbSNP. We present AGENOME-ZPGRF as a whole reference genome for genetic studies involving Zoroastrian-Parsi individuals, extending its application to the identification of disease-relevant prognostic biomarkers and variants in global population genomics studies.
The First Complete Zoroastrian-Parsi Mitochondrial Reference Genome and genetic signatures of an endogamous non-smoking population
The present-day Zoroastrian-Parsis have roots in ancient pastoralist migrations from circumpolar regions leading to their settlement on the Eurasian Steppes and later, as Indo-Iranians, in the Fertile Crescent. After migrating from the Persian province of Pars to India, the Zoroastrians from Pars (“Parsis”) practiced endogamy, thereby preserving their genetic identity and social practices. The study was undertaken to gain insight into the genetic consequences of migration and endogamy for the community, to decipher the phylogenetic relationships with other groups, and to elucidate the disease linkages to their individual haplotypes. We generated de novo the Zoroastrian-Parsi Mitochondrial Reference Genome (AGENOME-ZPMS-HV2a-1), which is the first complete mitochondrial reference genome assembled for this group. Phylogenetic analysis of an additional 99 Parsi mitochondrial genome sequences showed the presence of HV, U, T, A and F (belonging to the macrohaplogroup N) and Z and other M descendants of the macrohaplogroup M (M5, M39, M33, M44’52, M24, M3, M30, M2, M4’30, M2, M35 and M27) and a largely Persian origin for the Parsi community. We assembled individual reference genomes for each major haplogroup and the Zoroastrian-Parsi Mitochondrial Consensus Genome (AGENOME-ZPMCG V1.0), which is the first consensus genome assembled for this group. We report the existence of 420 mitochondrial genetic variants, including 12 unique variants, in the 100 Zoroastrian-Parsi mitochondrial genome sequences. Disease association mapping showed 217 unique variants linked to longevity and 41 longevity-associated disease phenotypes across the majority of haplogroups. Analysis of the coding genes, tRNA genes, and the D-loop region revealed haplogroup-specific disease associations for Parkinson’s disease, Alzheimer’s disease, cancers, and rare diseases. No known mutations linked to lung cancer were found in our study.
Mutational signatures linked to tobacco carcinogens, specifically the C>A and G>T transversions, were observed at extremely low frequencies in the Parsi cohort, suggesting an association between the cultural norm prohibiting smoking and its reflection in the genetic signatures. In sum, the Parsi mitochondrial genome provides an exceptional resource for determining details of their migration and uncovering novel genetic signatures for wellness and disease.
The first complete Zoroastrian-Parsi mitochondrial reference genome and genetic signatures of an endogamous non-smoking population
Design and Analysis of Microstrip Patch Antenna Array and Electronic Beam Steering Linear Phased Antenna Array with High Directivity for Space Applications
ASTD: The Alternative Splicing and Transcript Diversity database
The Alternative Splicing and Transcript Diversity database (ASTD) gives access to a vast collection of alternative transcripts that integrate transcription initiation, polyadenylation and splicing variant data. Alternative transcripts are derived from the mapping of transcribed sequences to the complete human, mouse and rat genomes using an extension of the computational pipeline developed for the ASD (Alternative Splicing Database) and ATD (Alternative Transcript Diversity) databases, which are now superseded by ASTD. For the human genome, ASTD identifies splicing variants, transcription initiation variants and polyadenylation variants in 68%, 68% and 62% of the gene set, respectively, consistent with current estimates for transcription variation. Users can access ASTD through a variety of browsing and query tools, including expression state-based queries for the identification of tissue-specific isoforms. Participating laboratories have experimentally validated a subset of ASTD-predicted alternative splice forms and alternative polyadenylation forms that were not previously reported. The ASTD database can be accessed at http://www.ebi.ac.uk/astd
