Primers for Castilleja and their Utility Across Orobanchaceae: II. Single‐copy nuclear loci
Premise of the study: We developed primers targeting nuclear loci in Castilleja with the goal of reconstructing the evolutionary history of this challenging clade. These primers were tested across other major clades in Orobanchaceae to assess their broader utility. Methods and Results: We assembled low-coverage genomes for three taxa in Castilleja and developed primer combinations for the single-copy conserved ortholog set (COSII) and the pentatricopeptide repeat (PPR) gene family. These primer combinations were designed to take advantage of the Fluidigm microfluidic PCR platform and are well suited for high-throughput sequencing applications. Eighty-seven primers were designed for Castilleja, and 27 were found to have broader utility in Orobanchaceae. Conclusions: These results demonstrate the utility of these primers, not only across Castilleja, but for other lineages within Orobanchaceae as well. This expanded molecular toolkit will be an asset to future phylogenetic studies in Castilleja and throughout Orobanchaceae.
Toward standard practices for sharing computer code and programs in neuroscience
Computational techniques are central in many areas of neuroscience and are relatively easy to share. This paper describes why computer programs underlying scientific publications should be shared and lists simple steps for sharing. Together with ongoing efforts in data sharing, this should aid reproducibility of research. This article is based on discussions from a workshop to encourage sharing in neuroscience, held in Cambridge, UK, December 2014. It was financially supported and organized by the International Neuroinformatics Coordinating Facility (http://www.incf.org), with additional support from the Software Sustainability Institute (http://www.software.ac.uk). M.H. was supported by funds from the German federal state of Saxony-Anhalt and the European Regional Development Fund (ERDF), Project: Center for Behavioral Brain Sciences.
The Effects of Growth Hormone and Insulin-Like Growth Factor-1 Treatments on Hepatic Gene Expression in Obese and Diabetic Mice with Nonalcoholic Fatty Liver Disease
Accounting for genotype uncertainty in the estimation of allele frequencies in autopolyploids
Despite the increasing opportunity to collect large-scale data sets for population genomic analyses, the use of high-throughput sequencing to study populations of polyploids has seen little application. This is due in large part to problems associated with determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty–ADU), which complicates the calculation of important quantities such as allele frequencies. Here we describe a statistical model to estimate biallelic SNP frequencies in a population of autopolyploids using high-throughput sequencing data in the form of read counts. We bridge the gap from data collection (using restriction enzyme based techniques [e.g., GBS, RADseq]) to allele frequency estimation in a unified inferential framework using a hierarchical Bayesian model to sum over genotype uncertainty. Simulated data sets were generated under various conditions for tetraploid, hexaploid and octoploid populations to evaluate the model's performance and to help guide the collection of empirical data. We also provide an implementation of our model in the R package polyfreqs and demonstrate its use with two example analyses that investigate (i) levels of expected and observed heterozygosity and (ii) model adequacy. Our simulations show that the number of individuals sampled from a population has a greater impact on estimation error than sequencing coverage. The example analyses also show that our model and software can be used to make inferences beyond the estimation of allele frequencies for autopolyploids by providing assessments of model adequacy and estimates of heterozygosity.
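The idea of summing over genotype uncertainty when estimating an allele frequency from read counts can be illustrated with a small sketch. This is not the polyfreqs implementation: the abstract describes a hierarchical Bayesian model, whereas the sketch below swaps in a simple grid search for the maximum-likelihood frequency at a single biallelic locus. The binomial read model, the fixed sequencing error rate, and all function and parameter names are assumptions made for illustration only.

```python
from math import comb, log


def read_likelihood(allele_reads, total_reads, genotype, ploidy, error=0.01):
    """P(allele_reads | total_reads, genotype) under an assumed binomial
    read model with a fixed, symmetric sequencing error rate."""
    dosage = genotype / ploidy
    p_read = dosage * (1 - error) + (1 - dosage) * error
    return (comb(total_reads, allele_reads)
            * p_read ** allele_reads
            * (1 - p_read) ** (total_reads - allele_reads))


def genotype_prior(genotype, ploidy, freq):
    """P(genotype | freq): allele copy number drawn binomially from the
    population allele frequency (an assumption of this sketch)."""
    return (comb(ploidy, genotype)
            * freq ** genotype
            * (1 - freq) ** (ploidy - genotype))


def log_marginal_likelihood(freq, allele_counts, total_counts, ploidy, error=0.01):
    """Log-likelihood of freq with each individual's unknown genotype
    summed out, so no single genotype call is ever made."""
    total = 0.0
    for reads, depth in zip(allele_counts, total_counts):
        per_individual = sum(
            genotype_prior(g, ploidy, freq)
            * read_likelihood(reads, depth, g, ploidy, error)
            for g in range(ploidy + 1)
        )
        total += log(per_individual)
    return total


def estimate_frequency(allele_counts, total_counts, ploidy, error=0.01, grid_size=999):
    """Grid-search estimate of the allele frequency on the open interval (0, 1)."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return max(
        grid,
        key=lambda f: log_marginal_likelihood(
            f, allele_counts, total_counts, ploidy, error
        ),
    )
```

For example, `estimate_frequency([5, 5], [10, 10], ploidy=4)` evaluates two tetraploid individuals that each have 5 of 10 reads carrying the counted allele. A full Bayesian treatment would place a prior on the frequency and sample from the posterior rather than return a point estimate.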
SNP genotyping and parameter estimation in polyploids using low-coverage sequencing data
Motivation: Genotyping and parameter estimation using high throughput sequencing data are everyday tasks for population geneticists, but methods developed for diploids are typically not applicable to polyploid taxa. This is due to their duplicated chromosomes, as well as the complex patterns of allelic exchange that often accompany whole genome duplication (WGD) events. For WGDs within a single lineage (autopolyploids), inbreeding can result from mixed mating and/or double reduction. For WGDs that involve hybridization (allopolyploids), alleles are typically inherited through independently segregating subgenomes. Results: We present two new models for estimating genotypes and population genetic parameters from genotype likelihoods for auto- and allopolyploids. We then use simulations to compare these models to existing approaches at varying depths of sequencing coverage and ploidy levels. These simulations show that our models typically have lower levels of estimation error for genotype and parameter estimates, especially when sequencing coverage is low. Finally, we also apply these models to two empirical data sets from the literature. Overall, we show that the use of genotype likelihoods to model non-standard inheritance patterns is a promising approach for conducting population genomic inferences in polyploids. Availability: A C++ program, EBG, is provided to perform inference using the models we describe. It is available under the GNU GPLv3 on GitHub: https://github.com/pblischak/polyploid-genotyping. Contact: [email protected].
Creating and sharing reproducible research code the workflowr way
Making scientific analyses reproducible, well documented, and easily shareable is crucial to maximizing their impact and ensuring that others can build on them. However, accomplishing these goals is not easy, requiring careful attention to organization, workflow, and familiarity with tools that are not a regular part of every scientist's toolbox. We have developed an R package, workflowr, to help all scientists, regardless of background, overcome these challenges. Workflowr aims to instill a particular "workflow" (a sequence of steps to be repeated and integrated into research practice) that helps make projects more reproducible and accessible. This workflow integrates four key elements: (1) version control (via Git); (2) literate programming (via R Markdown); (3) automatic checks and safeguards that improve code reproducibility; and (4) sharing code and results via a browsable website. These features exploit powerful existing tools, whose mastery would take considerable study. However, the workflowr interface is simple enough that novice users can quickly enjoy its many benefits. By simply following the workflowr "workflow", R users can create projects whose results, figures, and development history are easily accessible on a static website (and thereby conveniently shareable with collaborators by sending them a URL), accompanied by source code and reproducibility safeguards. The workflowr R package is open source and available on CRAN, with full documentation and source code available at https://github.com/jdblischak/workflowr.
Bann_spdelim_scripts
All Python scripts associated with the genomic analyses for this study.
Data from: Accounting for genotype uncertainty in the estimation of allele frequencies in autopolyploids
Despite the increasing opportunity to collect large-scale data sets for population genomic analyses, the use of high-throughput sequencing to study populations of polyploids has seen little application. This is due in large part to problems associated with determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty–ADU), which complicates the calculation of important quantities such as allele frequencies. Here, we describe a statistical model to estimate biallelic SNP frequencies in a population of autopolyploids using high-throughput sequencing data in the form of read counts. We bridge the gap from data collection (using restriction enzyme based techniques [e.g. GBS, RADseq]) to allele frequency estimation in a unified inferential framework using a hierarchical Bayesian model to sum over genotype uncertainty. Simulated data sets were generated under various conditions for tetraploid, hexaploid and octoploid populations to evaluate the model's performance and to help guide the collection of empirical data. We also provide an implementation of our model in the R package polyfreqs and demonstrate its use with two example analyses that investigate (i) levels of expected and observed heterozygosity and (ii) model adequacy. Our simulations show that the number of individuals sampled from a population has a greater impact on estimation error than sequencing coverage. The example analyses also show that our model and software can be used to make inferences beyond the estimation of allele frequencies for autopolyploids by providing assessments of model adequacy and estimates of heterozygosity.
