Nanostructured luminescently labeled nucleic acids
Important and emerging trends at the interface of luminescence, nucleic acids and nanotechnology
are: (i) the conventional luminescence labeling of nucleic acid nanostructures (e.g. DNA tetrahedron);
(ii) the labeling of bulk nucleic acids (e.g. single‐stranded DNA, double‐stranded DNA) with
nanostructured luminescent labels (e.g. copper nanoclusters); and (iii) the labeling of nucleic acid
nanostructures (e.g. origami DNA) with nanostructured luminescent labels (e.g. silver
nanoclusters). This review surveys recent advances in these three different approaches to the
generation of nanostructured luminescently labeled nucleic acids, and includes both direct and
indirect labeling methods.
GIVE: portable genome browsers for personal websites.
The growing popularity and diversity of genomic data demand portable and versatile genome browsers. Here, we present an open-source programming library called GIVE that facilitates the creation of personalized genome browsers without requiring a system administrator. By inserting HTML tags, one can add interactive visualizations of multiple types of genomics data to a personal webpage, including genome annotation, "linear" quantitative data, and genome interaction data. GIVE includes a graphical interface called HUG (HTML Universal Generator) that automatically generates HTML code for displaying user-chosen data; this code can be copy-pasted into a user's personal website or saved and shared with collaborators. GIVE is available at: https://www.givengine.org/
Mapping the Shh long-range regulatory domain
Coordinated gene expression controlled by long-distance enhancers is orchestrated by DNA regulatory sequences involving transcription factors and layers of control mechanisms. The Shh gene and well-established regulators are an example of genomic composition in which enhancers reside in a large desert extending into neighbouring genes to control the spatiotemporal pattern of expression. Exploiting the local hopping activity of the Sleeping Beauty transposon, the lacZ reporter gene was dispersed throughout the Shh region to systematically map the genomic features responsible for expression activity. We found that enhancer activities are retained inside a genomic region that corresponds to the topological associated domain (TAD) defined by Hi-C. This domain of approximately 900 kb is in an open conformation over its length and is generally susceptible to all Shh enhancers. Similar to the distal enhancers, an enhancer residing within the Shh second intron activates the reporter gene located at distances of hundreds of kilobases away, suggesting that both proximal and distal enhancers have the capacity to survey the Shh topological domain to recognise potential promoters. The widely expressed Rnf32 gene lying within the Shh domain evades enhancer activities by a process that may be common among other housekeeping genes that reside in large regulatory domains. Finally, the boundaries of the Shh TAD do not represent the absolute expression limits of enhancer activity, as expression activity is lost stepwise at a number of genomic positions at the verges of these domains
MSR1 repeats modulate gene expression and affect risk of breast and prostate cancer
[Background] MSR1 repeats are 36–38 bp minisatellite elements that have recently been implicated in the regulation of gene expression through copy number variation (CNV). [Patients and methods] Bioinformatic and experimental methods were used to assess the distribution of MSR1 across the genome, evaluate the regulatory potential of such elements and explore the role of MSR1 elements in cancer, particularly non-familial breast cancer and prostate cancer. [Results] MSR1s are predominantly located on chromosome 19 and are functionally enriched in regulatory regions of the genome, particularly regions implicated in short-range regulatory activities (H3K27ac, H3K4me1 and H3K4me3). MSR1-regulated genes were found to have specific molecular roles, such as serine-protease activity (P = 4.80 × 10⁻⁷) and ion channel activity (P = 2.7 × 10⁻⁴). The kallikrein locus was found to contain a large number of MSR1 clusters, and at least six of these showed CNV. An MSR1 cluster was identified within KLK14, with 9 and 11 copies being normal variants. A significant association between the 9-copy allele and non-familial breast cancer was found in two independent populations (P = 0.004; P = 0.03). In the white British population, the minor allele conferred an increased risk of 1.21–3.51 times for all non-familial disease, or 1.7–5.3 times in early-onset disease. The 9-copy allele was also found to be associated with increased risk of prostate cancer in an independent population (odds ratio = 1.27–1.56; P = 0.009). [Conclusions] MSR1 repeats act as molecular switches that modulate gene expression. It is likely that CNV of MSR1 will affect the risk of developing various forms of cancer, including breast and prostate cancer. The MSR1 cluster at KLK14 represents the strongest risk factor identified to date in non-familial breast cancer and a significant risk factor for prostate cancer.
Analysis of MSR1 genotype will allow development of precise stratification of disease risk and provide a novel target for therapeutic agents. The prostate cancer study is supported by a National Health and Medical Research Council (NHMRC) grant and Career Development Fellowship APP1090505 to JB. The Australian Prostate Cancer BioResource is supported by the NHMRC Enabling Grant APP614296 and by a grant from the Prostate Cancer Foundation, Australia. Peer reviewed
Novel Bayes Factors That Capture Expert Uncertainty in Prior Density Specification in Genetic Association Studies.
Bayes factors (BFs) are becoming increasingly important tools in genetic association studies, partly because they provide a natural framework for including prior information. The Wakefield BF (WBF) approximation is easy to calculate and assumes a normal prior on the log odds ratio (logOR) with a mean of zero. However, the prior variance (W) must be specified. Because of the potentially high sensitivity of the WBF to the choice of W, we propose several new BF approximations with logOR ∼N(0,W), but allow W to take a probability distribution rather than a fixed value. We provide several prior distributions for W which lead to BFs that can be calculated easily in freely available software packages. These priors allow a wide range of densities for W and provide considerable flexibility. We examine some properties of the priors and BFs and show how to determine the most appropriate prior based on elicited quantiles of the prior odds ratio (OR). We show by simulation that our novel BFs have superior true-positive rates at low false-positive rates compared to those from both P-value and WBF analyses across a range of sample sizes and ORs. We give an example of utilizing our BFs to fine-map the CASP8 region using genotype data on approximately 46,000 breast cancer case and 43,000 healthy control samples from the Collaborative Oncological Gene-environment Study (COGS) Consortium, and compare the single-nucleotide polymorphism ranks to those obtained using WBFs and P-values from univariate logistic regression
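As an illustration of the quantities in this abstract, the following is a minimal sketch of the Wakefield approximation, written here as the Bayes factor in favour of association (the alternative over the null), followed by one simple way of letting W vary. It assumes the usual normal approximation, with an estimated logOR `beta_hat` and standard error `se`. The function names, the discrete grid over W, and the weights are hypothetical choices for illustration only; they are not the prior distributions proposed by the authors.

```python
import math


def bf10(beta_hat, se, W):
    """Wakefield-style approximate Bayes factor for H1 (logOR ~ N(0, W)) against H0 (logOR = 0).

    Under H0 the estimate is N(0, V); under H1 it is N(0, V + W), where V = se**2.
    """
    V = se ** 2
    z2 = (beta_hat / se) ** 2
    shrinkage = W / (V + W)
    return math.sqrt(V / (V + W)) * math.exp(0.5 * z2 * shrinkage)


def averaged_bf10(beta_hat, se, w_grid, w_weights):
    """Bayes factor when W has a discrete prior (assumption: a finite grid of values).

    The null likelihood does not depend on W, so averaging the H1 marginal
    likelihood over W amounts to a weighted average of the fixed-W Bayes factors.
    """
    total = sum(w_weights)
    return sum(w / total * bf10(beta_hat, se, W) for W, w in zip(w_grid, w_weights))


if __name__ == "__main__":
    # Hypothetical inputs: logOR estimate 0.5 with standard error 0.1.
    print(bf10(0.5, 0.1, 0.04))
    print(averaged_bf10(0.5, 0.1, [0.01, 0.04, 0.16], [0.25, 0.5, 0.25]))
```

Comparing the two printed values for different grids gives a rough sense of how sensitive the fixed-W Bayes factor is to the choice of W, which is the motivation stated in the abstract.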
Revealing mammalian evolutionary relationships by comparative analysis of gene clusters
Many software tools for comparative analysis of genomic sequence data have been released in recent decades. Despite this, it remains challenging to determine evolutionary relationships in gene clusters due to their complex histories involving duplications, deletions, inversions, and conversions. One concept describing these relationships is orthology. Orthologs derive from a common ancestor by speciation, in contrast to paralogs, which derive from duplication. Discriminating orthologs from paralogs is a necessary step in most multispecies sequence analyses, but doing so accurately is impeded by the occurrence of gene conversion events. We propose a refined method of orthology assignment based on two paradigms for interpreting its definition: by genomic context or by sequence content. X-orthology (based on context) traces orthology resulting from speciation and duplication only, while N-orthology (based on content) includes the influence of conversion events
The Escherichia coli transcriptome mostly consists of independently regulated modules
Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. A useful description of the TRN would decompose the transcriptome into targeted effects of individual transcriptional regulators. Here, we apply unsupervised machine learning to a diverse compendium of over 250 high-quality Escherichia coli RNA-seq datasets to identify 92 statistically independent signals that modulate the expression of specific gene sets. We show that 61 of these transcriptomic signals represent the effects of currently characterized transcriptional regulators. Condition-specific activation of signals is validated by exposure of E. coli to new environmental conditions. The resulting decomposition of the transcriptome provides: a mechanistic, systems-level, network-based explanation of responses to environmental and genetic perturbations; a guide to gene and regulator function discovery; and a basis for characterizing transcriptomic differences in multiple strains. Taken together, our results show that signal summation describes the composition of a model prokaryotic transcriptome
Genome-wide associations of gene expression variation in humans
The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level
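Two of the multiple-test corrections named in this abstract can be sketched as follows. This is a minimal illustration that assumes the false discovery rate is controlled with the Benjamini-Hochberg step-up procedure, which the abstract does not specify; the function names and example P-values are hypothetical, and the permutation-based correction is not shown.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject each test whose P-value is at most alpha divided by the number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed here as the FDR method).

    Sort the P-values, find the largest rank k with p_(k) <= k / m * alpha,
    and reject the k smallest P-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical P-values for five SNP-expression association tests.
    pvals = [0.001, 0.012, 0.025, 0.045, 0.9]
    print(bonferroni(pvals))
    print(benjamini_hochberg(pvals))
```

On these example values Bonferroni rejects only the smallest P-value, whereas the step-up procedure rejects the three smallest, which illustrates why the number of significant associations can differ between correction methods.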
Genetic determinants of co-accessible chromatin regions in activated T cells across humans.
Over 90% of genetic variants associated with complex human traits map to non-coding regions, but little is understood about how they modulate gene regulation in health and disease. One possible mechanism is that genetic variants affect the activity of one or more cis-regulatory elements leading to gene expression variation in specific cell types. To identify such cases, we analyzed ATAC-seq and RNA-seq profiles from stimulated primary CD4+ T cells in up to 105 healthy donors. We found that regions of accessible chromatin (ATAC-peaks) are co-accessible at kilobase and megabase resolution, consistent with the three-dimensional chromatin organization measured by in situ Hi-C in T cells. Fifteen percent of genetic variants located within ATAC-peaks affected the accessibility of the corresponding peak (local-ATAC-QTLs). Local-ATAC-QTLs have the largest effects on co-accessible peaks, are associated with gene expression and are enriched for autoimmune disease variants. Our results provide insights into how natural genetic variants modulate cis-regulatory elements, in isolation or in concert, to influence gene expression
Law of Genome Evolution Direction: Coding Information Quantity Grows
The problem of the directionality of genome evolution is studied. Based on
the analysis of C-value paradox and the evolution of genome size we propose
that the function-coding information quantity of a genome always grows in the
course of evolution through sequence duplication, expansion of code, and gene
transfer from outside. The function-coding information quantity of a genome
consists of two parts: p-coding information quantity, which encodes functional
protein, and n-coding information quantity, which encodes functional
elements other than amino acid sequences. Evidence for this evolutionary law
of function-coding information quantity is listed. The need for
function is the driving force behind the expansion of coding information quantity,
and this expansion is the route to functional innovation
and extension for a species. The increase in the coding information quantity of
a genome is therefore a measure of newly acquired function, and it determines the
directionality of genome evolution. Comment: 16 pages
