
    Addressing Challenges in a Graph-Based Analysis of High-Throughput Biological Data

    Graph-based methods for analyzing DNA microarray data can be powerful tools for elucidating biological relationships. As these methods are developed and applied to various types of data, challenges arise that test the limits of current algorithms. These challenges arise in all phases of data analysis: data normalization, modeling biological networks, and interpreting results. Spectral graph theory methods are investigated as a means of threshold selection, a key step in constructing graphical models of biological data. Also important in constructing graphs is the selection of an appropriate gene-gene similarity metric; an overview of similarity profiles for some biological data sets is presented, along with a similarity thresholding method based upon structural properties of random graphs. The identification of altered relationships between two or more conditions is a goal of many microarray gene expression studies. Clique-based methods can identify sets of coexpressed genes within each group, but additional computational methods are required to uncover the differential relationships and sets of genes changing together between groups. Differential filters are reviewed to highlight those changing interactions and sets of changing genes. The effect of various normalization methods on these differential results is also studied. Finally, the application of methods commonly used in gene expression analysis to relationships in noisy and incomplete historical ecosystem data is explored.

    Emerging Infectious Disease leads to Rapid Population Decline of Common British Birds

    Emerging infectious diseases are increasingly cited as threats to wildlife, livestock and humans alike. They can threaten geographically isolated or critically endangered wildlife populations; however, relatively few studies have clearly demonstrated the extent to which emerging diseases can impact populations of common wildlife species. Here, we report the impact of an emerging protozoal disease on British populations of greenfinch Carduelis chloris and chaffinch Fringilla coelebs, two of the most common birds in Britain. Morphological and molecular analyses showed the disease to be caused by Trichomonas gallinae. Trichomonosis emerged as a novel fatal disease of finches in Britain in 2005 and rapidly became epidemic within greenfinch, and to a lesser extent chaffinch, populations in 2006. By 2007, breeding populations of greenfinches and chaffinches in the geographic region of highest disease incidence had decreased by 35% and 21% respectively, representing mortality in excess of half a million birds. In contrast, declines were less pronounced or absent in these species in regions where the disease was found in intermediate or low incidence. Also, populations of dunnock Prunella modularis, which similarly feeds in gardens, but in which T. gallinae was rarely recorded, did not decline. This is the first trichomonosis epidemic reported in the scientific literature to negatively impact populations of free-ranging non-columbiform species, and such levels of mortality and decline due to an emerging infectious disease are unprecedented in British wild bird populations. This disease emergence event demonstrates the potential for a protozoan parasite to jump avian host taxonomic groups with dramatic effect over a short time period.

    Using genome-wide associations to identify metabolic pathways involved in maize aflatoxin accumulation resistance

    BACKGROUND: Aflatoxin is a potent carcinogen that can contaminate grain infected with the fungus Aspergillus flavus. However, resistance to aflatoxin accumulation in maize is a complex trait with low heritability. Here, two complementary analyses were performed to better understand the mechanisms involved. The first coupled results of a genome-wide association study (GWAS) that accounted for linkage disequilibrium among single nucleotide polymorphisms (SNPs) with gene-set enrichment for a pathway-based approach. The rationale was that the cumulative effects of genes in a pathway would give insight into genetic differences that distinguish resistant from susceptible lines of maize. The second involved finding non-pathway genes close to the most significant SNP-trait associations with the greatest effect on reducing aflatoxin in multiple environments. Unlike conventional GWAS, the latter analysis emphasized multiple aspects of SNP-trait associations rather than just significance and was performed because of the high genotype × environment variability exhibited by this trait. RESULTS: The most significant metabolic pathway identified was jasmonic acid (JA) biosynthesis. Specifically, there was at least one allelic variant for each step in the JA biosynthesis pathway that conferred an incremental decrease to the level of aflatoxin observed among the inbred lines in the GWAS panel. Several non-pathway genes were also consistently associated with lowered aflatoxin levels. Those with predicted functions related to defense were: leucine-rich repeat protein kinase, expansin B3, reversion-to-ethylene sensitivity1, adaptor protein complex2, and a multidrug and toxic compound extrusion protein. CONCLUSIONS: Our genetic analysis provided strong evidence for several genes that were associated with aflatoxin resistance. Inbred lines that exhibited lower levels of aflatoxin accumulation tended to share similar haplotypes for genes specifically in the pathway of JA biosynthesis, along with several non-pathway genes with putative defense-related functions. Knowledge gained from these two complementary analyses has improved our understanding of population differences in aflatoxin resistance. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1874-9) contains supplementary material, which is available to authorized users.

    A systematic comparison of genome-scale clustering algorithms

    Background: A wealth of clustering algorithms has been applied to gene co-expression experiments. These algorithms cover a broad range of approaches, from conventional techniques such as k-means and hierarchical clustering, to graphical approaches such as k-clique communities, weighted gene co-expression networks (WGCNA) and paraclique. Comparison of these methods to evaluate their relative effectiveness provides guidance to algorithm selection, development and implementation. Most prior work on comparative clustering evaluation has focused on parametric methods. Graph theoretical methods are recent additions to the tool set for the global analysis and decomposition of microarray co-expression matrices that have not generally been included in earlier methodological comparisons. In the present study, a variety of parametric and graph theoretical clustering algorithms are compared using well-characterized transcriptomic data at a genome scale from Saccharomyces cerevisiae. Methods: For each clustering method under study, a variety of parameters were tested. Jaccard similarity was used to measure each cluster's agreement with every GO and KEGG annotation set, and the highest Jaccard score was assigned to the cluster. Clusters were grouped into small, medium, and large bins, and the Jaccard scores of the top five scoring clusters in each bin were averaged and reported as the best average top 5 (BAT5) score for the particular method. Results: Clusters produced by each method were evaluated based upon the positive match to known pathways. This produces a readily interpretable ranking of the relative effectiveness of clustering on the genes. Methods were also tested to determine whether they were able to identify clusters consistent with those identified by other clustering methods. Conclusions: Validation of clusters against known gene classifications demonstrates that for these data, graph-based techniques outperform conventional clustering approaches, suggesting that further development and application of combinatorial strategies is warranted.
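
The scoring step described in the Methods of this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy gene sets, and the size cutoff in `toy_bin` are all hypothetical, and the abstract does not state the real bin boundaries. Each cluster receives its highest Jaccard score against any annotation set, and the top five scores within each size bin are averaged.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two gene sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_jaccard(cluster, annotation_sets):
    """Highest Jaccard score of a cluster against any annotation set."""
    return max((jaccard(cluster, ann) for ann in annotation_sets), default=0.0)


def top_mean_by_bin(clusters, annotation_sets, bin_of, top_n=5):
    """Average the top_n best-match scores within each size bin."""
    scores = defaultdict(list)
    for cluster in clusters:
        scores[bin_of(cluster)].append(best_jaccard(cluster, annotation_sets))
    result = {}
    for name, values in scores.items():
        top = sorted(values, reverse=True)[:top_n]
        result[name] = sum(top) / len(top)
    return result


def toy_bin(cluster):
    """Hypothetical size bins; the real cutoffs are not given in the abstract."""
    return "small" if len(cluster) <= 2 else "large"


# Toy data, invented for illustration only.
annotations = [{"g1", "g2", "g3"}, {"g4", "g5"}]
clusters = [{"g1", "g2"}, {"g4", "g5"}, {"g6"}, {"g1", "g2", "g3", "g7"}]
summary = top_mean_by_bin(clusters, annotations, toy_bin)
```

Here `summary` maps each bin name to its averaged score. Under the abstract's description, such per-bin averages would then be compared across the parameter settings tested for a method.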

    No-Boundary Thinking in Bioinformatics

    The following sections are included: Bioinformatics is a Mature Discipline; The Golden Era of Bioinformatics Has Begun; No-Boundary Thinking in Bioinformatics; Reference.

    Extracting Gene Networks for Low-Dose Radiation Using Graph Theoretical Algorithms

    Genes with common functions often exhibit correlated expression levels, which can be used to identify sets of interacting genes from microarray data. Microarrays typically measure expression across genomic space, creating a massive matrix of co-expression that must be mined to extract only the most relevant gene interactions. We describe a graph theoretical approach to extracting co-expressed sets of genes, based on the computation of cliques. Unlike the results of traditional clustering algorithms, cliques are not disjoint and allow genes to be assigned to multiple sets of interacting partners, consistent with biological reality. A graph is created by thresholding the correlation matrix to include only the correlations most likely to signify functional relationships. Cliques computed from the graph correspond to sets of genes for which significant edges are present between all members of the set, representing potential members of common or interacting pathways. Clique membership can be used to infer function about poorly annotated genes, based on the known functions of better-annotated genes with which they share clique membership (i.e., “guilt-by-association”). We illustrate our method by applying it to microarray data collected from the spleens of mice exposed to low-dose ionizing radiation. Differential analysis is used to identify sets of genes whose interactions are impacted by radiation exposure. The correlation graph is also queried independently of the cliques to extract edges that are impacted by radiation. We present several examples of multiple gene interactions that are altered by radiation exposure and thus represent potential molecular pathways that mediate the radiation response.
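
The thresholding and clique steps described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: Pearson correlation, the use of absolute correlation, the 0.9 cutoff, and the toy expression profiles are all assumptions, and clique enumeration uses the basic Bron-Kerbosch recursion without pivoting, which is practical only for small graphs.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_graph(expr, threshold):
    """Keep an edge between two genes when |r| >= threshold (assumed rule)."""
    adj = {gene: set() for gene in expr}
    for g, h in combinations(expr, 2):
        if abs(pearson(expr[g], expr[h])) >= threshold:
            adj[g].add(h)
            adj[h].add(g)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy expression profiles, invented for illustration only.
expr = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "c": [4, 3, 2, 1],
    "d": [1, -1, 1, -1],
}
graph = correlation_graph(expr, threshold=0.9)
cliques = maximal_cliques(graph)

# Two triangles sharing node 3 show that cliques need not be disjoint.
shared = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
overlapping = maximal_cliques(shared)
```

In the toy data, genes `a`, `b`, and `c` form one clique and `d` is left as an isolated singleton. In the second graph, node 3 belongs to both triangles, which is the non-disjoint membership the abstract contrasts with traditional clustering.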

    The effect of airway management on CPR quality in the PARAMEDIC2 randomised controlled trial

    INTRODUCTION: Good quality basic life support (BLS) is associated with improved outcome from cardiac arrest. Chest compression fraction (CCF) is a BLS quality indicator, which may be influenced by the type of airway used. We aimed to assess CCF according to the airway strategy in the PARAMEDIC2 study: no advanced airway, supraglottic airway (SGA), tracheal intubation, or a combination of the two. Our hypothesis was that tracheal intubation was associated with a decrease in the CCF compared with alternative airway management strategies. METHODS: PARAMEDIC2 was a multicentre double-blinded placebo-controlled trial of adrenaline vs placebo in out-of-hospital cardiac arrest. Data showing compression rate and ratio from patients recruited by London Ambulance Service (LAS) as part of this study were collated and analysed according to the advanced airway used during the resuscitation attempt. RESULTS: CPR process data were available from 286/2058 (13.9%) of the total patients recruited by LAS. The mean compression rate for the first 5 min of data recording was the same in all groups (P = 0.272) and ranged from 104.2 (95% CI of mean: 100.5, 107.8) min⁻¹ to 108.0 (95% CI of mean: 105.1, 108.3) min⁻¹. The mean compression fraction was also similar across all groups (P = 0.159) and ranged between 74.7% and 78.4%. There was no difference in the compression rates and fractions across the airway management groups, regardless of the duration of CPR. CONCLUSION: There was no significant difference in the compression fraction associated with the airway management strategy.

    Threshold selection in gene co-expression networks using spectral graph theory techniques

    Background: Gene co-expression networks are often constructed by computing some measure of similarity between expression levels of gene transcripts and subsequently applying a high-pass filter to remove all but the most likely biologically-significant relationships. The selection of this expression threshold necessarily has a significant effect on any conclusions derived from the resulting network. Many approaches have been taken to choose an appropriate threshold, among them computing levels of statistical significance, accepting only the top one percent of relationships, and selecting an arbitrary expression cutoff. Results: We apply spectral graph theory methods to develop a systematic method for threshold selection. Eigenvalues and eigenvectors are computed for a transformation of the adjacency matrix of the network constructed at various threshold values. From these, we use a basic spectral clustering method to examine the set of gene-gene relationships and select a threshold dependent upon the community structure of the data. This approach is applied to two well-studied microarray data sets from Homo sapiens and Saccharomyces cerevisiae. Conclusion: This method presents a systematic, data-based alternative to using more artificial cutoff values and results in a more conservative approach to threshold selection than some other popular techniques such as retaining only statistically-significant relationships or setting a cutoff to include a percentage of the highest correlations.
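
One ingredient of the approach in this abstract, spectral clustering on a thresholded network, can be sketched as below. The abstract does not specify which transformation of the adjacency matrix is used, so this sketch assumes the unnormalized graph Laplacian and shows only a single spectral bisection on one graph, not the authors' threshold-selection procedure. It uses shifted power iteration in pure Python and assumes a connected graph with at least one edge; the function names and the toy graph are hypothetical.

```python
from math import sqrt


def laplacian(adj):
    """Unnormalized graph Laplacian L = D - A of a 0/1 adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def fiedler(adj, iterations=500):
    """Approximate the second-smallest Laplacian eigenpair.

    Power iteration on (shift * I - L), restricted to vectors orthogonal
    to the all-ones vector. Assumes a connected graph with an edge.
    """
    n = len(adj)
    lap = laplacian(adj)
    shift = 2 * max(lap[i][i] for i in range(n))  # bound on largest eigenvalue
    v = [float(i) for i in range(n)]  # deterministic start vector
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the all-ones direction
        w = [shift * v[i] - sum(lap[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    value = sum(v[i] * lap[i][j] * v[j] for i in range(n) for j in range(n))
    return value, v


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = [[0] * 6 for _ in range(6)]
for i, j in edges:
    adjacency[i][j] = adjacency[j][i] = 1

connectivity, vector = fiedler(adjacency)
side_a = {i for i, x in enumerate(vector) if x < 0}
side_b = set(range(6)) - side_a
```

The signs of the resulting vector split the toy graph into its two triangles. Repeating such an analysis on the graphs obtained at different correlation cutoffs is one plausible way to see how community structure changes with the threshold, though the precise criterion used in the paper is not given in the abstract.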