74 research outputs found

    Web-based Tools for the Analysis of DNA Microarrays

    End of project report. DNA microarrays are widely used for gene expression profiling. Raw data from microarray experiments, however, tend to be very noisy, and there are many sources of technical variation and bias. These raw data need to be quality-assessed and interactively preprocessed to minimise variation before statistical analysis in order to achieve meaningful results. Microarray analysis therefore requires a combination of visualisation and statistical tools, which vary depending on the microarray platform or experimental design used. Bioconductor is an existing open source software project that attempts to facilitate analysis of genomic data. It is a collection of packages for the statistical programming language R. Bioconductor is particularly useful in analysing microarray experiments. The problem is that the R programming language’s command line interface is intimidating to many users who do not have a strong background in computing. As a result, biologists often resort to commercial software, which is expensive and often uses antiquated and much less effective statistical techniques. This project aims to bridge this gap by providing a user-friendly web-based interface to the cutting-edge statistical techniques of Bioconductor.

    Machine learning and data mining frameworks for predicting drug response in cancer: An overview and a novel in silico screening process based on association rule mining

    A major challenge in cancer treatment is predicting the clinical response to anti-cancer drugs on a personalized basis. The success of such a task largely depends on the ability to develop computational resources that integrate big "omic" data into effective drug-response models. Machine learning is an expanding and evolving computational field that holds promise to cover such needs. Here we provide a focused overview of: 1) the various supervised and unsupervised algorithms used specifically in drug response prediction applications, 2) the strategies employed to develop these algorithms into applicable models, 3) data resources that are fed into these frameworks and 4) pitfalls and challenges to maximize model performance. In this context we also describe a novel in silico screening process, based on Association Rule Mining, for identifying genes as candidate drivers of drug response and compare it with relevant data mining frameworks, for which we generated a web application freely available at: https://compbio.nyumc.org/drugs/. This pipeline explores large sample spaces with high efficiency, while being able to detect low-frequency events and evaluate statistical significance even in the multidimensional space, presenting the results in the form of easily interpretable rules. We conclude with future prospects and challenges of applying machine learning based drug response prediction in precision medicine.

    Statistical and integrative system-level analysis of DNA methylation data

    Epigenetics plays a key role in cellular development and function. Alterations to the epigenome are thought to capture and mediate the effects of genetic and environmental risk factors on complex disease. Currently, DNA methylation is the only epigenetic mark that can be measured reliably and genome-wide in large numbers of samples. This Review discusses some of the key statistical challenges and algorithms associated with drawing inferences from DNA methylation data, including cell-type heterogeneity, feature selection, reverse causation and system-level analyses that require integration with other data types such as gene expression, genotype, transcription factor binding and other epigenetic information.

    BioconductorBuntu: a Linux distribution that implements a web-based DNA microarray analysis server

    BioconductorBuntu is a custom distribution of Ubuntu Linux that automatically installs a server-side microarray processing environment, providing a user-friendly web-based GUI to many of the tools developed by the Bioconductor Project, accessible locally or across a network. The system is installed by booting from a CD image or by using a Debian package provided to upgrade an existing Ubuntu installation. In its current version, several microarray analysis pipelines are supported, covering oligonucleotide and dual- or single-dye experiments, with post-processing by Gene Set Enrichment Analysis. BioconductorBuntu is designed to be extensible through server-side integration of further relevant Bioconductor modules as required, facilitated by its straightforward underlying Python-based infrastructure. BioconductorBuntu offers an ideal environment for the development of processing procedures to facilitate the analysis of next-generation sequencing datasets.

    Transfer Learning in Biomedicine (Transferlernen in der Biomedizin)
