Many Genbank Entries for Complete Microbial Genomes Violate the Genbank Standard

Author: Karp Peter D.
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2001

A survey of Genbank entries for complete microbial genomes reveals that the majority do not conform to the Genbank standard. Typical deviations from the standard include records with information in incorrect fields, extraneous and confusing information added within a field, and omission of useful fields. This situation results from two principal causes: genome centres do not submit Genbank records in the proper form, and the Genbank, EMBL and DDBJ staffs do not enforce the database standards that they have defined.

Web-based metabolic network visualization with a zooming user interface

Author: Karp Peter D; Latendresse Mario
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Displaying complex metabolic-map diagrams in Web browsers, and allowing users to query them and overlay expression data on them, is challenging. Description: We present a Web-based metabolic-map diagram, called the Cellular Overview, which the user can explore interactively. The main characteristic of this application is its zooming user interface, which enables the user to focus on appropriate granularities of the network at will. Various search commands are available to visually highlight sets of reactions, pathways, enzymes, metabolites, and so on. Expression data from single or multiple experiments can be overlaid on the diagram, a capability we call the Omics Viewer. The application provides Web services to highlight the diagram and to invoke the Omics Viewer. The application is written entirely in JavaScript for client browsers and connects to a Pathway Tools Web server to retrieve data and diagrams. It uses the OpenLayers library to display tiled diagrams. Conclusions: This new online tool is capable of displaying large and complex metabolic-map diagrams in a very interactive manner. It is available as part of the Pathway Tools software that powers multiple metabolic databases, including Biocyc.org; the Cellular Overview is accessible under the Tools menu.
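
The abstract mentions tiled diagrams behind a zooming interface. As a rough illustration of how such tiling is commonly organized, the sketch below assumes a generic power-of-two tile pyramid over a square diagram with coordinates normalized to the range 0 to 1. The function names are hypothetical, and this is neither the OpenLayers API nor the tile layout actually used by the Pathway Tools server.

```python
def tile_for_point(x, y, zoom):
    """Return the (column, row) of the tile containing a point.

    Assumes normalized diagram coordinates in [0, 1] and a grid of
    2**zoom by 2**zoom tiles at the given zoom level (an assumption of
    this sketch, not a documented property of the Cellular Overview).
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("coordinates must lie in [0, 1]")
    n = 2 ** zoom
    # Clamp so that the far edge (x == 1.0) maps to the last tile.
    col = min(int(x * n), n - 1)
    row = min(int(y * n), n - 1)
    return col, row


def visible_tiles(x_min, y_min, x_max, y_max, zoom):
    """List the tiles that a rectangular viewport overlaps, row by row."""
    c0, r0 = tile_for_point(x_min, y_min, zoom)
    c1, r1 = tile_for_point(x_max, y_max, zoom)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
```

Under this scheme a client only requests the tiles that the current viewport overlaps, so the number of images fetched depends on the viewport size rather than on the size of the whole diagram.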

Springer - Publisher Connector

The Pathway Tools cellular overview diagram and Omics Viewer

Author: Karp Peter D.; Paley Suzanne M.
Publication venue: Oxford University Press
Publication date: 01/01/2006

The Pathway Tools cellular overview diagram is a visual representation of the biochemical network of an organism. The overview is automatically created from a Pathway/Genome Database describing that organism. The cellular overview includes metabolic, transport and signaling pathways, and other membrane and periplasmic proteins. Pathway Tools supports interrogation and exploration of cellular biochemical networks through the overview diagram. Furthermore, a software component called the Omics Viewer provides visual analysis of whole-organism datasets using the overview diagram as an organizing framework. For example, gene expression and metabolomics measurements, alone or in combination, can be painted onto the overview, as can computed whole-organism datasets, such as predicted reaction-flux values. The cellular overview and Omics Viewer provide a mechanism whereby biologists can apply the pattern-recognition capabilities of the human visual system to analyze large-scale datasets in a biologically meaningful context. SRI's BioCyc.org website provides overview diagrams for more than 200 organisms. This article describes enhancements to the overview made since a 1999 publication, including the automatic layout capability, expansion of the cellular machinery that it includes, new semantic zooming and poster-generating capabilities, and extension of the Omics Viewer to support painting of metabolites, animations and zooming to individual pathway diagrams.

CiteSeerX

EcoCyc: fusing model organism databases with systems biology.

Author: Bonavides-Martínez César; Collado-Vides Julio; Fulcher Carol; Gama-Castro Socorro; Gunsalus Robert P; Huerta Araceli M; Karp Peter D; Keseler Ingrid M; Kothari Anamika; Krummenacker Markus; Latendresse Mario; Mackie Amanda; Muñiz-Rascado Luis; Ong Quang; Paley Suzanne; Paulsen Ian; Peralta-Gil Martin; Santos-Zavaleta Alberto; Schröder Imke; Shearer Alexander G; Subhraveti Pallavi; Travers Mike; Weerasinghe Deepika; Weiss Verena
Publication venue: eScholarship, University of California
Publication date: 07/11/2012

EcoCyc (http://EcoCyc.org) is a model organism database built on the genome sequence of Escherichia coli K-12 MG1655. Expert manual curation of the functions of individual E. coli gene products in EcoCyc has been based on information found in the experimental literature for E. coli K-12-derived strains. Updates to EcoCyc content continue to improve the comprehensive picture of E. coli biology. The utility of EcoCyc is enhanced by new tools available on the EcoCyc web site, and the development of EcoCyc as a teaching tool is increasing the impact of the knowledge collected in EcoCyc.

eScholarship - University of California

Research from Macquarie University

ISCB Ebola Award for Important Future Research on the Computational Biology of Ebola Virus

Author: Bonnie Berger; Burkhard Rost; Diane Kovats; Michal Linial; Pardis Sabeti; Peter D. Karp; RH Lathrop; SK Gire; Thomas Lengauer; Winston Hide
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2015

Speed is of the essence in combating Ebola; thus, computational approaches should form a significant component of Ebola research. As for the development of any modern drug, computational biology is uniquely positioned to contribute through comparative analysis of the genome sequences of Ebola strains as well as 3-D protein modeling. Other computational approaches to Ebola may include large-scale docking studies of Ebola proteins with human proteins and with small-molecule libraries, computational modeling of the spread of the virus, computational mining of the Ebola literature, and creation of a curated Ebola database. Taken together, such computational efforts could significantly accelerate traditional scientific approaches. In recognition of the need for important and immediate solutions from the field of computational biology against Ebola, the International Society for Computational Biology (ISCB) announces a prize for an important computational advance in fighting the Ebola virus. ISCB will confer the ISCB Fight against Ebola Award, along with a prize of US$2,000, at its July 2016 annual meeting (ISCB Intelligent Systems for Molecular Biology [ISMB] 2016, Orlando, Florida).

Public Library of Science (PLOS)

Harvard University - DASH

DSpace@MIT

White Rose Research Online

MPG.PuRe

A systematic study of genome context methods: calibration, normalization and combination

Author: Dale Joseph M; Ferrer Luciana; Karp Peter D
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Genome context methods have been introduced in the last decade as automatic methods to predict functional relatedness between genes in a target genome using the patterns of existence and relative locations of the homologs of those genes in a set of reference genomes. Much work has been done in applying these methods to different bioinformatics tasks, but few papers present a systematic study of the methods and of the combination necessary for their optimal use. Results: We present a thorough study of the four main families of genome context methods found in the literature: phylogenetic profile, gene fusion, gene cluster, and gene neighbor. We find that for most organisms the gene neighbor method outperforms the phylogenetic profile method by as much as 40% in sensitivity, being competitive with the gene cluster method at low sensitivities. Gene fusion is generally the worst performing of the four methods. A thorough exploration of the parameter space for each method is performed, and results across different target organisms are presented. We propose applying normalization procedures like those used on microarray data to the genome context scores. We show that substantial gains can be achieved from the use of a simple normalization technique. In particular, the sensitivity of the phylogenetic profile method is improved by around 25% after normalization, resulting, to our knowledge, in the best-performing phylogenetic profile system in the literature. Finally, we show results from combining the various genome context methods into a single score. When using a cross-validation procedure to train the combiners, with both original and normalized scores as input, a decision tree combiner results in gains of up to 20% with respect to the gene neighbor method. Overall, this represents a gain of around 15% over what can be considered the state of the art in this area: the four original genome context methods combined using a procedure like that used in the STRING database. Unfortunately, we find that these gains disappear when the combiner is trained only with organisms that are phylogenetically distant from the target organism. Conclusions: Our experiments indicate that gene neighbor is the best individual genome context method and that gains from the combination of individual methods are very sensitive to the training data used to obtain the combiner's parameters. If adequate training data is not available, using the gene neighbor score by itself instead of a combined score might be the best choice.
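
To make the phylogenetic profile idea and the role of score normalization more concrete, here is a minimal sketch. It assumes binary presence/absence profiles over a small set of reference genomes, uses Pearson correlation as the similarity score, and uses a plain z-score as the normalization. All three are illustrative assumptions of this sketch; the abstract does not specify the scoring or normalization technique used in the paper, and the gene names and profiles are invented toy data.

```python
from itertools import combinations
from math import sqrt


def profile_similarity(a, b):
    """Pearson correlation between two presence/absence profiles.

    Each profile has one entry per reference genome: 1 if a homolog of
    the gene is present in that genome, 0 otherwise.
    """
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("profiles must be non-empty and of equal length")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A gene present in all or no genomes carries no co-occurrence signal.
        return 0.0
    return cov / sqrt(var_a * var_b)


def pairwise_scores(profiles):
    """Score every unordered gene pair in a {gene: profile} mapping."""
    return {
        (g1, g2): profile_similarity(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


def zscore_normalize(scores):
    """Rescale scores to zero mean and unit variance (illustrative only)."""
    values = list(scores.values())
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return {pair: 0.0 for pair in scores}
    return {pair: (v - mean) / std for pair, v in scores.items()}


# Toy data: geneA and geneB co-occur across genomes; geneC is their complement.
profiles = {
    "geneA": [1, 1, 0, 0, 1, 0],
    "geneB": [1, 1, 0, 0, 1, 0],
    "geneC": [0, 0, 1, 1, 0, 1],
}
raw = pairwise_scores(profiles)
normalized = zscore_normalize(raw)
```

Normalizing puts scores from different methods or organisms on a comparable scale, which matters when several scores are fed into a single combiner as the abstract describes.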

Springer - Publisher Connector