454 research outputs found
Call for an enzyme genomics initiative
I propose an Enzyme Genomics Initiative, the goal of which is to obtain at least one protein sequence for each enzyme that has previously been characterized biochemically. There are 1,437 enzyme activities for which Enzyme Commission (EC) numbers have been assigned but no sequence can be found in public protein-sequence databases
Many Genbank Entries for Complete Microbial Genomes Violate the Genbank Standard
A survey of Genbank entries for complete microbial genomes reveals that the majority do
not conform to the Genbank standard. Typical deviations from the Genbank standard
include records with information in incorrect fields, addition of extraneous and confusing
information within a field, and omission of useful fields. This situation results from two
principal causes: genome centres do not submit Genbank records in the proper form and
the Genbank, EMBL and DDBJ staffs do not enforce the database standards that they
have defined
Web-based metabolic network visualization with a zooming user interface
<p>Abstract</p> <p>Background</p> <p>Displaying complex metabolic-map diagrams, for Web browsers, and allowing users to interact with them for querying and overlaying expression data over them is challenging.</p> <p>Description</p> <p>We present a Web-based metabolic-map diagram, which can be interactively explored by the user, called the <it>Cellular Overview</it>. The main characteristic of this application is the zooming user interface enabling the user to focus on appropriate granularities of the network at will. Various searching commands are available to visually highlight sets of reactions, pathways, enzymes, metabolites, and so on. Expression data from single or multiple experiments can be overlaid on the diagram, which we call the Omics Viewer capability. The application provides Web services to highlight the diagram and to invoke the <it>Omics Viewer</it>. This application is entirely written in JavaScript for the client browsers and connect to a Pathway Tools Web server to retrieve data and diagrams. It uses the OpenLayers library to display tiled diagrams.</p> <p>Conclusions</p> <p>This new online tool is capable of displaying large and complex metabolic-map diagrams in a very interactive manner. This application is available as part of the Pathway Tools software that powers multiple metabolic databases including <monospace>Biocyc.org</monospace>: The Cellular Overview is accessible under the <monospace>Tools</monospace> menu.</p
The Pathway Tools cellular overview diagram and Omics Viewer
The Pathway Tools cellular overview diagram is a visual representation of the biochemical network of an organism. The overview is automatically created from a Pathway/Genome Database describing that organism. The cellular overview includes metabolic, transport and signaling pathways, and other membrane and periplasmic proteins. Pathway Tools supports interrogation and exploration of cellular biochemical networks through the overview diagram. Furthermore, a software component called the Omics Viewer provides visual analysis of whole-organism datasets using the overview diagram as an organizing framework. For example, gene expression and metabolomics measurements, alone or in combination, can be painted onto the overview, as can computed whole-organism datasets, such as predicted reaction-flux values. The cellular overview and Omics Viewer provide a mechanism whereby biologists can apply the pattern-recognition capabilities of the human visual system to analyze large-scale datasets in a biologically meaningful context. SRI's BioCyc.org website provides overview diagrams for more than 200 organisms. This article describes enhancements to the overview made since a 1999 publication, including the automatic layout capability, expansion of the cellular machinery that it includes, new semantic zooming and poster-generating capabilities, and extension of the Omics Viewer to support painting of metabolites, animations and zooming to individual pathway diagrams
EcoCyc: fusing model organism databases with systems biology.
EcoCyc (http://EcoCyc.org) is a model organism database built on the genome sequence of Escherichia coli K-12 MG1655. Expert manual curation of the functions of individual E. coli gene products in EcoCyc has been based on information found in the experimental literature for E. coli K-12-derived strains. Updates to EcoCyc content continue to improve the comprehensive picture of E. coli biology. The utility of EcoCyc is enhanced by new tools available on the EcoCyc web site, and the development of EcoCyc as a teaching tool is increasing the impact of the knowledge collected in EcoCyc
ISCB Ebola Award for Important Future Research on the Computational Biology of Ebola Virus
Speed is of the essence in combating Ebola; thus, computational approaches should form a significant component of Ebola research. As for the development of any modern drug, computational biology is uniquely positioned to contribute through comparative analysis of the genome sequences of Ebola strains as well as 3-D protein modeling. Other computational approaches to Ebola may include large-scale docking studies of Ebola proteins with human proteins and with small-molecule libraries, computational modeling of the spread of the virus, computational mining of the Ebola literature, and creation of a curated Ebola database. Taken together, such computational efforts could significantly accelerate traditional scientific approaches. In recognition of the need for important and immediate solutions from the field of computational biology against Ebola, the International Society for Computational Biology (ISCB) announces a prize for an important computational advance in fighting the Ebola virus. ISCB will confer the ISCB Fight against Ebola Award, along with a prize of US$2,000, at its July 2016 annual meeting (ISCB Intelligent Systems for Molecular Biology [ISMB] 2016, Orlando, Florida)
A systematic study of genome context methods: calibration, normalization and combination
<p>Abstract</p> <p>Background</p> <p>Genome context methods have been introduced in the last decade as automatic methods to predict functional relatedness between genes in a target genome using the patterns of existence and relative locations of the homologs of those genes in a set of reference genomes. Much work has been done in the application of these methods to different bioinformatics tasks, but few papers present a systematic study of the methods and their combination necessary for their optimal use.</p> <p>Results</p> <p>We present a thorough study of the four main families of genome context methods found in the literature: phylogenetic profile, gene fusion, gene cluster, and gene neighbor. We find that for most organisms the gene neighbor method outperforms the phylogenetic profile method by as much as 40% in sensitivity, being competitive with the gene cluster method at low sensitivities. Gene fusion is generally the worst performing of the four methods. A thorough exploration of the parameter space for each method is performed and results across different target organisms are presented.</p> <p>We propose the use of normalization procedures as those used on microarray data for the genome context scores. We show that substantial gains can be achieved from the use of a simple normalization technique. In particular, the sensitivity of the phylogenetic profile method is improved by around 25% after normalization, resulting, to our knowledge, on the best-performing phylogenetic profile system in the literature.</p> <p>Finally, we show results from combining the various genome context methods into a single score. When using a cross-validation procedure to train the combiners, with both original and normalized scores as input, a decision tree combiner results in gains of up to 20% with respect to the gene neighbor method. Overall, this represents a gain of around 15% over what can be considered the state of the art in this area: the four original genome context methods combined using a procedure like that used in the STRING database. Unfortunately, we find that these gains disappear when the combiner is trained only with organisms that are phylogenetically distant from the target organism.</p> <p>Conclusions</p> <p>Our experiments indicate that gene neighbor is the best individual genome context method and that gains from the combination of individual methods are very sensitive to the training data used to obtain the combiner's parameters. If adequate training data is not available, using the gene neighbor score by itself instead of a combined score might be the best choice.</p
- …
