MetaNetX.org: a website and repository for accessing, analysing and manipulating metabolic networks
Summary: MetaNetX.org is a website for accessing, analysing and manipulating genome-scale metabolic networks (GSMs) as well as biochemical pathways. It consistently integrates data from various public resources and makes the data accessible in a standardized format using a common namespace. Currently, it provides access to hundreds of GSMs and pathways that can be interactively compared (two or more), analysed (e.g. detection of dead-end metabolites and reactions, flux balance analysis or simulation of reaction and gene knockouts), manipulated and exported. Users can also upload their own metabolic models, choose to automatically map them into the common namespace and subsequently make use of the website's functionality. Availability and implementation: MetaNetX.org is available at http://metanetx.org. Contact: [email protected]
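One of the analyses named above, detection of dead-end metabolites, can be illustrated with a minimal sketch. It assumes a simple definition (a metabolite that is only ever produced or only ever consumed across the network) and uses a hypothetical toy network; the function name, data layout and reaction identifiers are illustrative assumptions and do not reflect how MetaNetX.org implements the analysis.

```python
def dead_end_metabolites(reactions):
    """Return metabolites that are only produced or only consumed.

    `reactions` maps a reaction id to (stoichiometry, reversible), where
    stoichiometry maps metabolite id -> coefficient (negative = consumed,
    positive = produced). A reversible reaction counts as both.
    """
    produced, consumed = set(), set()
    for stoichiometry, reversible in reactions.values():
        for metabolite, coefficient in stoichiometry.items():
            if coefficient > 0 or reversible:
                produced.add(metabolite)
            if coefficient < 0 or reversible:
                consumed.add(metabolite)
    # Symmetric difference: metabolites found in exactly one of the two sets.
    return sorted(produced ^ consumed)


# Hypothetical toy network, not taken from MetaNetX.
toy_network = {
    "EX_A": ({"A": 1}, False),         # uptake of A
    "R1": ({"A": -1, "B": 1}, False),  # A -> B
    "R2": ({"B": -1, "C": 1}, False),  # B -> C
    "R3": ({"B": -1, "D": 1}, False),  # B -> D
    "EX_D": ({"D": -1}, False),        # export of D
}

print(dead_end_metabolites(toy_network))  # ['C']
```

Here C is flagged because nothing consumes it, so the reaction producing it cannot carry flux at steady state.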
Swiss EMBnet node web server
EMBnet is a consortium of collaborating bioinformatics groups located mainly within Europe (http://www.embnet.org). Each member country is represented by a ‘node’, a group responsible for the maintenance of local services for their users (e.g. education, training, software, database distribution, technical support, helpdesk). Among these services a web portal with links and access to locally developed and maintained software is essential and different for each node. Our web portal targets biomedical scientists in Switzerland and elsewhere, offering them access to a collection of important sequence analysis tools mirrored from other sites or developed locally. We describe here the Swiss EMBnet node web site (http://www.ch.embnet.org), which presents a number of original services not available anywhere else.
pfsearchV3: a code acceleration and heuristic to search PROSITE profiles
Summary: The PROSITE resource provides a rich and well annotated source of signatures in the form of generalized profiles that allow protein domain detection and functional annotation. One of the major limiting factors in the application of PROSITE in genome and metagenome annotation pipelines is the time required to search protein sequence databases for putative matches. We describe an improved and optimized implementation of the PROSITE search tool pfsearch that, combined with a newly developed heuristic, addresses this limitation. On a modern x86_64 hyper-threaded quad-core desktop computer, the new pfsearchV3 is two orders of magnitude faster than the original algorithm. Availability and implementation: Source code and binaries of pfsearchV3 are freely available for download at http://web.expasy.org/pftools/#pfsearchV3, implemented in C and supported on Linux. PROSITE generalized profiles including the heuristic cut-off scores are available at the same address. Contact: [email protected]
The Microbe browser for comparative genomics
The Microbe browser is a web server providing comparative microbial genomics data. It offers comprehensive, integrated data from GenBank, RefSeq, UniProt, InterPro, Gene Ontology and the Orthologs Matrix Project (OMA) database, displayed along with gene predictions from five software packages. The Microbe browser is updated daily from the source databases and includes all completely sequenced bacterial and archaeal genomes. The data are displayed in an easy-to-use, interactive website based on Ensembl software. The Microbe browser is available at http://microbe.vital-it.ch/. Programmatic access is available through the OMA application programming interface (API) at http://microbe.vital-it.ch/ap
JACOP: A simple and robust method for the automated classification of protein sequences with modular architecture
BACKGROUND: Whole-genome sequencing projects are rapidly producing an enormous number of new sequences. Consequently, almost every family of proteins now contains hundreds of members. It has thus become necessary to develop tools that classify protein sequences automatically, quickly and reliably. The difficulty of this task is intimately linked to the mechanisms by which protein sequences diverge, i.e. simultaneous residue substitutions, insertions and/or deletions, and whole-domain reorganisations (duplications/swapping/fusion). RESULTS: Here we present a novel approach based on random sampling of sub-sequences (probes) out of a set of input sequences. The probes are compared to the input sequences, after a normalisation step; the results are used to partition the input sequences into homogeneous groups of proteins. In addition, this method provides information on diagnostic parts of the proteins. The performance of this method is tested on two data sets. The first one contains the sequences of prokaryotic lyases that could be arranged as a multiple sequence alignment. The second one contains all proteins from Swiss-Prot Release 36 with at least one Src homology 2 (SH2) domain – a classical example of proteins with modular architecture. CONCLUSION: The outcome of our method is robust and highly reproducible, as shown using bootstrap and resampling validation procedures. The results are essentially consistent with the biology. This method depends solely on well-established publicly available software and algorithms.
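The probe idea described above can be sketched as follows. This is only an illustration under loud assumptions: the abstract does not specify the comparison or normalisation used, so a crude best ungapped identity score stands in for them, the partitioning step is omitted, and all function names, parameters and example sequences are hypothetical.

```python
import random


def sample_probes(sequences, n_probes, probe_len, seed=0):
    """Draw random fixed-length sub-sequences (probes) from the input set."""
    rng = random.Random(seed)  # seeded so that the sampling is repeatable
    eligible = [s for s in sequences if len(s) >= probe_len]
    if not eligible:
        raise ValueError("no sequence is long enough for the probe length")
    probes = []
    for _ in range(n_probes):
        seq = rng.choice(eligible)
        start = rng.randrange(len(seq) - probe_len + 1)
        probes.append(seq[start:start + probe_len])
    return probes


def best_identity(probe, seq):
    """Best fraction of identical residues over all ungapped placements.

    Stand-in for a real sequence comparison; an assumption of this sketch.
    """
    if len(seq) < len(probe):
        return 0.0
    best = 0
    for i in range(len(seq) - len(probe) + 1):
        window = seq[i:i + len(probe)]
        best = max(best, sum(a == b for a, b in zip(probe, window)))
    return best / len(probe)


def probe_matrix(sequences, probes):
    """Rows are input sequences, columns are probes, entries are scores."""
    return [[best_identity(p, s) for p in probes] for s in sequences]


# Hypothetical toy sequences.
toy_sequences = ["MKVLAAGIVGLLLAQ", "MKVLSAGIVALLLAQ", "GGHHEEWWKKPPRRT"]
toy_probes = sample_probes(toy_sequences, n_probes=4, probe_len=5, seed=1)
toy_scores = probe_matrix(toy_sequences, toy_probes)
```

Rows of the resulting matrix with similar score profiles would be candidates for the same group; how the scores are normalised and partitioned is left open here.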
Mechanisms of Surface Antigenic Variation in the Human Pathogenic Fungus <i>Pneumocystis jirovecii</i>.
Microbial pathogens commonly escape the human immune system by varying surface proteins. We investigated the mechanisms used for that purpose by <i>Pneumocystis jirovecii</i>. This uncultivable fungus is an obligate pulmonary pathogen that in immunocompromised individuals causes pneumonia, a major life-threatening infection. Long-read PacBio sequencing was used to assemble a core of subtelomeres of a single <i>P. jirovecii</i> strain from a bronchoalveolar lavage fluid specimen from a single patient. A total of 113 genes encoding surface proteins were identified, including 28 pseudogenes. These genes formed a subtelomeric gene superfamily, which included five families encoding adhesive glycosylphosphatidylinositol (GPI)-anchored glycoproteins and one family encoding excreted glycoproteins. Numerical analyses suggested that diversification of the glycoproteins relies on mosaic genes created by ectopic recombination and occurs only within each family. DNA motifs suggested that all genes are expressed independently, except those of the family encoding the most abundant surface glycoproteins, which are subject to mutually exclusive expression. PCR analyses showed that exchange of the expressed gene of the latter family occurs frequently, possibly favored by the location of the genes proximal to the telomere because this allows concomitant telomere exchange. Our observations suggest that (i) the <i>P. jirovecii</i> cell surface is made of a complex mixture of different surface proteins, with a majority of a single isoform of the most abundant glycoprotein, (ii) genetic mosaicism within each family ensures variation of the glycoproteins, and (iii) the strategy of the fungus consists of the continuous production of new subpopulations composed of cells that are antigenically different. <b>IMPORTANCE</b> <i>Pneumocystis jirovecii</i> is a fungus causing severe pneumonia in immunocompromised individuals.
It is the second most frequent life-threatening invasive fungal infection. We have studied the mechanisms of antigenic variation used by this pathogen to escape the human immune system, a strategy commonly used by pathogenic microorganisms. Using a new DNA sequencing technology generating long reads, we could characterize the highly repetitive gene families encoding the proteins that are present on the cellular surface of this pathogen. These gene families are localized in the regions close to the ends of all chromosomes, the subtelomeres. Such chromosomal localization was found to favor genetic recombination between members of each gene family and to allow diversification of these proteins continuously over time. This pathogen seems to use a strategy of antigenic variation consisting of the continuous production of new subpopulations composed of cells that are antigenically different. Such a strategy is unique among human pathogens.
Density-based hierarchical clustering of pyro-sequences on a large scale—the case of fungal ITS1
Motivation: Analysis of millions of pyro-sequences is currently playing a crucial role in the advance of environmental microbiology. Taxonomy-independent, i.e. unsupervised, clustering of these sequences is essential for the definition of Operational Taxonomic Units. For this application, reproducibility and robustness should be the most sought-after qualities, but have thus far largely been overlooked. Results: More than 1 million hyper-variable internal transcribed spacer 1 (ITS1) sequences of fungal origin have been analyzed. The ITS1 sequences were first properly extracted from 454 reads using generalized profiles. Then, otupipe, cd-hit-454, ESPRIT-Tree and DBC454, a new algorithm presented here, were used to analyze the sequences. A numerical assay was developed to measure the reproducibility and robustness of these algorithms. DBC454 was the most robust, closely followed by ESPRIT-Tree. DBC454 features density-based hierarchical clustering, which complements the other methods by providing insights into the structure of the data. Availability: An executable is freely available for non-commercial users at ftp://ftp.vital-it.ch/tools/dbc454. It is designed to run under MPI on a cluster of 64-bit Linux machines running Red Hat 4.x, or on a multi-core OSX system. Contact: [email protected] or [email protected]
HitKeeper, a generic software package for hit list management
BACKGROUND: The automated annotation of biological sequences (protein, DNA) relies on the computation of hits (predicted features) on the sequences using various algorithms. Public databases of biological sequences provide a wealth of biological "knowledge", for example manually validated annotations (features) that are located on the sequences, but mining the sequence annotations and especially the predicted and curated features requires dedicated tools. Due to the heterogeneity and diversity of the biological information, it is difficult to handle redundancy, frequent updates, taxonomic information and "private" data together with computational algorithms in a common workflow. RESULTS: We present HitKeeper, a software package that controls the fully automatic handling of multiple biological databases and of hit list calculations on a large scale. The software implements an asynchronous update system that introduces updates and computes hits as soon as new data become available. A query interface enables the user to search sequences by specifying constraints, such as retrieving sequences that contain specific motifs, or a defined arrangement of motifs ("metamotifs"), or filtering based on the taxonomic classification of a sequence. CONCLUSION: The software provides a generic and modular framework to handle the redundancy and incremental updates of biological databases, and an original query language. It is published under the terms and conditions of version 2 of the GNU Public License and available at
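The notion of a query for a defined arrangement of motifs can be illustrated with a small sketch. It is not HitKeeper's query language or data model, neither of which is described in this abstract; the `Hit` record, the function name, the inclusive coordinate convention and the example hits are all assumptions made for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """A predicted feature on a sequence (inclusive coordinates, assumed)."""
    motif: str
    start: int
    end: int


def matches_arrangement(hits, motifs, max_gap=None):
    """True if `motifs` occur in the given order without overlapping.

    `max_gap`, if set, limits the number of residues allowed between
    consecutive motifs.
    """
    def search(index, min_start, prev_end):
        if index == len(motifs):
            return True
        for hit in hits:
            if hit.motif != motifs[index] or hit.start < min_start:
                continue
            if (max_gap is not None and prev_end is not None
                    and hit.start - prev_end - 1 > max_gap):
                continue
            if search(index + 1, hit.end + 1, hit.end):
                return True
        return False

    return search(0, 0, None)


# Hypothetical hit lists keyed by sequence id.
hit_lists = {
    "seqA": [Hit("SH2", 10, 100), Hit("SH3", 120, 180), Hit("kinase", 200, 450)],
    "seqB": [Hit("SH3", 5, 60), Hit("SH2", 80, 170)],
    "seqC": [Hit("SH2", 10, 100), Hit("kinase", 600, 850)],
}

selected = sorted(
    seq_id for seq_id, hits in hit_lists.items()
    if matches_arrangement(hits, ["SH2", "kinase"], max_gap=150)
)
print(selected)  # ['seqA']
```

Without `max_gap`, seqC would also match, since its kinase hit follows the SH2 hit but lies 499 residues downstream.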
RNA Sequencing-Based Genome Reannotation of the Dermatophyte Arthroderma benhamiae and Characterization of Its Secretome and Whole Gene Expression Profile during Infection.
Dermatophytes are the most common agents of superficial mycoses in humans and animals. The aim of the present investigation was to systematically identify the extracellular, possibly secreted, proteins that are putative virulence factors and antigenic molecules of dermatophytes. A complete gene expression profile of Arthroderma benhamiae was obtained during infection of its natural host (guinea pig) using RNA sequencing (RNA-seq) technology. This profile was completed with those of the fungus cultivated in vitro in two media containing either keratin or soy meal protein as the sole source of nitrogen and in Sabouraud medium. More than 60% of transcripts deduced from RNA-seq data differ from those previously deposited for A. benhamiae. Using these RNA-seq data along with an automatic gene annotation procedure, followed by manual curation, we produced a new annotation of the A. benhamiae genome. This annotation comprised 7,405 coding sequences (CDSs), among which only 2,662 were identical to the currently available annotation, 383 were newly identified, and 15 secreted proteins were manually corrected. The expression profile of genes encoding proteins with a signal peptide in infected guinea pigs was found to be very different from that during in vitro growth when using keratin as the substrate. In particular, the sets of the 12 most highly expressed genes encoding proteases with a signal sequence during infection and in keratin medium had only the putative vacuolar aspartic protease gene PEP2 in common. The most upregulated gene encoding a secreted protease during infection was that encoding subtilisin SUB6, which is a known major allergen in the related dermatophyte Trichophyton rubrum. IMPORTANCE Dermatophytoses (ringworm, jock itch, athlete's foot, and nail infections) are the most common fungal infections, but their virulence mechanisms are poorly understood.
Combining transcriptomic data obtained from growth under various culture conditions with data obtained during infection led to a significantly improved genome annotation. About 65% of the protein-encoding genes predicted with our protocol did not match the existing annotation for A. benhamiae. Comparing gene expression during infection in guinea pigs with keratin degradation in vitro, which is supposed to mimic the host environment, revealed the critical importance of using real in vivo conditions for investigating virulence mechanisms. The analysis of genes expressed in vivo, encoding cell surface and secreted proteins, particularly proteases, led to the identification of new allergen and virulence factor candidates.
IL CARNICO DI LIERNA (COMO): STRATIGRAFIA E PALEOGEOGRAFIA [The Carnian of Lierna (Como): Stratigraphy and Paleogeography]
The Carnian sedimentary succession of the Lierna area, more than 400 m thick and belonging to the Coltignone tectonic unit (Grigne Group), has considerable paleogeographic significance, for it documents the western termination of the delta-lagoon-carbonate platform depositional system of the Lombard Prealps. The Carnian succession, directly overlying basinal (Perledo-Varenna Formation) to shallow-marine (Lierna Formation) carbonates of Ladinian age, is characterized by deltaic terrigenous deposits (Val Sabbia Sandstone), progressively replaced by lagoonal (Gorno Formation) to peritidal (Breno Formation) limestones. In the studied stratigraphic sections these three lithofacies alternate vertically at decametric scale, and are each characterized by different types of cyclothems at metric scale. High-frequency "punctuated aggradational cycles" (PAC) are best recognized at the top of the Val Sabbia clastic body, where the final transition to the Gorno lagoon is marked by arenite layers containing ferruginous ooids. Grains of this type have not been previously reported from Southalpine arenites. The petrographic composition of the Val Sabbia Sandstone points to provenance from an active volcanic chain, but abundance of quartzose detritus and occurrence of perthitic K-feldspar and sedimentary rock fragments suggest extensive erosion of the Hercynian basement and Upper Carboniferous to Ladinian sedimentary cover. These peculiar characteristics confirm the existence of a small "Lierna lobe" separated from the deltaic systems of central Lombardy, and fed by a basement high located southwest of Lake Como.
