641 research outputs found
Bioinformatics
Research in biology today cannot be understood without computation. Driven chiefly by the
development of genomic technologies, biology has changed in a very short time from a
science in which human effort went mainly into obtaining a small amount of data to one
that generates an enormous volume of data with little or no human intervention. The effort
of researchers has consequently shifted from data production to data analysis.
Computational methods play an essential role in this shift: in the planning of experiments,
in their execution and, above all, in the storage and analysis of their results. These
methods make up a new scientific discipline, called bioinformatics. In this article we
review, from a historical perspective, the foundations of this discipline, which are built
around the concept, understood very generically, of alignment and similarity between
sequences.
Bioinformatics: a science without scientists?
The Human Genome Project has brought biological research an unprecedented presence in the media. This media impact is not gratuitous. Knowledge of the nucleotide sequence of the human genome, and of the amino acid sequences of the proteins encoded in that genome, will, it is said, have an extraordinary impact on medicine, agriculture and many industrial processes. It will therefore have economic, social and perhaps even political repercussions. In short, it will deeply affect our lives, and it is only natural that it arouses our interest.
Performance and Scalability of Discriminative Metrics for Comparative Gene Identification in 12 Drosophila Genomes
Comparative genomics of multiple related species is a powerful methodology for the discovery of functional genomic elements, and its power should increase with the number of species compared. Here, we use 12 Drosophila genomes to study the power of comparative genomics metrics to distinguish between protein-coding and non-coding regions. First, we study the relative power of different comparative metrics and their relationship to single-species metrics. We find that even relatively simple multi-species metrics robustly outperform advanced single-species metrics, especially for shorter exons (≤240 nt), which are common in animal genomes. Moreover, the two capture largely independent features of protein-coding genes, with different sensitivity/specificity trade-offs, such that their combinations lead to even greater discriminatory power. In addition, we study how discovery power scales with the number and phylogenetic distance of the genomes compared. We find that species at a broad range of distances are comparably effective informants for pairwise comparative gene identification, but that these are surpassed by multi-species comparisons at similar evolutionary divergence. In particular, while pairwise discovery power plateaued at larger distances and never outperformed the most advanced single-species metrics, multi-species comparisons continued to benefit even from the most distant species with no apparent saturation. Last, we find that genes in functional categories typically considered fast-evolving can nonetheless be recovered at very high rates using comparative methods. Our results have implications for comparative genomics analyses in any species, including human.
In silico meets in vivo
A report of the 6th Georgia Tech-Oak Ridge National Lab International Conference on Bioinformatics 'In silico Biology: Gene Discovery and Systems Genomics', Atlanta, USA, 15-17 November, 2007
Erratum to: ‘DECKO: Single-oligo, dual-CRISPR deletion of genomic elements including long non-coding RNAs’
Comparative gene finding in chicken indicates that we are closing in on the set of multi-exonic widely expressed human genes
The recent availability of the chicken genome sequence poses the question of whether there are human protein-coding genes conserved in chicken that are currently not included in the human gene catalog. Here, we show, using comparative gene finding followed by experimental verification of exon pairs by RT-PCR, that the addition to the multi-exonic subset of this catalog could be as little as 0.2%, suggesting that we may be closing in on the human gene set. Our protocol, however, has two shortcomings: (i) the bioinformatic screening of the predicted genes, applied to filter out false positives, cannot handle intronless genes; and (ii) the experimental verification could fail to identify expression at a specific developmental time. This highlights the importance of developing methods that could provide a reliable estimate of the number of these two types of gene.
Multiple non-collinear TF-map alignments of promoter regions
Background: The analysis of the promoter sequences of genes with similar expression patterns is a basic tool for annotating common regulatory elements. Multiple sequence alignments are the basis of most comparative approaches. The characterization of regulatory regions from co-expressed genes at the sequence level, however, often does not yield satisfactory results, because promoter regions of genes sharing similar expression programs frequently show no nucleotide sequence conservation. Results: In a recent approach to circumvent this limitation, we proposed aligning the maps of predicted transcription factors (referred to as TF-maps) instead of the nucleotide sequences of two related promoters, taking into account the label of the corresponding factor and its position in the primary sequence. We have now extended the basic algorithm to permit multiple promoter comparisons using the progressive alignment paradigm. In addition, non-collinear conservation blocks can now be identified in the resulting alignments. We have optimized the parameters of the algorithm on a small but well-characterized collection of human-mouse-chicken-zebrafish orthologous gene promoters. Conclusion: Results on this dataset indicate that TF-map alignments are able to detect high-level regulatory conservation at the promoter and the 3'UTR gene regions, which cannot be detected by typical sequence alignments. Three particular examples are introduced here to illustrate the power of multiple TF-map alignments to characterize conserved regulatory elements in the absence of sequence similarity. We consider that this kind of approach can be extremely useful in the future to annotate potential transcription factor binding sites on sets of co-regulated genes from high-throughput expression experiments.
SECISaln, a web-based tool for the creation of structure-based alignments of eukaryotic SECIS elements
Summary: Selenoproteins contain the 21st amino acid, selenocysteine, which is encoded by an in-frame UGA codon, usually read as a stop. In eukaryotes, its co-translational recoding requires the presence of an RNA stem–loop structure, the SECIS element, in the 3' untranslated region (UTR) of selenoprotein mRNAs. Despite little sequence conservation, SECIS elements share the same overall secondary structure. Until recently, the lack of a significantly high number of selenoprotein mRNA sequences hampered the identification of other potential sequence conservation. In this work, the web-based tool SECISaln provides for the first time an extensive structure-based sequence alignment of SECIS elements resulting from the well-defined secondary structure of the SECIS RNA and the increased size of the eukaryotic selenoproteome. We have used SECISaln to improve our knowledge of SECIS secondary structure and to discover novel, conserved nucleotide positions, and we believe it will be a useful tool for the selenoprotein and RNA scientific communities.
