Helix bioinformatics



What is bioinformatics? > A series of five pedagogical papers on bioinformatics

In silico annotation of genomic sequences


		L'annotation in silico des séquences génomiques

Médecine / Sciences 2002 ; 18 : 237-250

Claudine Médigue, Stéphanie Bocs, Laurent Labarre, Catherine Mathé, David Vallenet

Abstract: For the first time in history, we have access to the entire genetic content of a growing number and variety of living organisms. This explosive growth of information is forcing changes in many scientific disciplines, particularly in computational biology and molecular genetics. One of the challenges is to predict and annotate the functions of gene products as rapidly and completely as possible, taking into account both molecular interactions and higher-order cellular processes. The first level of sequence annotation consists of gene finding and functional prediction of the gene products using similarity searches in protein databanks. This step remains easier for prokaryotic genomes, since the gene structure of these organisms is much simpler than that of eukaryotes. Predicting function from sequence using computational tools is generally done for each gene individually. Other levels of annotation, such as the identification of interactions between the genomic elements characterized in the first step, are more difficult to achieve. Although protein function is currently best described in the context of molecular interactions, it will be possible in the near future to predict function in the context of higher-order processes such as the regulation of gene expression, metabolic pathways and signaling cascades. Besides the information from completely sequenced genomes, this latter analysis also uses additional information from proteomics and expression data. New infrastructures that integrate the various levels of sequence annotation and function prediction are clearly required. This paper focuses on the various facets of in silico sequence annotation, which is far from perfect despite the fact that sequencing itself is highly automated and accurate, and despite the fact that (or maybe because…) sequence information is described in simple linear form, using a four-letter alphabet.
There remains a long way to go before we are able to describe molecular processes quantitatively. However, there is no doubt that in silico sequence analysis is extremely powerful, and hypotheses generated by computational methods will increasingly be the first successful step in the design of in vivo/in vitro experiments.
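
To make the first annotation level mentioned in the abstract more concrete, here is a minimal sketch of gene finding on a prokaryote-like sequence. It is not the method of the paper: it assumes that a candidate gene is simply an open reading frame (ATG followed in frame by a stop codon of the standard genetic code), scans only the forward strand, and ignores alternative start codons. The function name `find_orfs` and the parameter `min_codons` are hypothetical names chosen for this illustration.

```python
# Naive forward-strand ORF scan. Illustrative assumption only,
# not the annotation method described in the paper.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end, frame) tuples for ATG...stop ORFs.

    start is 0-based and inclusive; end is exclusive and includes the
    stop codon. min_codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume the search after this stop codon
    return sorted(orfs)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTGGGTAACC"))  # [(2, 17, 2)]
```

A scan of this kind is only plausible for prokaryotes, whose genes are mostly uninterrupted; eukaryotic genes split into exons and introns would need a gene-structure model rather than a simple frame scan. Functional prediction would then compare the translated candidates against protein databanks, which is outside the scope of this sketch.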

Download the full text of the paper (in French)
(PDF, 157.7 kB)

	In silico annotation of genomic sequences
	Biological data and knowledge modeling
	Modelling, analysis and simulation of gene networks
	Comparative genomic mapping in mammals
	Molecular phylogeny and evolution