Helix bioinformatics




The quest for gene function has not yet found an algorithmic solution

A further obstacle in the quest for gene function is that fragments originating from different genes can merge, allowing entirely new functions to emerge. This is what François Jacob meant by "evolutionary tinkering". Beyond these difficulties, which stem from the way living systems work, others arise because the available sequence databases are incomplete and contain errors. For these reasons, the results produced by software are no more than hypotheses, which must in turn be tested experimentally in the laboratory, in particular by observing the effects of substituting or deleting a gene in the organism or in a related one.

This is why the priority given to the human genome has sometimes been criticised. Some think it would be better to begin by sequencing and analysing the mouse genome, which has many genes homologous to human genes and which can be experimented on, rather than to tackle the human genome straight away, at the risk of accumulating hypotheses that cannot be validated in the short term.

Whatever the answer, given the inadequacy of a purely computational approach, determining the function of genes (or rather of the proteins they code for) is now a matter for the experts. As soon as the Drosophila genome had been sequenced, Craig Venter hosted what he called a "jamboree" for forty-five of the world's top specialists in fly genetics, bioinformatics and proteins, who spent eleven days comparing their opinions on the raw sequence he had just obtained in collaboration with more than thirty teams around the world. Only after this brainstorming session was an annotated sequence submitted to the rest of the scientific community and published in the journal Science. Clearly, systematising this "annotation" process is a considerable challenge for bioinformatics.

Once we think we have identified the sequence of a gene, what is the best way to fit together data and knowledge of various kinds and various origins, relating to several organisms, in order to predict the functions of that gene?

In this "anything goes" strategy, one key element is the way data and information are structured within computer systems, whose powerful capabilities allow the researcher to search and browse, to view data from a different perspective, and thus to draw new inferences. Although basic data such as sequences are easy to store, the computational representation of data about functions, for example those relating to metabolic pathways, remains an open problem for bioinformatics research. A look at the KEGG database confirms this: there, the data are presented only as images, available "at a click", certainly, but impossible to process using software.
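
To make the contrast with an image concrete, here is a minimal sketch of what a software-processable pathway description might look like. It is an illustration only, not the representation used by KEGG or by any tool mentioned on this site. The `Reaction` class, the gene identifiers (`gene_hk`, `gene_pgi`, `gene_pfk`) and the helper functions are all hypothetical names introduced for the example; the three reactions are the first steps of glycolysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """One enzymatic step; all field names are illustrative assumptions."""
    enzyme: str
    gene: str            # hypothetical gene identifier, not a real locus name
    substrates: tuple
    products: tuple


# First steps of glycolysis, written as data rather than drawn as a picture.
PATHWAY = [
    Reaction("hexokinase", "gene_hk",
             ("glucose", "ATP"), ("glucose-6-phosphate", "ADP")),
    Reaction("glucose-6-phosphate isomerase", "gene_pgi",
             ("glucose-6-phosphate",), ("fructose-6-phosphate",)),
    Reaction("phosphofructokinase", "gene_pfk",
             ("fructose-6-phosphate", "ATP"),
             ("fructose-1,6-bisphosphate", "ADP")),
]

# Cofactors that take part in many reactions and would blur the traversal.
CURRENCY = {"ATP", "ADP"}


def downstream(pathway, start):
    """Return the metabolites reachable from `start`, ignoring cofactors."""
    seen = set()
    frontier = [start]
    while frontier:
        metabolite = frontier.pop()
        for reaction in pathway:
            if metabolite in reaction.substrates:
                for product in reaction.products:
                    if product not in CURRENCY and product not in seen:
                        seen.add(product)
                        frontier.append(product)
    return seen


def lost_after_deletion(pathway, gene, start):
    """Metabolites no longer reachable from `start` once `gene` is removed."""
    remaining = [r for r in pathway if r.gene != gene]
    return downstream(pathway, start) - downstream(remaining, start)


if __name__ == "__main__":
    print(sorted(downstream(PATHWAY, "glucose")))
    print(sorted(lost_after_deletion(PATHWAY, "gene_pgi", "glucose")))
```

Once the pathway is held as structured data, questions such as "which metabolites depend on this gene?" become simple queries, which is precisely what a picture of the same pathway cannot offer. A realistic model would also need reversibility, stoichiometry, cellular compartments and links to sequence records, all of which this sketch leaves out.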

	The first genome projects
	Whole genome sequencing
	Genomic databases
	The problem of heterogeneous databases
	Searching for homology through similarity of sequences
	Finding genes in procaryotic genomes
	Finding genes in eucaryotic genomes
	Inferring gene functions from homology relationships
	The quest for gene function has not yet found an algorithmic solution
	Modeling and simulating gene interaction networks and metabolic pathways
	Biological data and knowledge need to be formalized