The problem of heterogeneous databases
Translated from "Donner un sens au génome", La Recherche, n° 332, June 2000
 

Each database addresses different biological questions, and this shapes the way its data are structured. Each therefore has its own conceptual schema, so hoping to organise all the genomic data (the sequences and the various other data attached to them) within a single database is a lost cause. Their integration, on the other hand, does need to be improved: it should be easier to search several databases at once in response to a complex query from a biologist who has their own way of approaching a problem. This is as much a conceptual issue as a technical one. How can different databases be reconciled when their structures rest on different definitions, which all too often are not even made explicit, and when those definitions concern concepts as fundamental as the gene itself? Some databases consider the gene to be limited to the regions of DNA that code for its product or products (protein or RNA), while for others it also includes the various regions that come into play during transcription (from DNA into RNA) and translation (from RNA into protein), that is, a large number of regulatory sequences.
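As an illustration only, the following minimal Python sketch shows how two databases that define "gene" differently can give different answers to the same query, and why a joint search has to keep each definition explicit. Everything in it is hypothetical: the database names (`db_a`, `db_b`), the gene name `geneX`, the coordinates, and the helper functions `genes_at` and `reconcile` are assumptions made for the example and do not describe any real database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    source: str      # hypothetical database name
    name: str        # gene name as stored in that database
    start: int       # first position of the gene on the DNA sequence (inclusive)
    end: int         # last position of the gene (inclusive)
    definition: str  # what this database means by "gene"


# Hypothetical database A: a gene is only its coding region.
DB_A = [GeneRecord("db_a", "geneX", 1200, 2100, "coding")]

# Hypothetical database B: a gene also includes its regulatory sequences.
DB_B = [GeneRecord("db_b", "geneX", 900, 2300, "coding+regulatory")]


def genes_at(position, *databases):
    """Search several databases at once for records covering a position."""
    return [r for db in databases for r in db if r.start <= position <= r.end]


def reconcile(records):
    """Group records by gene name, keeping each definition visible."""
    merged = {}
    for r in records:
        merged.setdefault(r.name, {})[r.definition] = (r.start, r.end)
    return merged


# Position 1000 lies in a regulatory region: only database B reports geneX.
hits_regulatory = genes_at(1000, DB_A, DB_B)

# Position 1500 lies in the coding region: both databases report geneX.
hits_coding = genes_at(1500, DB_A, DB_B)

print([r.source for r in hits_regulatory])
print(reconcile(hits_coding))
```

In this sketch the `definition` field is what makes the two sources comparable: without it, the two extents of `geneX` would simply look contradictory.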

Remember that the term 'genome' is not without ambiguity either. Generally, it refers to the DNA macromolecules contained in the chromosomes, but there is also non-chromosomal DNA, in the plasmids of bacteria and in the organelles (mitochondria or chloroplasts, for example) of eukaryotic organisms. The term is also used for the complete set of genes of an organism.

 