Helix bioinformatics


Publications

A software pipeline dedicated to automatic MS/MS data analysis


ECCB 2003 (European Conference on Computational Biology), Paris, September 27-30, 2003 (short paper)

E. Reguer, E. Nugues, R. Cahuzac, M. Ferro, T. Vermat, E. Mouton, J. Garin

With recent improvements in MS/MS QTOF spectrometers, biologists can now generate very large amounts of spectral data (up to 1500 peptides per day) that can no longer be analyzed manually. There is therefore a growing need for computer systems (pipelines) that fully automate protein identification from raw MS/MS data. So far, two main approaches have been proposed for this purpose:

1) Direct identification, which compares the raw MS/MS spectrum with every entry of a virtual MS/MS spectra database.

2) Indirect identification, which involves two successive steps: i) MS/MS spectrum interpretation (i.e. determination of amino acid sequences, as in the de novo sequencing approach), followed by ii) protein identification from the corresponding peptides.
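To make the direct approach concrete, the following is a minimal sketch, not the method of any particular search engine. It assumes singly charged b-ions only, monoisotopic residue masses, and a simple shared-peak count as the score; the function names (`b_ions`, `shared_peak_count`, `best_match`) and the tolerance value are illustrative assumptions.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728


def b_ions(peptide):
    """Theoretical m/z of singly charged b-ions (prefix fragments)."""
    masses = []
    total = PROTON
    for aa in peptide[:-1]:  # the full-length prefix is not a fragment
        total += MONO[aa]
        masses.append(total)
    return masses


def shared_peak_count(observed, theoretical, tol=0.02):
    """Number of theoretical peaks that have an observed peak within tol."""
    return sum(any(abs(o - t) <= tol for o in observed) for t in theoretical)


def best_match(observed, candidates, tol=0.02):
    """Candidate peptide whose virtual spectrum shares the most peaks."""
    return max(
        candidates,
        key=lambda p: shared_peak_count(observed, b_ions(p), tol),
    )
```

A real virtual-spectrum database would also include y-ions, multiple charge states, and a statistical score; the sketch only illustrates the "compare against every entry" principle.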

This paper presents an approach to automatic protein identification dedicated to high-throughput proteomics. It follows the indirect identification method but, unlike de novo sequencing, does not require the determination of long sequence stretches. It is based on the concept of the Protein Sequence Tag (PST). To fully exploit this concept, we designed two complementary software modules: Taggor, which generates PSTs from spectra (MS/MS data interpretation), and PepMap, which localizes PSTs on protein or genomic data (protein/gene identification).
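The sequence tag idea can be illustrated with a short sketch. This is not the Taggor or PepMap algorithm, which the abstract does not describe; it assumes a clean ladder of singly charged fragment peaks, reads short tags from mass differences that match a residue mass, and locates them by plain substring search. The names `extract_tags` and `locate_tag`, the tag length, and the tolerance are hypothetical choices.

```python
# Monoisotopic residue masses in daltons. Isoleucine is omitted because
# it is isobaric with leucine; "L" stands for either residue here.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residue_for(delta, tol):
    """Residue whose mass matches delta within tol, or None."""
    for aa, mass in RESIDUE_MASS.items():
        if abs(delta - mass) <= tol:
            return aa
    return None


def extract_tags(peaks, length=3, tol=0.01):
    """All tags of `length` residues readable from the peak list."""
    peaks = sorted(peaks)
    # edges[i] lists (j, residue) pairs where peak j follows peak i
    # at a distance equal to one residue mass.
    edges = {i: [] for i in range(len(peaks))}
    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            aa = residue_for(peaks[j] - peaks[i], tol)
            if aa:
                edges[i].append((j, aa))

    tags = set()

    def walk(i, tag):
        if len(tag) == length:
            tags.add(tag)
            return
        for j, aa in edges[i]:
            walk(j, tag + aa)

    for i in edges:
        walk(i, "")
    return tags


def locate_tag(tag, proteins):
    """(protein id, 0-based position) for every occurrence of the tag."""
    hits = []
    for pid, seq in proteins.items():
        seq = seq.replace("I", "L")  # match the I/L-collapsed alphabet
        pos = seq.find(tag)
        while pos != -1:
            hits.append((pid, pos))
            pos = seq.find(tag, pos + 1)
    return hits
```

For example, a b-ion ladder from the peptide PEPTIDE yields the tags EPT, PTL and TLD, and `locate_tag("PTL", {"P1": "MKPEPTIDER"})` reports the hit at position 4. The sketch ignores noise peaks, y-ion ladders (which read in the reverse direction), and the flanking masses that a full tag would normally carry.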
