
Data release policy

All sequence data contained in CyMoBase has been acquired by manual inspection of genomic DNA, where possible, and of cDNA/EST data. Genomic DNA sequence is available by the generous courtesy of the genome sequencing centers (see list of references for details). We are very grateful for their policy, which allows their genome data to be analysed for single-gene investigations. Manual inspection of the genomic DNA is absolutely necessary because no known gene prediction algorithm is able to predict most of the genes correctly, even after extensive training with cDNA/EST data. To name just a few of the problems: the EST data is, of course, far from complete; very small exons (e.g. exons of 15 bp) are not recognised; N- and C-termini are not correctly predicted, leading to the fusion of neighbouring genes; and rarely used intron recognition sites (e.g. GC-AG) are overlooked. However, using the possibilities of comparative genomics, most genes can be predicted correctly. This process involves continuous reinvestigation of the data as soon as genome data from further species becomes available. Although we do whatever is possible to ensure the high quality of our data, please keep in mind that the data not only grows but may also change for some sequences from version to version. In the Data/Proteins menu you can find a detailed tracking record of all changes.
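As an illustration of the point about rarely used intron recognition sites, the following minimal Python sketch classifies an intron by its terminal dinucleotides. It is not part of CyMoBase or of any gene prediction tool; the function name `classify_intron` and its return labels are hypothetical and chosen only for this example.

```python
# Hypothetical helper, not part of CyMoBase: label an intron by the
# dinucleotides at its 5' (donor) and 3' (acceptor) ends.
def classify_intron(intron_seq: str) -> str:
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to carry both recognition sites")
    donor, acceptor = seq[:2], seq[-2:]
    if (donor, acceptor) == ("GT", "AG"):
        return "canonical GT-AG"
    if (donor, acceptor) == ("GC", "AG"):
        # The rarely used site mentioned above, which is easy to overlook
        # when only GT-AG boundaries are searched for.
        return "rare GC-AG"
    return f"non-canonical {donor}-{acceptor}"
```

A search that accepts only the first case would miss every intron of the second kind, which is one reason manual inspection remains necessary.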

For technical reasons we will provide a cumulative update every Monday.

Newly added sequences should be regarded as unpublished data. If you make use of these sequences, please read our Guidelines on use of data in publications.

Proteins	42
Sequences	27253
Amino Acids	35949299
Domains	185
Species	1529
Projects	12173
WGS-Projects	5287
Publications	697
