Worm Breeder's Gazette 14(1): 17 (October 1, 1995)
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri, USA and Sanger Centre, Hinxton Hall, Cambridge, UK.
A total of 22.5 megabases of sequence from 732 clones has been finished to date by the Consortium, with the following breakdown by chromosome:

    II   6.7 Mb
    III  7.2 Mb
    IV   0.3 Mb
    X    8.3 Mb

An additional 15 Mb of sequence data is in various stages of completion, bringing the total sequence available to more than 37.5 Mb. The gene-rich regions of chromosomes II and III are complete with the exception of some gaps where cosmids were not available (Figure 1). While we are rescuing these regions, we will continue to sequence the gene-rich regions of other chromosomes. Sequencing on chromosome X is well advanced and some cosmids are now finished on chromosome IV. We have begun library construction and some shotgun sequencing on chromosome V and will move finally to chromosome I. Efforts are also underway to develop strategies for sequencing the gene-poor regions of each of the chromosomes.

Currently, approximately 46% of all predicted genes have significant database similarities. The current prediction for total gene number in C. elegans is 13526 (+/-500).

The Consortium provides preliminary sequence data for all clones currently in production, whether they are partially finished, finished but not yet fully annotated, or submitted to GenBank/EMBL. Sequences for those clones which are started but not yet submitted are provided with the caveat that they are preliminary and often contain errors. It should also be noted that the segment submitted from a cosmid will often not correspond to the full insert. However, the information about the actual start and end of the cosmid insert sequence (starting and ending positions within the neighboring cosmids) is available in ACeDB and in the GenBank/EMBL submissions.
All sequences which have been submitted to GenBank/EMBL, or which are finished but not yet fully annotated, are also available in ACeDB data releases obtained via anonymous ftp from:

    ncbi.nlm.nih.gov (130.14.20.1) in the USA, in repository/acedb
    ftp.sanger.ac.uk in England, in pub/acedb
    lirmm.lirmm.fr (193.49.104.10) in France, in genome/acedb

In ACeDB, cosmids which have not yet been manually reviewed and fully annotated are denoted with gene predictions labeled with COSMID_NAME.alphabetic_character. Those which have been fully annotated are indicated by gene predictions labeled with COSMID_NAME.digit.

The ftp and web sites for St. Louis and the Sanger Centre are:

    St. Louis:
      ftp: genome.wustl.edu (directory: /pub/gsc1/sequence/st.louis/elegans)
      www: http://genome.wustl.edu/gsc/gschmpg.html
    Sanger Centre:
      ftp: ftp.sanger.ac.uk (directory: /pub/C.elegans_sequences)
      www: http://www.sanger.ac.uk/

For ftp'ing, log in as user "anonymous" and give a user identifier as password. For connection to the WWW sites, MOSAIC or NETSCAPE can be used to open the URLs listed above.

A variety of information is available at one or both WWW/ftp sites, including software used in the project, acedb documentation, personnel information, cosmid sequences, lists of cosmids in map order providing overlap information between the submitted sequences, and weekly lists of protein and cDNA similarities for cosmids which were finished that week. At the Sanger Centre WWW site, a service is available to BLAST your sequence against all cosmids currently in production. This service will soon be implemented in St. Louis as well.

For further information on C. elegans gene predictions and annotation from the sequencing projects, please contact John Spieth (jspieth@watson.wustl.edu) or Steve Jones (sjj@sanger.ac.uk). For information on sequencing plans or estimated completion times, please contact Richard Wilson (rwilson@watson.wustl.edu) or Alan Coulson (alan@sanger.ac.uk).
For additional information on the distribution of ACeDB, please contact Richard Durbin (rd@sanger.ac.uk) or Jean Thierry-Mieg (mieg@kaa.cnrs-mop.fr). All requests for cosmid clones should be addressed to Alan Coulson. For further information about the C. elegans genome project including our policy statement about sharing both data and sequencing expertise, please contact Richard Wilson. See WBG for figure.
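
The ACeDB labelling convention described above (COSMID_NAME.alphabetic_character for cosmids not yet fully annotated, COSMID_NAME.digit for fully annotated ones) can be checked mechanically when working through a list of gene predictions. The following is only a minimal sketch in Python: the function name, the regular expression, and the example labels are hypothetical, and it assumes that a cosmid name contains only letters and digits and that the "digit" suffix may run to more than one digit. It is not part of ACeDB or of the Consortium's software.

```python
import re

# Assumed label shape: COSMID_NAME, a dot, then either a single letter
# (not yet fully annotated) or a number (fully annotated). The character
# set allowed in COSMID_NAME is an assumption made for illustration.
_PREDICTION = re.compile(r"^(?P<cosmid>[A-Za-z0-9]+)\.(?P<suffix>[A-Za-z]|\d+)$")


def annotation_status(label):
    """Return (cosmid_name, status) for a gene prediction label,
    or None if the label does not fit the assumed pattern."""
    match = _PREDICTION.match(label)
    if match is None:
        return None
    suffix = match.group("suffix")
    status = "fully annotated" if suffix.isdigit() else "preliminary"
    return match.group("cosmid"), status


if __name__ == "__main__":
    # Made-up labels, used only to exercise the function.
    for label in ("ABC12.a", "ABC12.3", "ABC12"):
        print(label, annotation_status(label))
```

Labels that do not match the assumed pattern return None rather than a guess, so unexpected naming in a data release shows up instead of being silently misclassified.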