Worm Breeder's Gazette 14(4): 13 (October 1, 1996)
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
Genome Sequencing Center, Washington University School of Medicine, St. Louis, MO, USA and The Sanger Centre, Hinxton Hall, Cambridge, UK
The Consortium has finished just over 52 megabases of C. elegans genomic sequence from more than 1700 cosmids. The breakdown by chromosome is as follows: I, 4.5 Mb; II, 8.8 Mb; III, 7.3 Mb; IV, 7.1 Mb; V, 9.1 Mb; X, 14.9 Mb. Most of the gene-rich regions of the genome represented in cosmids (~80%) are either finished or in some stage of library construction or production. Of the remaining ~20%, preliminary data suggest that some may be represented in a fosmid library (see the article "C. elegans and C. briggsae arrayed DNA libraries" in this issue). Techniques are being developed to retrieve the remaining sequence from YAC subclones.
The finished sequence contains around 9700 predicted proteins, of which 47% have significant similarity to genes from other organisms. The WORMPEP database contains nearly 7300 of the predicted proteins and is retrievable by ftp from the Sanger Centre. The number of tRNAs is now almost 300. One or more EST sequences match 32% of the predicted genes, confirming that these are real genes.
In addition to C. elegans, selected C. briggsae clones are also being sequenced. To date, 11 C. briggsae clones (8 fosmids and 3 cosmids) have been finished, with another 46 fosmids in various stages of production. As many as 20 megabases of the C. briggsae genome may eventually be sequenced as part of the C. elegans Genome Project. Plans are being made to include the C. briggsae sequence in future ACEDB releases. The C. briggsae sequences are available by ftp from the Genome Sequencing Center in St. Louis.
All C. elegans sequence data are made available by anonymous ftp and the World Wide Web from the Sanger Centre and the Genome Sequencing Center once they have completed the initial "shotgun" and assembly phases of sequencing. Each site holds only its own unfinished sequence. Both sites now provide on-line searching of the finished and unfinished sequences held at the respective site.
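As a quick arithmetic check on the figures above, the per-chromosome totals can be tallied in a few lines of Python. This is only an illustrative sketch: the variable names are ours, the values are the rounded figures quoted in the text, and treating "around 9700" as an exact count is an assumption made purely for the arithmetic.

```python
# Rounded per-chromosome totals (Mb) as quoted in the text above.
finished_mb = {"I": 4.5, "II": 8.8, "III": 7.3, "IV": 7.1, "V": 9.1, "X": 14.9}

# Sum of the rounded figures; each is quoted to 0.1 Mb, so the true
# total may differ from this by a few tenths of a megabase.
total_mb = round(sum(finished_mb.values()), 1)

# Share of the finished sequence contributed by each chromosome (percent).
share_pct = {
    chrom: round(100 * mb / total_mb, 1) for chrom, mb in finished_mb.items()
}

# Assumption: "around 9700" predicted proteins is treated as exactly 9700.
predicted_proteins = 9700
with_similarity = predicted_proteins * 47 // 100  # 47% similar to other organisms

print(f"total finished: {total_mb} Mb")
for chrom, pct in share_pct.items():
    print(f"  {chrom}: {finished_mb[chrom]} Mb ({pct}%)")
print(f"predicted proteins with similarity: about {with_similarity}")
```

The rounded figures sum to 51.7 Mb, close to the stated total of just over 52 Mb; the small gap may reflect rounding of the individual figures.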
We actively curate the sequence, and would like to hear from you when you determine correct gene structure from cDNA data, or if you think you have found a sequence error.
For further information on the C. elegans gene predictions and annotations from the sequencing project, contact John Spieth (email@example.com) or Steve Jones (firstname.lastname@example.org). For information on the distribution of ACEDB, contact Richard Durbin (email@example.com) or Jean Thierry-Mieg (firstname.lastname@example.org). For information on sequencing plans or estimated completion times, contact Richard Wilson (email@example.com) or Alan Coulson (firstname.lastname@example.org). All requests for cosmid clones should be sent to Alan Coulson (email@example.com).
The ftp and WWW sites for St. Louis and the Sanger Centre are:
St. Louis: ftp: genome.wustl.edu (directory /pub/gsc1/sequence/st.louis/elegans); WWW: http://genome.wustl.edu/gsc/gschmpg.html
Sanger Centre: ftp: ftp.sanger.ac.uk (directory /pub/databases/C.elegans_sequences); WWW: http://www.sanger.ac.uk/
ACEDB data releases can be obtained from:
ncbi.nlm.nih.gov (22.214.171.124) in the USA, in repository/acedb
ftp.sanger.ac.uk (126.96.36.199) in the UK, in pub/acedb
lirmm.lirmm.fr (188.8.131.52) in France, in genome/acedb
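For readers who prefer to script retrieval, the anonymous ftp locations listed above could be accessed roughly as follows. This is a minimal sketch using Python's standard ftplib module; the names SEQUENCE_SITES and list_sequence_files are hypothetical, and the hosts and directories are simply those printed above, which may change.

```python
from ftplib import FTP

# Host and directory for each site, copied from the listing above.
SEQUENCE_SITES = {
    "St. Louis": ("genome.wustl.edu", "/pub/gsc1/sequence/st.louis/elegans"),
    "Sanger Centre": ("ftp.sanger.ac.uk", "/pub/databases/C.elegans_sequences"),
}


def list_sequence_files(site, timeout=30):
    """Return the file names in a site's sequence directory via anonymous ftp."""
    host, directory = SEQUENCE_SITES[site]
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # no arguments means an anonymous login
        ftp.cwd(directory)
        return ftp.nlst()
```

Calling list_sequence_files("Sanger Centre") would return that site's directory listing, provided the server is reachable. Because each site holds only its own unfinished sequence, both would need to be queried for a complete picture.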