Worm Breeder's Gazette 11(5): 13
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
The Great Adventure has begun! The joint St. Louis Washington University/Cambridge Laboratory of Molecular Biology project to sequence three megabases of contiguous genomic DNA has been funded under the M.R.C. and N.I.H. Human Genome initiatives. The intention is to develop methods and strategies that will allow us to sequence 1 Mb of DNA at each site in the final year of the three-year project. It will thus act as a pilot project for the eventual sequencing of the entire genome.
There are two elements to our general approach. The first is to reduce the redundancy associated with 'shotgun' sequencing schemes by maximizing the use of directed sequencing with synthetic oligonucleotides. The second is to make data retrieval and analysis more efficient by the use of direct read-out fluorescence-detection sequencing machines (both sites have ABI and Pharmacia devices) and sophisticated computer control of day-to-day data generation and analysis.
We are currently addressing a number of questions. Which vectors and insert sizes are most efficient for subcloning cosmids? What degree of shotgun redundancy should be obtained before oligo walking? How do the sequencing machines compare? Can double-stranded sequencing yield data of good enough quality? (Molly Craxton has developed some excellent double-stranded sequencing protocols, available on request, which work well on cosmid and lambda DNA as well as on subclones.)
We have begun to sequence in the cluster on chromosome III. The Cambridge group have started on the lin-9/unc-32 cosmid ZK637 and are proceeding rightwards to the edge of the cluster, while the St. Louis group have started sequencing around sup-5 on the cosmid ZK370. Obviously, the genome map is essential in allowing us to take such a logical, ordered approach. A 3 Mb sequence should extend from the right edge of the cluster leftwards beyond mec-14.
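As a rough illustration of the shotgun-redundancy question raised above, the sketch below uses a simple Poisson (Lander-Waterman style) approximation to estimate how much of a cosmid insert remains unsequenced, and roughly how many gaps would be left for oligo walking, at a given redundancy. This is not the project's own method. The 40 kb insert size, the 400 bp read length, and the function name `shotgun_coverage` are assumptions chosen purely for illustration, and the model ignores the minimum overlap needed to join reads as well as any cloning bias.

```python
import math


def shotgun_coverage(target_bp, read_bp, n_reads):
    """Poisson approximation for random shotgun reads over one target.

    Returns (redundancy, fraction_uncovered, expected_gaps).
    Simplifying assumptions: reads start uniformly at random, all reads
    have the same length, and no minimum overlap is required.
    """
    redundancy = n_reads * read_bp / target_bp       # average fold coverage
    fraction_uncovered = math.exp(-redundancy)       # chance a base is missed
    expected_gaps = n_reads * math.exp(-redundancy)  # reads with no successor
    return redundancy, fraction_uncovered, expected_gaps


if __name__ == "__main__":
    TARGET_BP = 40_000  # hypothetical cosmid insert size (assumption)
    READ_BP = 400       # hypothetical usable read length (assumption)
    for n_reads in (100, 200, 400, 600):
        c, uncovered, gaps = shotgun_coverage(TARGET_BP, READ_BP, n_reads)
        print(f"{n_reads:4d} reads: {c:.1f}x, "
              f"{uncovered:.2%} uncovered, ~{gaps:.1f} gaps")
```

Under these assumptions the uncovered fraction falls off exponentially with redundancy, so each extra fold of shotgun coverage closes fewer new bases than the previous one. That diminishing return is the reason for switching to directed oligo walking at some point rather than continuing to sequence random subclones.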
It is our intention to release sequence data to the community as rapidly as possible, probably within two months following completion (of cosmid-sized pieces). This will be done by incorporation into the databases currently being developed (see articles by R.D. and J. T-M in this issue and Bruce Schatz et al., WBG 11/4) and by submission to the EMBL/GenBank libraries. Initial sequence analysis (predicted coding regions, splice sites, library comparisons, etc.) will also be included.