Worm Breeder's Gazette 11(5): 13

These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.

Genome Sequencing

Sulston, Coulson, Craxton, Shownkeen, Hawkins, Metzstein, Ainscough, Staden, Durbin, Thierry-Mieg, Waterston, Wilson, Lutterbach, Kozono, Du, Qui, Green and Weigelt

The Great Adventure has begun! 
The joint project between Washington University, St. Louis, and the 
Cambridge Laboratory of Molecular Biology to sequence three megabases 
of contiguous genomic DNA has been funded under the M.R.C. and N.I.H. 
Human Genome initiatives. The intention is to develop methods and 
strategies that will allow us to sequence 1 Mb of DNA at each site in 
the final year of the three-year project. It will thus act as a pilot 
project for the eventual sequencing of the entire genome. 
There are two elements to our general approach. The first is to 
reduce the redundancy associated with 'shotgun' sequencing schemes by 
maximizing the use of directed sequencing with synthetic 
oligonucleotides. The second is to make data retrieval and analysis 
more efficient by the use of direct read-out fluorescence-detection 
sequencing machines (both sites have ABI and Pharmacia devices) and 
sophisticated computer control of day-to-day data generation and 
analysis. We are currently addressing a number of problems: Which 
vectors and insert sizes are most efficient for subcloning cosmids? 
What degree of shotgun redundancy should be obtained before oligo 
walking? How do the sequencing machines compare? Can double-stranded 
sequencing yield data of sufficient quality? (Molly Craxton has 
developed some excellent double-stranded sequencing protocols, 
available on request, which work well on cosmid and lambda DNA as 
well as on sub-clones.) 
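The trade-off between shotgun redundancy and oligo walking can be 
illustrated with a rough calculation. The sketch below is not the 
project's own method: it assumes reads land uniformly at random 
(a Poisson model that ignores edge effects and cloning bias), and the 
cosmid length and read length are illustrative values chosen for the 
example, as are the function names. 

```python
import math


def expected_fraction_covered(redundancy: float) -> float:
    """Poisson estimate of the fraction of bases hit by at least one read.

    Assumes reads start uniformly at random along the insert.
    """
    return 1.0 - math.exp(-redundancy)


def reads_needed(insert_length: int, read_length: int, redundancy: float) -> int:
    """Number of reads required to reach a given average redundancy."""
    return math.ceil(redundancy * insert_length / read_length)


# Illustrative values only; neither figure comes from the project itself.
COSMID_LENGTH = 40_000  # assumed insert size in bases
READ_LENGTH = 400       # assumed usable bases per read

if __name__ == "__main__":
    for redundancy in (1, 2, 3, 5, 8):
        covered = expected_fraction_covered(redundancy)
        uncovered_bases = COSMID_LENGTH * (1.0 - covered)
        n_reads = reads_needed(COSMID_LENGTH, READ_LENGTH, redundancy)
        print(
            f"{redundancy}x: {n_reads} reads, "
            f"{covered:.1%} covered, "
            f"about {uncovered_bases:.0f} bases left for directed walking"
        )
```

Under these assumptions each further unit of redundancy closes a 
smaller share of what is still uncovered, which is why stopping the 
shotgun phase early and finishing with directed reads may be more 
economical. 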
We have begun to sequence in the cluster on chromosome III. The 
Cambridge group have started on the lin-9/unc-32 cosmid ZK637 and are 
proceeding rightwards to the edge of the cluster, while the St. Louis 
group have started sequencing around sup-5 on the cosmid ZK370. 
The genome map is essential in allowing us to take such a logical, 
ordered approach. A 3 Mb sequence should extend from the right 
edge of the cluster leftwards beyond mec-14. 
It is our intention to release sequence data to the community as 
rapidly as possible, probably within two months of the completion of 
each cosmid-sized piece. This will be done by incorporation into the 
databases currently being developed (see the articles by R.D. and 
J. T-M in this issue and Bruce Schatz et al., WBG 11/4) and by 
submission to the EMBL/GenBank libraries. Initial sequence analysis 
(predicted coding regions, splice sites, library comparisons etc.) 
will also be included.