Worm Breeder's Gazette 12(3): 27 (June 15, 1992)

These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.

Expressed Sequence Tags of 1000 Clones from a C. elegans Embryonic cDNA Library

W. R. McCombie, J. M. Kelley, C. Fields, M. Fitzgerald, L. A. Aubin, T. Utterback, M. Adams, M. Dubnick, A. Kerlavage, J. C. Venter

Section of Receptor Biochemistry and Molecular Biology, National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD 20892 USA.

We are sequencing small regions of randomly selected cDNAs from C. elegans to create an encyclopedia of expressed sequence tags (ESTs) from the worm. This will allow the rapid identification of C. elegans homologs of newly discovered genes from other species. Our group and the C. elegans genome sequencing group have recently published papers describing the sequencing of about 1600 apparently unique ESTs from the worm (Waterston, et al, Nature Genetics 1: 114-123 (1992); McCombie, et al, Ibid 124-131 (1992)). These sequences were obtained from a mixed-stage library.

We have continued to sequence ESTs from the worm and have now sequenced about 1000 additional ESTs from a library constructed from embryonic-stage animals. We are still analyzing this latest set of sequences, but we would like to describe several preliminary observations. The embryonic library contains fewer highly repetitive clones than does the mixed-stage library. In addition, these appear to be a somewhat different set of repetitive clones than are found in the mixed-stage library. Most interestingly, the embryonic library appears to have a much lower percentage of clones that match something in the protein databases than does the mixed-stage library. While it is not possible at this time to rule out library construction biases as a cause of all of these observations, other possible explanations are intriguing. The differences could also arise because a larger number of unique genes are expressed during early development, and fewer of the genes expressed transiently during development are represented in existing sequence databases.

These sequences, plus those previously described (Waterston, et al, Nature Genetics 1: 114-123 (1992); McCombie, et al, Ibid 124-131 (1992)), represent about 2500 unique cDNA clones that have been tagged by this method. This is between 10 and 20% of the genes of C. elegans, based on the best estimate of the number of genes made by the genome sequencing group (Waterston, et al, Nature Genetics 1: 114-123 (1992); Sulston, et al, Nature 356: 37-41 (1992)). We are continuing our sequencing effort to generate more ESTs from the worm. We have decided to attempt to minimize redundancy by using several libraries made from stage-selected or type-selected worms rather than doing extensive selection or subtraction based on one library. This has been successful so far, although it may be necessary to change our tactics at some future time.
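
As a rough check of the coverage figure above, the arithmetic can be sketched in Python. This is only an illustration: the function name is hypothetical, and it assumes that each unique cDNA clone corresponds to one distinct gene, which the abstract does not claim. It simply inverts the stated 10 to 20% range to show the total gene count that range would imply.

```python
def implied_gene_range(tagged_clones, low_percent, high_percent):
    """Return (min_genes, max_genes) implied by a coverage range.

    Assumption (not from the abstract): one unique clone = one gene.
    Higher coverage implies fewer total genes, so the bounds swap.
    """
    min_genes = tagged_clones * 100 / high_percent
    max_genes = tagged_clones * 100 / low_percent
    return min_genes, max_genes


# Figures quoted in the text: about 2500 unique clones, 10 to 20% coverage.
min_genes, max_genes = implied_gene_range(2500, 10, 20)
print(f"Implied total gene count: {min_genes:.0f} to {max_genes:.0f}")
```

Under that one-clone-one-gene assumption, the quoted percentages correspond to a total of roughly 12,500 to 25,000 genes.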

In addition to sequencing small portions of many cDNAs, we have chosen several clones for more detailed analysis. Among these are multiple members of the rab family of proteins and members of the tat binding protein/valosin family of proteins. Progress on the latter project is described in an additional abstract in this issue of the Gazette (McCombie, et al.). For more detailed information on any of the sequences we have obtained, or to get any of our clones, please contact Dick McCombie or Chris Fields. Sequences can also be obtained via anonymous ftp from "briggs.ninds.nih.gov".

Literature Cited:

Waterston, et al, Nature Genetics 1: 114-123 (1992)

McCombie, et al, Ibid 124-131 (1992)

Sulston, et al, Nature 356: 37-41 (1992)