Worm Breeder's Gazette 12(2): 30 (January 1, 1992)

These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.

Analysis of the Nematode Caenorhabditis elegans Genome Using Expressed Sequence Tags

W. R. McCombie, J. M. Kelley, C. Fields, M. FitzGerald, J. D. Gocayne, T. Utterback, M. Adams, M. Dubnick, A. Kerlavage, J. C. Venter

Section of Receptor Biochemistry and Molecular Biology, National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD 20892 USA.

We have begun to develop a database containing tags of the expressed genes of C. elegans. This will provide useful complementary data for interpreting the genome sequence. We also feel that the creation of such an encyclopedia of expressed genes will allow the C. elegans homologs of important new genes to be rapidly identified.

We started by sequencing randomly selected cDNA clones from a directionally cloned library made from mixed-stage animals (obtained from Stratagene, Inc., La Jolla, CA). Slightly over 500 unique clones were sequenced from the 3' end of the cDNA, and about 200 of these were also sequenced from the 5' end. We used these sequences to search protein and DNA sequence databases at the National Library of Medicine using the BLAST programs (Altschul, S., Gish, W., Miller, W., Myers, E. and Lipman, D., J. Mol. Biol. 215, 403 (1990)) and the NLM BLAST Server. Slightly over 40% of the sequences matched a sequence in the databases.
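As a rough check on the numbers above, the following Python sketch tallies the reads and the matched and unmatched clones. It is only an illustration: the counts are the rounded figures quoted in the text, the function name `summarize` is hypothetical, and applying the 40% match rate per clone (rather than per sequence read) is an assumption, since the text does not say which is meant.

```python
# Rounded figures from the text ("slightly over" and "about" are dropped).
CLONES_3PRIME = 500    # unique clones sequenced from the 3' end
CLONES_5PRIME = 200    # subset of those clones also sequenced from the 5' end
MATCH_FRACTION = 0.40  # fraction with a database match (ASSUMED to be per clone)


def summarize(clones_3p, clones_5p, match_fraction):
    """Return approximate read and clone counts for the EST survey."""
    total_reads = clones_3p + clones_5p            # every 5' read adds one sequence
    matched = round(clones_3p * match_fraction)    # clones with a database match
    unmatched = clones_3p - matched                # clones with no match
    return {
        "total_reads": total_reads,
        "matched_clones": matched,
        "unmatched_clones": unmatched,
    }


summary = summarize(CLONES_3PRIME, CLONES_5PRIME, MATCH_FRACTION)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under this per-clone assumption, about 200 clones have a database match and about 300 do not, which agrees with the "roughly 300 completely unidentified clones" mentioned in this abstract.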

Analysis of these results indicates that these EST sequences contain 110 C. elegans genes that are not in GenBank or PIR but are similar enough to genes from other species to be tentatively identified. We have begun to map these clones, as well as some of the roughly 300 completely unidentified clones, using polytene YAC grids (kindly provided by Alan Coulson). Since this is not a selected set of clones, several members of some closely related gene families have been found. The figure below shows the alignment of some of the collagen genes that have been detected. In addition, we have isolated several GTP-binding protein genes (an alignment example is shown below) and tumor suppressor analogs that we are characterizing in more detail. As a further example, see the article in this issue concerning RNA helicase cDNAs isolated using this approach (Fields and McCombie). We have submitted these sequences to GenBank, and they will soon be submitted for publication and to ACEDB. Anyone wishing more information on particular clones available in this set should contact Dick McCombie or Chris Fields at the address above.

[See Figures 1 & 2]

Figure 1

Figure 2