Worm Breeder's Gazette 11(1): 36
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
The 12 kilobase sequence containing unc-15, reported by Kagawa et al. (1989), contains a segment, located approximately 3 kb upstream of the initiating ATG of unc-15, that appears to encode a C. elegans collagen. A single open reading frame (ORF) extends from base 1408 to base 2532 in this sequence. This ORF is 52% C+G, encodes many Gly-X-Y triplet repeats, and displays strong codon usage asymmetries for Pro, Gly, and Ala codons that are typical of C. elegans genes. The ORF contains neither obvious splice sites nor a region of lower C+G frequency suggestive of an intron; it thus appears to be a single uninterrupted exon. An AATAAA polyadenylation signal occurs at base 2577 in the sequence; there is no evidence of an inserted poly(A) tail that would suggest that the sequence is a reverse-transcribed pseudogene. The sequence contains a TATAA at base 1461 and, at base 1487, a CATCA, which may be associated with the start of transcription in collagen genes (Cox et al., 1989); these are followed by an ATG at base 1507. Together, these data suggest that the segment is an actual collagen gene containing no introns, rather than a pseudogene. This apparent new gene has not yet been named.
Assuming that the segment is indeed a collagen gene with an initiating ATG at base 1507, it encodes a polypeptide of 342 amino acids (AA). This polypeptide comprises an N-terminal leader of 108 AA, an initial Gly-X-Y domain of 30 AA, a gap of 15 AA, a second Gly-X-Y domain of 127 AA that includes three interruptions in the triplet repeating pattern, and a C-terminal tail of 62 AA. Except for the greater length of the C-terminal tail, the lengths of these domains are common to the major size class of C. elegans collagens (Cox et al., 1989). The positions of the cysteine residues flanking the Gly-X-Y repeats in this collagen are precisely the same as those in col-7, col-8, and col-19 (Cox et al., 1989); several of the residues surrounding these conserved cysteines are also identical in these four collagens.
col-7 is expressed primarily in adults (Cox and Hirsh, 1985); the new collagen may be a member of this expression class as well. col-7 is located within approximately 10 kb downstream of unc-15 (Kagawa et al., 1989); col-8 is on LG III (Cox et al., 1985). The location of col-19 is not yet known.
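The coordinate and domain arithmetic quoted in this abstract can be checked with a short Python sketch. It is an illustration only, not part of the original analysis. It assumes that base positions are 1-based and inclusive, and that base 2532 is the last base of the final sense codon (stop codon excluded); neither convention is stated in the abstract. The names codons_between and atg_in_frame are hypothetical helpers introduced here.

```python
# Coordinates as quoted in the abstract.
# ASSUMPTION: positions are 1-based and inclusive, and ORF_END is the last
# base of the final sense codon (the stop codon is not counted).
ORF_START, ORF_END = 1408, 2532
ATG_POS = 1507

# Domain lengths in amino acids, listed from N-terminus to C-terminus.
DOMAINS = [
    ("N-terminal leader", 108),
    ("first Gly-X-Y domain", 30),
    ("gap", 15),
    ("second Gly-X-Y domain", 127),
    ("C-terminal tail", 62),
]


def codons_between(start, end):
    """Return the number of whole codons in the inclusive interval [start, end]."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"interval of {length} bases is not a whole number of codons")
    return length // 3


def atg_in_frame(orf_start, atg_pos):
    """Return True if the ATG begins on a codon boundary of the ORF."""
    return (atg_pos - orf_start) % 3 == 0


orf_codons = codons_between(ORF_START, ORF_END)      # codons in the full ORF
coding_codons = codons_between(ATG_POS, ORF_END)     # codons from the ATG onward
domain_total = sum(length for _, length in DOMAINS)  # summed domain lengths

print("ORF codons:", orf_codons)
print("Codons from ATG:", coding_codons)
print("Sum of domain lengths:", domain_total)
print("ATG in frame with ORF:", atg_in_frame(ORF_START, ATG_POS))
```

Under these assumptions, the interval from the ATG at base 1507 to base 2532 spans 1026 bases, or 342 codons, which matches both the stated polypeptide length and the sum of the five domain lengths.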