Worm Breeder's Gazette 11(1): 36

These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.

A Collagen Gene 3 kb Upstream of unc-15

Chris Fields

The 12 kilobase sequence containing unc-15, reported by Kagawa et al. 
(1989), contains a segment, located approximately 3 kb upstream of 
the initiating ATG of unc-15, that appears to encode a C. elegans 
collagen.  A single open reading frame extends from base 1408 to 2532 
in this sequence.  This ORF is 52% C+G, encodes many Gly-X-Y triplet 
repeats, and displays strong codon usage asymmetries for Pro, Gly, and 
Ala codons that are typical of C. elegans genes.  The ORF contains 
neither obvious splice sites, nor a region of lower C+G frequency 
suggestive of an intron; it thus appears to encode a single 
uninterrupted exon.  An AATAAA polyadenylation signal occurs at base 
2577 in the sequence; there is no evidence of an inserted polyA tail 
that would suggest that the sequence is a reverse-transcribed 
pseudogene.  The sequence contains a TATAA at base 1461, and a CATCA, 
which may be associated with the start of transcription in collagen 
genes (Cox et al., 1989), at base 1487; these are followed by an ATG 
at base 1507.  These data together suggest that the segment is an 
actual collagen gene containing no introns, rather than a pseudogene.  
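
The checks described above (C+G content, promoter and polyadenylation motifs, Gly-X-Y repeats) can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function names and the toy sequences are hypothetical, the real 12 kb sequence is not reproduced here, and the minimum run length of three triplets is an arbitrary choice rather than a criterion taken from the analysis.

```python
import re

def gc_fraction(dna: str) -> float:
    """Fraction of bases that are C or G; an empty sequence gives 0.0."""
    dna = dna.upper()
    return (dna.count("C") + dna.count("G")) / len(dna) if dna else 0.0

def find_signals(dna: str, motifs=("TATAA", "CATCA", "AATAAA", "ATG")) -> dict:
    """Map each motif to the 1-based start positions of all its occurrences."""
    dna = dna.upper()
    # A lookahead pattern lets overlapping occurrences be reported as well.
    return {m: [hit.start() + 1 for hit in re.finditer(f"(?={m})", dna)]
            for m in motifs}

def gly_xy_runs(protein: str, min_triplets: int = 3) -> list:
    """Return (1-based start, length in residues) for each run of Gly-X-Y triplets."""
    # min_triplets is an assumed threshold, not a value from the abstract.
    pattern = re.compile(r"(?:G..){%d,}" % min_triplets)
    return [(hit.start() + 1, len(hit.group()))
            for hit in pattern.finditer(protein.upper())]

# Toy inputs, invented purely for illustration.
toy_dna = "TATAACCCATCAGGATGGGT"
toy_protein = "MKCGPPGAPGEPCC"

signals = find_signals(toy_dna)
runs = gly_xy_runs(toy_protein)
```

On the toy DNA, `signals` locates TATAA, CATCA and ATG in that order and finds no AATAAA; on the toy protein, `runs` reports a single run of three Gly-X-Y triplets flanked by cysteines.
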
This apparent new gene has not yet been named.  Assuming that the 
segment is indeed a collagen gene with an initiating ATG at base 1507, 
it encodes a polypeptide of 342 amino acids.  This polypeptide 
comprises an N-terminal leader of 108 AA, an initial Gly-X-Y domain of 
30 AA, a gap of 15 AA, a second Gly-X-Y domain of 127 AA that includes 
three interruptions in the triplet repeating pattern, and a C-terminal 
tail of 62 AA.  Except for the greater length of the C-terminal tail, 
the lengths of these domains are common to the major size class of C. 
elegans collagens (Cox et al., 1989).  The positions of the cysteine 
residues flanking the Gly-X-Y repeats in this collagen are precisely 
the same as those in col-7, col-8, and col-19 ([...] et al., 
1989); several of the residues surrounding these conserved cysteines 
are also identical in these four collagens.  col-7, col-8, and col-19 
are expressed primarily in adults (Cox and Hirsh, 1985); the new 
collagen may be a member of this expression class as well.  col-7 is located within approximately 10 kb 
expression class as well.  col-7 is located within approximately 10 kb 
downstream of unc-15 (Kagawa et al., 1989); col-8 is on LG III (Cox et 
al., 1985).  The location of col-19 is not yet known.
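
The coordinates and domain lengths quoted above can be cross-checked with simple arithmetic.  The sketch below assumes that base positions are 1-based and inclusive and that the stop codon lies just beyond base 2532; neither convention is stated explicitly, and the variable and function names are hypothetical.

```python
# Figures quoted in the abstract; coordinates assumed 1-based and inclusive.
ORF_START, ORF_END = 1408, 2532
ATG_POS = 1507

# Domain lengths in amino acids, in N- to C-terminal order.
DOMAINS = {
    "N-terminal leader": 108,
    "first Gly-X-Y domain": 30,
    "gap": 15,
    "second Gly-X-Y domain": 127,
    "C-terminal tail": 62,
}

def codons_between(start: int, end: int) -> int:
    """Number of whole codons spanned by an inclusive base range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3

orf_codons = codons_between(ORF_START, ORF_END)       # codons in the full ORF
atg_in_frame = (ATG_POS - ORF_START) % 3 == 0         # is the ATG in the ORF frame?
codons_from_atg = codons_between(ATG_POS, ORF_END)    # codons from the ATG to the ORF end
total_residues = sum(DOMAINS.values())                # summed domain lengths

print(orf_codons, atg_in_frame, codons_from_atg, total_residues)
# prints: 375 True 342 342
```

Under these assumptions the ATG at base 1507 is in frame with the ORF, and both the codon count from that ATG and the summed domain lengths come to 342, matching the stated polypeptide length.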