Worm Breeder's Gazette 10(2): 73

These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.

Duplication and Gene Conversion of Collagen Genes in C. elegans

Yang-Seo Cho and James M. Kramer

Collagen genes in C. elegans represent a large multigene family 
with 50-150 members.  Sequence analysis indicates that the collagen 
genes of C. elegans share a common gene structure.  Despite this 
common structure, the members of the gene family exhibit a wide range 
of sequence similarity, from 50% to 99.5%.  Unlike most other large 
multigene families, most members of the collagen gene family are 
dispersed in the C. elegans genome.  Thus, the collagen genes of 
C. elegans are of interest for studying mechanisms involved in 
molecular evolution.
Genomic Southern hybridization experiments indicate that genes near 
each other have greater sequence similarity than do those more 
distant.  The sequences of two genes separated by 1.5 kb, col-12 and 
col-13, have been determined.  Only 5 nucleotide changes are observed 
in the 951 nucleotides of the coding region (99.5% identity).  Amino 
acid sequence comparisons reveal that the predicted proteins are 
identical except for 2 amino acids at the amino terminus.  The intron 
sequences (52 nucleotides) are identical; however, the 5' and 3' 
untranslated regions are strikingly different from each other (less 
than 50% identity).
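
The identity figure quoted above follows directly from the two counts given in the text (5 changes in 951 coding nucleotides). The short Python sketch below reproduces that arithmetic. The function names and the toy sequences are illustrative assumptions, not part of the original analysis, and the mismatch counter assumes two already-aligned sequences of equal length with no gaps.

```python
def count_mismatches(seq_a: str, seq_b: str) -> int:
    """Count positions that differ between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def percent_identity(aligned_length: int, mismatches: int) -> float:
    """Percentage of aligned positions that are identical."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= mismatches <= aligned_length:
        raise ValueError("mismatches must lie between 0 and aligned_length")
    return 100.0 * (aligned_length - mismatches) / aligned_length


# Counts reported in the abstract for the col-12 / col-13 coding regions.
coding_identity = percent_identity(aligned_length=951, mismatches=5)
print(f"coding region identity: {coding_identity:.1f}%")  # 99.5%

# Toy example with made-up sequences (NOT the real col-12 / col-13 data).
toy_a = "ATGGCTCCAGGA"
toy_b = "ATGACTCCAGGA"
toy_changes = count_mismatches(toy_a, toy_b)
print(toy_changes, f"{percent_identity(len(toy_a), toy_changes):.1f}%")
```

Working from counts rather than raw sequences keeps the check independent of how the alignment was produced, which the abstract does not describe.
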
To determine when these two genes are expressed, Northern blot 
hybridization experiments were done with col-12- and col-13-specific 
probes from the 3' untranslated regions.  The col-12-specific probe 
hybridizes to a 1.2 kb transcript and the col-13-specific probe 
hybridizes to a 1.3 kb transcript.  Both transcripts are detected at 
similar levels in L4 and adult molt RNAs, but are not found in embryo 
or dauer molt RNAs.  The size difference between the transcripts is 
likely due to a difference in the length of the 3' untranslated 
regions: a consensus poly(A) addition signal for col-12 is identified 
87 nucleotides downstream from the translational termination codon, 
but no consensus poly(A) addition signal for col-13 has been 
identified.  However, several possible poly(A) addition signals that 
differ by 2 nucleotides from the consensus signal are located 150-200 
nucleotides downstream from the termination codon in col-13.  The 
results show that both of these genes are expressed at the same 
developmental stages.  It is interesting that the 5' flanking regions 
of these genes are divergent, yet the genes display the same mode of 
regulation during development.
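
A search for consensus and near-consensus poly(A) addition signals of the kind described above might be sketched as follows. This is only an illustration: the abstract does not state which consensus was used or how the search was done, so the sketch assumes the canonical hexamer AATAAA (the DNA form of AAUAAA), and the function names and example sequence are hypothetical. Offsets are counted from the first nucleotide of the supplied 3' sequence, taken here to begin immediately after the termination codon.

```python
from typing import List, Tuple

# Assumed consensus: canonical poly(A) signal, DNA form. Not stated in the abstract.
POLYA_CONSENSUS = "AATAAA"


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def find_polya_signals(
    downstream_seq: str,
    consensus: str = POLYA_CONSENSUS,
    max_mismatches: int = 2,
) -> List[Tuple[int, str, int]]:
    """Return (offset, hexamer, mismatches) for each window within
    max_mismatches of the consensus. Offsets are 0-based from the start
    of downstream_seq."""
    seq = downstream_seq.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        distance = hamming(window, consensus)
        if distance <= max_mismatches:
            hits.append((i, window, distance))
    return hits


# Made-up 3' sequence for illustration only (NOT col-12 or col-13).
example = "GCCGCCGGCAATAAAGCGGCC"
for offset, hexamer, distance in find_polya_signals(example, max_mismatches=0):
    print(offset, hexamer, distance)
```

With max_mismatches set to 2, a six-nucleotide window matches fairly often by chance, so hits of this kind are candidates only, which is consistent with the abstract calling them "possible" signals.
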
These data suggest that col-12 and col-13 are derived from a gene 
duplication and that the coding regions have been conserved by gene 
conversion.  It is unusual that gene conversion has maintained 
sequence similarity only from the translational start codon to the 
termination codon.