Worm Breeder's Gazette 10(1): 24
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
We determined the DNA sequences of several collagen genes with different expression patterns during development and compared them to the previously sequenced genes col-1 and col-2. The genes chosen for study were col-1 and col-14, which are expressed at varying levels throughout development; col-2 and col-6, which are dauer-specific; and col-7, col-8 and col-19, which are expressed primarily in animals molting into adults. Each gene is 1.0 to 1.2 kb in length and includes one or two short introns at variable positions. The presumptive promoter regions contain the expected eukaryotic TATA and CAAT sequences. The sequence TAT CTTTCTCTY TTCTTYCT (Y=C or T) is present 30 bp and 74 bp upstream of the CAAT box in col-2 and col-6, respectively. The sequence AAATTT YAYCAATRT TTATT AATT is present 203 and 183 bp upstream of the presumptive CAAT boxes in col-7 and col-19 (R=A or G; the relevant region of col-8 was not sequenced). The correlation between the presence of these sequences and the similar expression profiles of the relevant genes suggests that these sequences may be involved in the developmental regulation of the genes. The domain structure of the collagen polypeptides is similar to that determined for col-1 and col-2: each polypeptide contains two main triple-helix-forming (Gly-X-Y)n domains, one of 30-33 amino acids and the other of 127-132 amino acids. The latter domain is interrupted by short (2-8 amino acids) non-Gly-X-Y segments in each polypeptide. Sets of cysteine residues flank the (Gly-X-Y)n domains in all of the polypeptides. The genes can be placed into three families based upon structural features (overall protein length and organization of (Gly-X-Y)n domains), positions of cysteine residues and amino acid sequence homologies. The amino acid sequence homologies are most evident in the non-Gly-X-Y domains. As an example, the C-terminal tail sequences are shown below.
Col-1 and col-2 comprise one family, col-6 and col-14 comprise a second family, and col-8 and col-19, with the less homologous col-7, comprise the third family. Members of a family can be coordinately regulated, as in the case of col-8 and col-19, or have different expression patterns, as in the cases of col-1 and col-2 or col-6 and col-14. The codon usage in all of the genes is highly asymmetrical, with adenine appearing in the third position of 85% of the Gly codons and 93% of the Pro codons.
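
A third-position bias of this kind can be tallied directly from an in-frame coding sequence. The following is a minimal sketch, not the procedure used for the col genes: the function name third_position_usage and the toy sequence are illustrative assumptions, and the only outside fact relied upon is that the standard genetic code assigns GGN to Gly and CCN to Pro.

```python
from collections import Counter

# Standard genetic code: glycine codons are GGN, proline codons are CCN.
GLY_PREFIX = "GG"
PRO_PREFIX = "CC"


def third_position_usage(coding_seq, prefix):
    """Return the fraction of each third-position base among in-frame
    codons whose first two bases equal `prefix`.

    `coding_seq` is assumed to start in frame and to contain no introns.
    """
    seq = coding_seq.upper().replace("U", "T")
    counts = Counter()
    # Step through the sequence one complete codon at a time.
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon.startswith(prefix):
            counts[codon[2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: n / total for base, n in counts.items()}


# Hypothetical toy sequence (alternating Gly and Pro codons);
# it is NOT taken from any col gene.
toy = "GGACCAGGACCTGGACCAGGTCCA"

gly_usage = third_position_usage(toy, GLY_PREFIX)
pro_usage = third_position_usage(toy, PRO_PREFIX)
print("Gly third-position usage:", gly_usage)
print("Pro third-position usage:", pro_usage)
```

Applied to the coding region of a real gene with introns removed, the value reported for "A" under each prefix would correspond to the kind of percentage quoted above.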