Worm Breeder's Gazette 13(5): 14 (February 1, 1995)

These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.

Six-Cysteine Motifs in Nematode Proteins.

Mark Blaxter

Wellcome Research Centre for Parasitic Infections, Dept. Biology, Imperial College, London SW7 2BB UK

email m.blaxter@ic.ac.uk

I am interested in the evolution of function in nematode
surface proteins and have been comparing genes isolated
from parasitic species with the C. elegans genome project
sequences. This has identified two cysteine-rich domains
which appear to be nematode specific, and associated with
surface-bound proteins. Each domain has 6 cys residues
with characteristic spacing, suggesting that they are
involved in three disulphide bridges. Such 6-cys domains
are common in vertebrate and invertebrate proteins (e.g.
EGF repeats, von Willebrand factor repeats, trefoil repeats)
but the nematode ones are distinct from these. The first
cysteine-rich domain, termed NC6#1 (Nematode Cys 6),
was found by David Gems in two surface coat proteins of Toxocara
canis, an ascarid parasite of dogs (D. Gems and R. Maizels).
It is 36 amino acids long and is found in two copies each in
the T. canis proteins. In C. elegans there are three genefinder-predicted
genes with such 36 amino acid NC6#1 domains: one is composed
of 5 such domains head to tail.
The second, NC6#2, is 48-53 aa long and has been identified
in three species:
1: Adults of Syngamus trachea, a strongylid parasite of
the airways of birds renowned for its huge amphidial
glands (~50% of an adult body length of 10 mm), express an
abundant 650 bp trans-spliced transcript encoding the
cuticular globin. While cloning this I isolated a trans-spliced
cDNA (St8.4) which has two NC6#2 repeats separated by 50
aa rich in G, S, Y and P (total 185 aa). Its function is unknown.
2: This gene (St8.4) identifies three ORFs in the C. elegans genome
sequence (chr III): B0280.5 (544 aa), R02F2.4 (458 aa) and
C07G2.1 (558 aa). Each gene has NC6#2 repeats: B0280 and
R02F2 have six each, separated by regions made up of 4 aa
(EGSG or ESAG) repeats. C07G2 has three NC6#2 domains separated
by P/T rich subrepetitive regions. None of these genes
appears to have been identified by mutation and their function
is unknown.
3: Using an alignment of the S. trachea and C. elegans sequences
I searched the databases and identified another nematode protein
with a single NC6#2 domain. Brugia malayi, the causative
agent of brugian filariasis (elephantiasis) in humans,
has ensheathed microfilariae (mf; L1 stage). The sheath
is derived from the eggshell and is retained by the mf in
the bloodstream: it is shed on uptake by the mosquito vector.
Juliet Fuhrman identified and cloned a mf surface protein
which by sequence and activity is a chitinase. It is activated
on uptake by the mosquito and may either effect escape from
the sheath or entry through the gut wall. The chitinase
domain of the protein is followed by a T/P rich region and
an NC6#2 domain. The T/P region is the site of O-linked glycosylation
in vivo. J. Fuhrman has identified another (insect) chitinase
with a related C6 domain (Manduca sexta, Swissprot:S64757,
pers. comm.). Neither the S. trachea nor the C. elegans genes
have extensive non-NC6#2 regions which could be enzymatic.
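A rough way to flag candidate six-cysteine domains in a predicted protein
is sketched below. It is only an illustration, not the method used for the
searches described above: the characteristic Cys spacing of the NC6 domains
is not given here, so the sketch assumes only that six consecutive Cys
residues must fit within the stated domain length (36 aa for NC6#1, up to
53 aa for NC6#2). The function name and the toy sequence are hypothetical.

```python
def six_cys_spans(seq, max_len, n_cys=6):
    """Find runs of n_cys consecutive Cys that fit within max_len residues.

    Returns a list of (start, end, gaps) tuples, where start/end are
    0-based half-open coordinates running from the first to the last Cys
    of the run, and gaps are the distances between successive Cys.
    Assumption: a domain of max_len residues cannot have its first and
    last Cys further apart than max_len.
    """
    seq = seq.upper()
    cys = [i for i, aa in enumerate(seq) if aa == "C"]
    hits = []
    for k in range(len(cys) - n_cys + 1):
        run = cys[k:k + n_cys]
        start, end = run[0], run[-1] + 1
        if end - start <= max_len:
            gaps = [b - a for a, b in zip(run, run[1:])]
            hits.append((start, end, gaps))
    return hits


# Toy sequence (invented for illustration, not a real protein):
# six closely spaced Cys, a long G/S-rich spacer, then a lone Cys.
toy = ("MK" + "C" + "A" * 5 + "C" + "A" * 4 + "C" + "A" * 6
       + "C" + "A" * 3 + "C" + "A" * 5 + "C" + "GS" * 30 + "C")

for start, end, gaps in six_cys_spans(toy, max_len=36):
    print(start, end, gaps)
```

Overlapping runs are all reported, so a real repeat array would give several
hits per domain; comparing the gap lists between hits is one way to see
whether the spacing is conserved.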
Speculations: Given the association of the chitinase
NC6#2 domain with glycosylation and the presence of Ser
and Thr residues in the non-NC6#2 regions of the other genes
it is tempting to speculate that these too are O-glycosylated
and that the NC6#2 domains are in some way involved in specifying
modification. However, of the NC6#2 genes, only the B.
malayi chitinase has a secretory leader peptide, suggesting
that the C. elegans and S. trachea proteins are not secreted
and thus unlikely to be glycosylated. One of the T. canis
genes carrying the NC6#1 domains is extensively O-glycosylated
and has a secretory leader (D. Gems and R. Maizels). The role
of the NC6#2 domains might lie in protein-protein interaction
by analogy with the other six-cysteine repeats such as
the EGF domain. In this model the inter-domain segments
may either be structural spacers or effector regions.