Spontaneous mutations provide a basis for natural variation and subsequent selection. While working with dominant gain-of-function unc-105 mutants, we recently isolated two strains with improved population growth rates. One strain looked reasonably wild-type and the other twitched. Based upon past work with spontaneous suppressors of unc-105, we suspected that these strains contained mutations in unc-105 and unc-22. Much to our surprise, initial cost estimates for sequencing these two genes were not much lower than for next-generation sequencing of the full genome; unc-22 is a large gene. Given that full genome sequencing would also identify mutations outside these two genes, we opted to test whether it could be used to identify our spontaneous mutations.
We used Illumina Solexa sequencing and aligned reads to the C. elegans WS200 genome. Based on our read depth analysis, we achieved about 60x, 40x, and 10x coverage for the parental unc-105 strain, CC24 (carrying allele xg1), and CC50 (carrying allele xg2), respectively. As the mutations arose in the parental strain, we subtracted the parental genomic sequence from each of the two derived strains in order to identify our mutations of interest. We began by identifying all the homozygous (>90% read support) single nucleotide variants (SNVs) and small insertions and/or deletions (indels) in each genome versus the C. elegans WS200 reference genome (>12,000 mutations in total). We next identified SNVs and indels common to all three genomes and removed them from consideration, as they represent mutations already present in the parental strain and therefore not our spontaneous mutations. This simple step left 59 unique SNVs and indels in the genome of CC24 and 25 unique SNVs and indels in the genome of CC50.
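The filtering and subtraction steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline actually used: the `Call` record, the function names, and the toy coordinates are all hypothetical, and the sketch assumes that each variant call carries a count of supporting reads and of total reads at that position. It also implements the subtraction as "remove from each derived strain anything seen in the parent", which is one reasonable reading of the procedure.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """One variant call versus the reference (hypothetical record layout)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # all reads covering the position


def homozygous_variants(calls, min_support=0.9):
    """Keep variants whose read support exceeds min_support (>90% in the text)."""
    return {
        (c.chrom, c.pos, c.ref, c.alt)
        for c in calls
        if c.total_reads > 0 and c.alt_reads / c.total_reads > min_support
    }


def subtract_parental(derived, parental):
    """Return variants present in the derived strain but absent from the parent."""
    return derived - parental


# Toy data with made-up coordinates, for illustration only.
parental_calls = [
    Call("I", 100, "A", "G", 58, 60),   # homozygous, shared background variant
    Call("II", 200, "C", "T", 30, 60),  # only 50% support, filtered out
]
derived_calls = [
    Call("I", 100, "A", "G", 39, 40),   # inherited from the parent
    Call("IV", 300, "G", "A", 40, 40),  # new in the derived strain
]

parental = homozygous_variants(parental_calls)
derived = homozygous_variants(derived_calls)
unique_to_derived = subtract_parental(derived, parental)
print(sorted(unique_to_derived))
```

With real data, the same set difference would be applied once per derived strain. Low coverage in any one genome can cause a shared variant to be missed in that genome, so the coverage of the parental strain matters most for a clean subtraction.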
Based on these observations and the dates upon which the spontaneous mutations arose, the mutational rate in the unc-105(n490) background is between 3.1 and 3.5 per month (or roughly 0.8-0.9 per generation), which is consistent with past estimates of the mutational rate in C. elegans (≈0.2-0.9 per generation). The number of SNVs and indels of each type is summarized in Table 1. While most of the SNVs are silent mutations (mutations in introns or intergenic regions), we observed one nonsense mutation and two frame shifting indels in CC24 and one nonsense mutation in CC50. By inspecting the genes that these mutations affect, we identified xg1 as a 4 bp insertion disrupting unc-105 at II:8118943 and xg2 as the nonsense mutation disrupting unc-22 at IV:11984751. Thus, whole genome sequencing confirmed our supposition that xg1 and xg2 were alleles of unc-105 and unc-22, respectively. We have demonstrated that, for spontaneously arising mutations that are easy to score, outcrosses are not needed; rather, spontaneous mutants of interest can be identified quickly and easily by whole genome sequencing of parental and laboratory evolved strains coupled with subtractive analysis. In the case of xg1, three candidate mutations were identified, whereas for xg2 only one was identified. The strain containing xg2 was sequenced much closer to the date on which the spontaneous mutation appeared (7 vs. 19 months). It is therefore likely that sequencing close to the point of isolation will allow our methods to be used in experimental evolution studies to identify a suppressor of interest even in the absence of a hypothesis as to the identity of the gene. This methodology should prove an important tool for prospective experimental evolution studies in a multicellular animal.
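The per-month rate follows from dividing the number of unique variants in each derived strain by the time between the appearance of the mutation and sequencing. The short Python sketch below uses only the counts and whole-month intervals quoted in the text (59 variants over 19 months for CC24, 25 over 7 months for CC50); the function name is hypothetical. Because the intervals are given only to the nearest month, the results should be expected to match the quoted range approximately rather than exactly. The conversion to a per-generation rate is left out, since it depends on a generation time that the text does not state.

```python
def mutations_per_month(unique_variants, months_elapsed):
    """Average number of new homozygous variants accumulated per month."""
    return unique_variants / months_elapsed


# Counts and intervals as quoted in the text.
cc24_rate = mutations_per_month(59, 19)
cc50_rate = mutations_per_month(25, 7)

print(f"CC24: {cc24_rate:.2f} per month")
print(f"CC50: {cc50_rate:.2f} per month")
```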
Table 1: Summary of SNVs and Indels in CC24 and CC50
| Strain | Synonymous | Silent | Missense | Nonsense | Frame preserving indel | Frame shifting indel |
Freya Shephard and Jeff Chu contributed equally to this work.