Worm Breeder's Gazette 11(1): 27
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
The information content of splice sites in a data set of 125 C. elegans introns has been analyzed using the method of Schneider et al. (1986). Regions from 20 bases upstream to 40 bases downstream of 5' splice sites, and from 40 bases upstream to 20 bases downstream of 3' splice sites, were analyzed. The distribution of information content as a function of position for the entire data set is very similar to that reported last year (WBG 10(3): 65-66) for a 71-intron data set. C. elegans differs from most higher eukaryotes studied thus far in having many (>50%) introns shorter than 75 bases (Blumenthal and Thomas, 1988; Hawkins, 1988). Figure 1 shows the information contents of splice sites from introns with lengths greater than (upper plot) or less than (lower plot) 75 bases. The information content at each position is expressed in bits (log base 2) and measures how far the observed base frequencies at that position depart from equal base frequencies; the slightly higher information content between the splice sites reflects the A+T preference of C. elegans introns. The information contents of the 3' splice sites of introns in the two size classes are very similar. The information contents of the 5' splice sites are, however, strikingly different. Positions 4, 5, and 6 in the 5' splice sites of long introns encode a total of approximately 1 bit more information than the corresponding positions in short introns. While the significance of this difference is not known, it suggests that the binding constant between U1 snRNP and the 5' splice site may vary significantly as a function of intron length in C. elegans. [See Figure 1]
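The per-position calculation described above can be illustrated with a minimal sketch. It assumes equal background base frequencies (a maximum of 2 bits per position) and computes 2 minus the Shannon entropy of the observed base frequencies. It omits the small-sample correction that Schneider et al. (1986) apply, so it is a simplification and not the exact procedure used for Figure 1. The function name `information_content` and the toy sequences are hypothetical and are not data from this analysis.

```python
import math
from collections import Counter


def information_content(aligned_sites):
    """Per-position information content, in bits, for aligned DNA sites.

    Assumes equal-length sequences over A, C, G, T and equal background
    base frequencies. No small-sample correction is applied (simplification).
    """
    n = len(aligned_sites)
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")

    bits = []
    for pos in range(length):
        # Count how often each base is observed at this position.
        counts = Counter(site[pos] for site in aligned_sites)
        # Shannon entropy of the observed base frequencies.
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        # Information is the drop from the 2-bit maximum for four bases.
        bits.append(2.0 - entropy)
    return bits


# Hypothetical toy alignment of four 5' splice site fragments.
toy_sites = ["GTAAGT", "GTGAGT", "GTAAGA", "GTATGT"]
for position, value in enumerate(information_content(toy_sites), start=1):
    print(position, round(value, 3))
```

Summing the values over a set of positions (for example positions 4, 5, and 6 of the 5' splice site) gives the total information for that region, which is the quantity compared between long and short introns in the text.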