Worm Breeder's Gazette 11(1): 13
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
A prototype automated DNA sequence analysis system, gm2, is available to laboratories interested in serving as external test sites. gm2 consists of a set of pattern recognition and statistical analysis modules, together with a geometric modeling system. It accepts as input a DNA sequence; consensus matrices for locating splice sites, translational start sites, and polyadenylation sites; match-quality cutoff values for consensus searches; and base frequency and codon usage standards for coding regions and introns. It produces as output schematic models of the possible genes contained in the sequence that show the locations of the coding sequences, introns, and control signals; it also produces translations of each of the gene models into amino acid sequences. The current version of gm2 generates all possible models of the gene content of a sequence that are consistent with the input parameters. It is capable of analyzing sequences containing partial genes or multiple genes as well as sequences containing a single complete gene.
gm2 is implemented entirely in C. It employs a simple, prompt-driven user interface; it can also accept input from a file. It prints output to ASCII files. gm2 can be run from a conventional text-only terminal. The system has been tested on Sun 3 and Sun 4 workstations. It will run on a Sun 3/50 with 4 Mb of memory; a larger memory improves performance.
The system has been tested on a number of C. elegans sequences in the 10 kb size range, and on composite sequences of up to 40 kb. Complete and correct models of multi-exon genes, e.g. myo-2 and unc-15, can be generated on a Sun 3 in run times ranging from less than 1 minute to roughly 30 minutes, depending on the search parameters used. Runs on the Sun 4 are approximately four times faster.
gm2 is available to university or other nonprofit laboratories, under the condition that they do not redistribute the software.
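
As a rough illustration of a consensus search with a match-quality cutoff, of the kind listed among the gm2 inputs above, the following Python sketch scores every window of a sequence against a small consensus matrix and keeps only the windows that reach the cutoff. It is not gm2's C code and does not claim to reproduce its method: the log-odds scoring scheme, the uniform background frequencies, the matrix values, and the names CONSENSUS, MATRIX, BACKGROUND, log_odds_score and find_sites are all assumptions made for this example.

```python
import math

# Hypothetical 6-position matrix built around the string "GTAAGT".
# The frequencies (0.85 for the favoured base, 0.05 for the others) are
# illustrative only; they are not taken from gm2 or from C. elegans data.
CONSENSUS = "GTAAGT"
MATRIX = [{b: (0.85 if b == c else 0.05) for b in "ACGT"} for c in CONSENSUS]

# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_score(window, matrix, background):
    """Sum of log2(matrix frequency / background frequency) over the window."""
    return sum(math.log2(col[base] / background[base])
               for col, base in zip(matrix, window))


def find_sites(seq, matrix, background, cutoff):
    """Return (position, window, score) for every window scoring >= cutoff."""
    seq = seq.upper()
    width = len(matrix)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        # Skip windows containing ambiguous or unknown bases such as N.
        if any(base not in background for base in window):
            continue
        score = log_odds_score(window, matrix, background)
        if score >= cutoff:
            hits.append((i, window, score))
    return hits


if __name__ == "__main__":
    for pos, window, score in find_sites("CCAGGTAAGTTTCC", MATRIX, BACKGROUND, 8.0):
        print(pos, window, round(score, 2))
```

Raising the cutoff admits fewer candidate sites and lowering it admits more, which is one way that search parameters can change the number of gene models to be considered and hence the run time.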
Users of the software will be asked to supply us with results of running gm2, descriptions of problems that are encountered, and suggestions for improvements. Source code and documentation for gm2, and for
The CGC produces several different kinds of reference material for C. elegans researchers in addition to providing nematode strains. The following list describes the various items, the formats in which they are available, and the date of the last version. Text files on computer diskettes are organized very simply and can easily be used with dBase and word processor programs on a variety of microcomputers. The information in the computer files is updated weekly or monthly. Paper lists typically order information in a way that reduces the need to have it on a computer, and they are updated annually or biannually.
All items are available on request. Letters on departmental letterhead should be addressed to Mark Edgley at the CGC (see address in the subscriber list at the back of this issue). Requests for computer text files must be accompanied by appropriate blank diskettes and information about the system and programs with which the data will be used (call Mark to find out the current size of each file). All disk files come with a description of data organization and some brief instructions for use. Paper lists may temporarily be unavailable if we have run out of copies and an update is in process.
Strain List: All strains available from the CGC, giving strain name and genotype. The paper version is automatically sent to every laboratory with CGC strain and allele designations. It lists strains in order by genotype; the disk version lists them in order by strain name. Last paper version: March, 1988. Updates appear regularly in the WBG.
Bibliography: All articles and book chapters on C. elegans and C. briggsae from 1866 through the present.
The paper version (also automatically sent to all CGC labs) comes in two parts. The first covers 1866 through 1985 and the second covers everything since 1985. The first part is not updated, but the smaller second part is updated biannually. When the second part is as large as the first, a single list will again be generated. Each part is composed of three sections: (1) the complete list in order by first author; (2) an abbreviated list in order by CGC key number; and (3) articles grouped by keyword. The disk version contains articles in order by key number, first author or journal (specify when you ask for it; the default is key number order). Last paper version: March, 1988. Updates appear regularly in the WBG.
Map Data: All genetic mapping crosses considered in generating the C. elegans genetic map. The paper version is now only available as a special request item to laboratories doing genetic mapping, since it is too expensive to produce and mail routinely to a large number of laboratories (see the blurb in the Announcements section of this Gazette). The printout is in three sections: (1) two-factor distance data; (2) deficiency/duplication complementation data; and (3) multi-factor ordering data. In each section, the entries are ordered by gene or rearrangement name. The disk version contains entries in order by cross number. Last paper version: June, 1988 update. The disk files are updated during each map revision and are available shortly after the revision is published.
Map Drawing: The computer drawing files for all genetic map sections are available for use on your own system. The drawing is produced using the program 'Designer' (Micrografx, Inc., Richardson, Texas), which runs under Microsoft Windows on IBM-compatible microcomputers, with the sections formatted for printing on an Apple LaserWriter Plus (other printers may not have available the line widths and fonts we use).
You have to supply your own copy of Designer or another program that can read its drawing files. Conversion programs are available from Micrografx to make the drawings usable in Autocad, PageMaker, Harvard Graphics, Ventura Publisher, Freelance, Draw Plus, Graph Plus, WordPerfect and PC Paintbrush. These conversions are not perfect; some print attributes and image definition may be lost in translation and some programs do not allow editing. Generally, the more sophisticated the program, the better the quality of the converted image. The people at Micrografx are working on a program to convert drawings to Macintosh formats, but it is not yet available. We have used Macintosh Freehand to open and print chromosome sections, but were not able to use it for editing. Last version: May, 1989, except for the left end of LG III, which is included with this Gazette.
WBG Subscribers: The complete list of subscribers with addresses, phone numbers, FAX numbers and BITNET addresses is printed in the first issue of each volume of the Gazette, and updates to the list appear in each subsequent issue. The list is available as a computer disk file with the entries in order by last name.
WBG Tables of Contents: The Tables of Contents of most WBG issues (back to the first one) are available on diskette as rather crude and, in places, incomplete text files. They include titles, authors, volume and issue numbers, and page numbers.
Films: The CGC owns two short 16mm films on C. elegans that are available for loan. The first is the Encyclopaedia Britannica film 'Nematode', an 11-minute introduction to worm behavior and mutants using dictionary entries, music and toys for illustration. The second is 'Embryonic Development of the Nematode Caenorhabditis elegans' (Institut für den Wissenschaftlichen Film), also about 11 minutes long.
It is narrated time-lapse Nomarski photography of a developing embryo from fertilization through hatching, with a computer reconstruction of the embryo that rotates about its longitudinal axis to show relative positions of the nuclei. Requests should be made well in advance of the date you want the films (one month is good), and it's a good idea to call first to make sure they are not already out on loan.