Worm Breeder's Gazette 11(5): 12
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
We have been writing a computer database system for worm genomic information as part of the St Louis/Cambridge genome project. The idea is to have a flexible, mouse-driven graphical system that can handle the sequence data and the genetic and physical maps, together with as much related material as is easily possible (e.g. the bibliography and the gene list). For instance, a user can display in separate windows pieces of genetic or physical map, information about a gene, allele or clone, or all the items connected with an author or a paper. The database structure is designed to enable extensive annotation (both structured, with keywords, and unstructured, with arbitrary comments) and internal cross-referencing. Currently the program is called 'acedb' (pronounced ace-dee-bee, and standing for 'A C. elegans database').
The main purpose of the system is to provide a standardized, integrated environment for those assembling the genomic information. However, we are also putting considerable effort into making it convenient to query and extract information from, and will be happy to make read-only versions of the system available in the future to those in the worm community who want them, in the same way that the current physical map program is available. The database will then have to be updated at regular intervals, via email update packages. Once we have a stable version we will also make source code available to anyone interested.
We see our program as complementary to the Worm Community System (WCS) of Schatz et al. (WBG 11 n.3, p 6). The 'canonical' versions of the physical and sequence databases will be assembled and managed in acedb. They will be made directly available to the WCS project, so that if you use WCS you will be able to see the same genomic information. Schatz et al. also plan to share community knowledge via annotations, and provide literature via abstracts and page images.
The data we currently have available comprise the CGC genetic data and bibliography, the gene list, the physical map from Cambridge, and all worm sequences currently in EMBL. We will of course also have all the sequence generated by the St Louis/Cambridge project (this being one of the original reasons for the database). As well as displaying these data, we are working on integrating functions to perform genetic map and sequence calculations. We are able to output information in text or PostScript form (for laser printing). In addition, we plan to provide compatibility with the ASN.1 format proposed by the NCBI at the National Library of Medicine for genome database information exchange.
We have written acedb from scratch in plain C, rather than using a preexisting database management system like Sybase or Oracle. We made this decision because these relational systems are rather rigid about pre-specifying data structure, are not optimal for long linear data such as sequence or genomic maps, and cannot be modified or distributed freely. Acedb runs on Unix workstations under the SunView and (imminently) X windowing systems. With some setting-up effort, Macs and PCs can be used as X terminals if connected by Ethernet to a Unix system running acedb. We expect to have a version available for distribution before the next worm meeting.
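As an illustration only, the combination of keyword-structured annotation, free comments and internal cross-referencing described above might be pictured along the following lines. This is a minimal sketch in Python, chosen for brevity, and not acedb's actual C data structures or file format. The names Record, Store, link and related, the keyword "Map_position", and all the example objects ("gene-x", "clone-1", "Author A", "paper-1") are hypothetical and invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One database object, e.g. a gene, clone, author or paper (hypothetical layout)."""
    kind: str
    name: str
    tags: dict = field(default_factory=dict)      # structured: keyword -> list of values
    comments: list = field(default_factory=list)  # unstructured: arbitrary free text
    links: set = field(default_factory=set)       # (kind, name) keys of related records


class Store:
    """Holds records by (kind, name) and keeps cross-references two-way."""

    def __init__(self):
        self.records = {}

    def get(self, kind, name):
        # Create the record on first mention, so nothing has to be declared in advance.
        key = (kind, name)
        if key not in self.records:
            self.records[key] = Record(kind, name)
        return self.records[key]

    def link(self, a, b):
        # Store the reference on both sides so it can be followed from either object.
        a.links.add((b.kind, b.name))
        b.links.add((a.kind, a.name))

    def related(self, record, kind):
        # All records of the given kind that are cross-referenced from this one.
        return sorted(name for k, name in record.links if k == kind)


# Made-up example data, for illustration only.
store = Store()
gene = store.get("Gene", "gene-x")
clone = store.get("Clone", "clone-1")
paper = store.get("Paper", "paper-1")
author = store.get("Author", "Author A")

gene.tags.setdefault("Map_position", []).append("hypothetical value")
gene.comments.append("Any free-text remark can be attached here.")

store.link(gene, clone)
store.link(gene, paper)
store.link(paper, author)

print(store.related(gene, "Clone"))    # clones connected with the gene
print(store.related(author, "Paper"))  # papers connected with the author
```

In a layout of this kind a new keyword can be attached to any object without declaring it beforehand, which is the sort of flexibility that a rigidly pre-specified relational schema makes harder. Long linear data such as sequence would need separate treatment that this sketch does not attempt.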