The Genome Sequence of Methanobacterium thermoautotrophicum*

Douglas R. Smith, Lynn Doucette-Stamm, Hong Mei Lee, Joanne Dubois, Craig Deloughery, Tyler Aldredge, Romina Bashirzadeh, Derron Blakely, Wendy Caubet, Maria Chung, Katie Gilbert, Chenghua Ma, Pamela Parenteau, Rupal Patel, Dayong Qui, Skip Shimer, Xing Wang, Jamey Wierzbowski, Jork Nolling[1] and John Reeve[1]

Genome Therapeutics Corp., 100 Beaver St., Waltham, MA 02154

The goal of this project is to sequence the genomes of microbes that may be useful for energy production and bioremediation of toxic wastes. In related projects we are sequencing the genomes of bacterial pathogens and regions of human chromosome 10. The sequencing is being done by computer-assisted[2] multiplex sequencing techniques, which are under active development through an NIH-funded Genome Science and Technology Center.

Our initial focus has been the genome of the archaeon Methanobacterium thermoautotrophicum (1.7 Mb), which is ubiquitous in anaerobic environments and is potentially useful for the production of methane from biowastes. The organism can be readily grown and manipulated under laboratory conditions. The sequencing was done by a whole-genome shotgun approach with 2 kb plasmid subclones. Pools of templates were sequenced by chemical and enzymatic cycle-sequencing methods and run on direct transfer electrophoresis gels. A set of 67 nylon membranes containing reactions from 30,720 templates (1536 pools of 20 clones) was sequentially probed to generate films that were then scanned into a computer system. Approximately 13 Mb of raw data were generated (7.5 genome equivalents). The data were assembled into contigs using the program PHRAP, and primers were automatically selected from the ends of the contigs using the program AUTOPRIMER. Dye-terminator finishing reactions were then performed and the samples run on ABI 377 machines. The data from the finishing reactions were then reassembled together with the original data into approximately half the original number of contigs. This process was then repeated to further reduce the number of contigs.
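The figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the project's software: the constant and function names are illustrative, 1 Mb is assumed to mean 1,000,000 bases, and the per-template average is a derived estimate rather than a number reported in the abstract.

```python
# Figures quoted in the abstract (rounded values as stated in the text).
GENOME_SIZE_MB = 1.7      # M. thermoautotrophicum genome size
RAW_DATA_MB = 13.0        # approximate raw sequence generated
POOLS = 1536              # template pools
CLONES_PER_POOL = 20      # clones per pool


def template_count(pools: int, clones_per_pool: int) -> int:
    """Total number of templates across all pools."""
    return pools * clones_per_pool


def fold_coverage(raw_mb: float, genome_mb: float) -> float:
    """Raw data expressed in genome equivalents."""
    return raw_mb / genome_mb


def mean_raw_bases_per_template(raw_mb: float, templates: int) -> float:
    """Average raw bases per template, assuming 1 Mb = 1,000,000 bases."""
    return raw_mb * 1_000_000 / templates


templates = template_count(POOLS, CLONES_PER_POOL)
coverage = fold_coverage(RAW_DATA_MB, GENOME_SIZE_MB)
bases_per_template = mean_raw_bases_per_template(RAW_DATA_MB, templates)

print(f"templates: {templates}")
print(f"genome equivalents: {coverage:.2f}")
print(f"mean raw bases per template: {bases_per_template:.0f}")
```

With the rounded inputs, the template count matches the 30,720 stated in the text, and the coverage comes out slightly above the quoted 7.5 genome equivalents, which is consistent with both the genome size and the raw data total being approximate.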

A preliminary analysis for database homologies was performed using the program BLAST on individual sequence reads. In-depth analysis will commence once sufficiently large contigs have been generated and have been proofread to reduce the occurrence of indels. The sequences will be made available with full annotation for gene locations as soon as possible after completion. The group at Ohio State University has provided starting DNA and cosmids, and is assisting in the analysis of the data.

[1] The Ohio State University, 484 W. 12th Ave., Columbus, OH 43210

[2] Automated image analysis and sequence reading for multiplex sequencing are currently done using REPLICA[TM], a program developed by Mintz and Church at HHMI at Harvard Medical School.

*Supported under the Microbial Genome Program by Grant No. DE-FC02-95ER61967 from the Office of Health and Environmental Research of the U.S. Department of Energy.


Abstracts scanned from text submitted for January 1996 DOE Human Genome Program Contractor-Grantee Workshop.
