Introduction to the Workshop
URLs Provided by Attendees
- Ethical, Legal, and Social Issues
The electronic form of this document may be cited in the following style:
Human Genome Program, U.S. Department of Energy, DOE Human Genome Program Contractor-Grantee Workshop IV, 1994.
Abstracts scanned from text submitted for November 1994 DOE Human Genome Program Contractor-Grantee Workshop. Inaccuracies have not been corrected.
Combinatorial algorithms for genome sequence analysis
John D. Kececioglu
Department of Computer Science, University of California, Davis, CA 95616. firstname.lastname@example.org
Our research has focused on algorithms for genome rearrangements, multiple sequence alignment, and DNA sequence assembly. In genome rearrangements we have developed new algorithms for comparing genomes in terms of inversions  and translocations , and for reconstructing an evolutionary history for a set of sequences that have evolved by point mutation and recombination . Given the order and orientation of genes on a chromosome from two related organisms, we can efficiently compute upper and lower bounds on the minimum number of inversions to transform one gene order into the other, and determine the minimum by branch-and-bound for problems involving up to 250 genes. Given gene orders from genomes of two multichromosomal organisms, we can efficiently find an evolutionary history that has a near-minimum number of inversions and translocations; when gene orientation is known, the history is guaranteed to be within a factor of 3/2 of optimal; when orientation is unknown, the history is within a factor of 1/2 of optimal. Given a set of sequences that have evolved by point mutation and recombination, we can efficiently identify within the set a pair of protosequences and a history of recombinations that produces the remaining sequences from the protopair such that the maximum cost of any recombination in the history is minimized; when the set has evolved from more than two protosequences, we can efficiently identify a protoset and a recombination history such that the size of the complement of the protoset is within a factor 1/2 of optimal.
We are also completing two large implementation projects: a branch-and-bound algorithm for maximum-trace multiple sequence alignment, and a three-phase algorithm for DNA sequence assembly. Maximum trace alignment  finds a multiple alignment that is in maximum agreement with a set of candidate pairwise alignments described by dot plots; our algorithm branches on a small set of alignment columns obtained from minimum cuts of a graph, and prunes suboptimal alignments with bounds from a new heuristic. Our three-phase algorithm for sequence assembly uses a decomposition similar to Kececioglu and Myers , but solves the fragment orientation and layout problems simultaneously using a relaxation to nonbipartite matchings.
This work was supported by a DOE Human Genome Postdoctoral Fellowship.
 Kececioglu, John and David Sankoff. "Efficient bounds for oriented chromosome-inversion distance." In Proceedings of the 5th Symposium on Combinatorial Pattern Matching, Asilomar, California, Springer-Verlag Lecture Notes in Computer Science, Volume 807, 307-325, June 1994.
 Kececioglu, John D. and R. Ravi. "Of mice and men: Algorithms for evolutionary distances between genomes with translocation and inversion." Submitted to the 6th ACM-SIAM Symposium on Discrete Algorithms, July 1994.
 Kececioglu, John and Dan Gusfield. "Reconstructing a history of recombinations from a set of sequences." In Proceedings of the 5th ACM-SIAM Symposium on Discrete Algorithms, Arlington, Virginia, 471-480, January 1994.
 Kececioglu, John. "The maximum trace problem in multiple sequence alignment." In Proceedings of the 4th Symposium on Combinatorial Pattern Matching, Padova, Italy, Springer-Verlag Lecture Notes in Computer Science, Volume 684, 106-119, June 1993.
 Kececioglu, John D. and Eugene W. Myers. "Combinatorial algorithms for DNA sequence assembly." To appear in Algorithmica, 1993.