Combinatorial Aspects of Multiple-Complete-Digest Restriction Mapping

Richard M. Karp and Geoffrey Zweig[1]

Department of Computer Science and Engineering, University of Washington , International Computer Science Institute, Berkeley, CA[1]

Restriction mapping is the process of determining the restriction sites on a target DNA molecule associated with one or more restriction enzymes. Maynard Olson is leading a project at the University of Washington in which restriction maps for two or more restriction enzymes are constructed by the following process:

  1. Construct a clone library giving 15-20X coverage of the target molecule.

  2. Completely digest each clone with each restriction enzyme and measure the fragment sizes using gel electrophoresis.

  3. From this data, reconstruct the overlap structure of the clones and the positions of the restriction sites for each restriction enzyme.

Our research is concerned with two computational aspects of this multiplecomplete-digest restriction mapping problem: determining clone overlaps and fragment identification.

Detecting Clone Overlaps Given a large clone library, the goal is to determine those pairs of clones that are highly likely to overlap. We would like to discover these likely overlaps without explicitly comparing each pair of clones, since such a comparison process would be extremely time-consuming when the number of clones is very large. We have devised a randomized algorithm which finds the highly overlapping pairs without resorting to exhaustive comparisons.[2]

Fragment Identification Once the overlaps among the clones have been constructed various greedy algorithms can be used to determine the ordering of the clones along the DNA. More difficult, however, is the construction of the restriction map given the clone ordering. The central problem is fragment identification - partitioning the fragment occurrences on the individual clones into "equivalence classes," each of which corresponds to a single physical restriction fragment. We have characterized the conditions under which a partitioning of the fragment occurrences is realizable as a physical map. We are currently implementing fragment identification algorithms based on this characterization.

[1] Research supported by DOE Grant 03-94ER61913.000

[2] R.M. Karp, O. Waarts and G. Zweig, Proc. IEEE Symp. on Foundations of Computer Science 621-630 (1995)


Abstracts scanned from text submitted for January 1996 DOE Human Genome Program Contractor-Grantee Workshop.

Return to Table of Contents