The Chromosome 16 Physical Map: Ordering Quality Statistics

E. Knill

Computer Research and Applications, CIC-3, MS B265, Los Alamos National Laboratory, Los Alamos, New Mexico 87545; Theoretical Biology and Biophysics Group, T10, MS K710, Los Alamos National Laboratory, Los Alamos, New Mexico 87545.

An important component of recently constructed physical maps is the ordering of STSs and other types of markers. The ordering is based on results from screening the markers against libraries of clones such as YACs and MegaYACs, and on localization into "bins" with tools such as somatic cell hybrid panels. The ordering of the markers usually cannot be determined uniquely from these experiments. There are two main sources of ordering ambiguity. The first is incomplete data. For example, markers with the same screening results cannot be distinguished. This type of ambiguity is intrinsic to the ordering problem and can be described by a data structure called the PQ-tree.[3] The second source is errors in the data, chimerism, and hits attributable to repeated sequences. We refer to these collectively (and somewhat misleadingly) as errors. The purpose of this work is to give information about the second source of ambiguity.
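The first kind of ambiguity, markers with identical screening results, can be illustrated with a minimal sketch. The marker and clone names and the function `indistinguishable_groups` are hypothetical and are not taken from the chromosome 16 data; the sketch only assumes that each marker's screening result is the set of clones it hits.

```python
from collections import defaultdict
from typing import Dict, List, Set


def indistinguishable_groups(hits: Dict[str, Set[str]]) -> List[List[str]]:
    """Group markers whose sets of positive clones are identical.

    Markers in the same group cannot be ordered relative to each other
    from the screening data alone.
    """
    groups = defaultdict(list)
    for marker, clones in hits.items():
        # A frozenset is hashable, so it can serve as the grouping key.
        groups[frozenset(clones)].append(marker)
    # Keep only groups with at least two markers.
    return [sorted(g) for g in groups.values() if len(g) > 1]


# Hypothetical screening results: marker -> clones that it hits.
example_hits = {
    "sts1": {"yac1", "yac2"},
    "sts2": {"yac1", "yac2"},
    "sts3": {"yac2"},
}
groups = indistinguishable_groups(example_hits)
```

Here `sts1` and `sts2` fall into one group because they hit exactly the same clones, while `sts3` remains distinguishable.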

This work gives an overview and explanation of some simple ordering quality statistics for the ordering of markers in the recently completed chromosome 16 physical map.[2] These statistics include the change in the number of "gaps" when adjacent sets of markers are interchanged, and the number of times adjacent or nearby markers are linked by MegaYACs. The chromosome 16 physical map includes position information, which we used to compute data on multiply linked pairs of markers. The position information was inferred primarily by the use of SEGMAP.[4]
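The two kinds of statistics named above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used for the map: the abstract does not define a "gap", so the sketch assumes that a clone contributes one gap for every break between runs of consecutive markers it hits in the given order. All names and data here (`count_gaps`, `gap_change_after_swap`, `link_count`, the markers and clones) are hypothetical.

```python
from typing import Dict, List, Set


def count_gaps(order: List[str], clones: Dict[str, Set[str]]) -> int:
    """Total gaps over all clones (ASSUMED definition, see lead-in):
    for each clone, the number of maximal runs of hit markers minus one."""
    total = 0
    for hits in clones.values():
        runs = 0
        prev_hit = False
        for marker in order:
            hit = marker in hits
            if hit and not prev_hit:
                runs += 1  # a new run of consecutive hits starts here
            prev_hit = hit
        total += max(runs - 1, 0)
    return total


def gap_change_after_swap(order: List[str], clones: Dict[str, Set[str]],
                          i: int, j: int, k: int) -> int:
    """Change in total gaps when the adjacent blocks order[i:j] and
    order[j:k] are interchanged."""
    swapped = order[:i] + order[j:k] + order[i:j] + order[k:]
    return count_gaps(swapped, clones) - count_gaps(order, clones)


def link_count(a: str, b: str, clones: Dict[str, Set[str]]) -> int:
    """Number of clones that hit both marker a and marker b."""
    return sum(1 for hits in clones.values() if a in hits and b in hits)


# Hypothetical data: clone -> markers it hits.
example_order = ["m1", "m2", "m3", "m4"]
example_clones = {
    "yA": {"m1", "m2"},
    "yB": {"m2", "m3"},
    "yC": {"m3", "m4"},
    "yD": {"m1", "m3"},
}
base_gaps = count_gaps(example_order, example_clones)
delta = gap_change_after_swap(example_order, example_clones, 1, 2, 3)
links = link_count("m2", "m3", example_clones)
```

Under the assumed definition, a positive `delta` means the interchange makes the data fit the ordering worse, and a value of zero or below marks a place where the data do not favor the current order over its neighbor.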

Because there is no general agreement on, or understanding of, how best to describe local ordering reliability for physical maps, we do not attempt to interpret the data formally at this time. Instead, we simply define the statistics that are presented, describe how they were obtained, and tabulate them. By doing so, we hope to stimulate further discussion of the issues involved and to enable future analyses of reliability information.

*This work was performed under the auspices of the U.S. Department of Energy under Contract No. W-7405-ENG-36.

[1] E. Knill, The Chromosome 16 Physical Map: Ordering Quality Statistics, Los Alamos National Laboratory Report LAUR-95-2924.

[2] N. A. Doggett et al., An Integrated Map of Human Chromosome 16, to appear in Nature (1995).

[3] K. Booth and G. S. Lueker, Journal of Computer and System Sciences, 13:335-379, 1976.

[4] E. D. Green and P. Green. PCR Meth. Appl., 1:77-90, 1991.


Abstracts scanned from text submitted for January 1996 DOE Human Genome Program Contractor-Grantee Workshop.
