The electronic form of this document may be cited in the following style:
Human Genome Program, U.S. Department of Energy, DOE Human Genome Program Contractor-Grantee Workshop IV, 1994.
Abstracts scanned from text submitted for November 1994 DOE Human Genome Program Contractor-Grantee Workshop. Inaccuracies have not been corrected.
Bayes Decoding of Pooling Experiments Using Monte Carlo Methods
W.J. Bruno, E. Knill[1,2], A. Schliep, and D.C. Torney
[1] Theoretical Biology and Biophysics Group; T10, MS K710; Los Alamos National Laboratory; Los Alamos, New Mexico 87545. [2] Computer Research and Applications; CIC3, MS B265; Los Alamos National Laboratory; Los Alamos, New Mexico 87545.
One of the most important experimental procedures in physical mapping is screening a clone library for the clones that are positive for a given probe. It is usually not feasible to screen each clone individually. To reduce the number of screening tests, one can form pools from sets of clones and attempt to determine the positive clones by screening the pools. This is the procedure used by most laboratories.
Pooling strategies are designed so that the positive clones can be determined from the screening results for the pools, provided that the number of positive clones is small enough and the pool screenings are free of errors. However, we have found that in practice most pool results are ambiguous, because there are more positive clones than the design assumes and because screenings contain errors. Additional screenings of individual clones are then required to find the positive clones, and minimizing the number of these additional screenings requires interpreting the pool results. For simple pooling strategies such as the row-and-column strategies, the pool results can be interpreted by hand with reasonable success. These strategies are, however, far from optimal in the number of pools they use.
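The ambiguity described above can be illustrated with a minimal Python sketch of a row-and-column design. The 8 x 12 grid, the function names, and the assumption of error-free screening are illustrative choices and are not taken from the abstract.

```python
def row_column_pools(n_rows, n_cols):
    """Return pools as a dict: pool name -> set of (row, col) clone addresses."""
    pools = {}
    for r in range(n_rows):
        pools[f"row{r}"] = {(r, c) for c in range(n_cols)}
    for c in range(n_cols):
        pools[f"col{c}"] = {(r, c) for r in range(n_rows)}
    return pools


def screen(pools, positives):
    """Error-free screening: a pool is positive if it holds any positive clone."""
    return {name for name, members in pools.items() if members & positives}


def candidates(pools, positive_pools):
    """Naive decoding: keep the clones whose pools all tested positive."""
    clones = set().union(*pools.values())
    return {
        clone
        for clone in clones
        if all(
            name in positive_pools
            for name, members in pools.items()
            if clone in members
        )
    }


# Hypothetical 8 x 12 grid of clones: 96 clones covered by 20 pools.
pools = row_column_pools(8, 12)

# One positive clone is decoded unambiguously.
one = candidates(pools, screen(pools, {(2, 5)}))

# Two positive clones in different rows and columns leave four candidates,
# so individual screenings are needed to resolve them.
two = candidates(pools, screen(pools, {(2, 5), (6, 9)}))
```

With a single positive clone the candidate set contains only that clone. With two positives the candidate set also contains the two "ghost" clones at the other row and column intersections, and screening errors would blur the picture further.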
The ideal method for interpreting pool results involves computing, for each set of clones, the posterior probability that it is the set of positive clones. We can then determine each clone's posterior probability of being positive. This can be achieved for screening clone libraries because there is a reasonable model for the distribution of sets of positive clones and of experimental errors. The main difficulty in implementing this method is that exact computation of the posterior probabilities is not feasible. However, one can use one of several Monte Carlo methods to estimate these probabilities. The two most promising are the Gibbs and the Metropolis-Hastings methods. In these methods, one establishes a Markov process whose states are possible sets of positive clones and whose transition probabilities are determined by the relative posterior probabilities of "adjacent" states. Two states are adjacent if one is obtained from the other by removing or adding a single clone. The posterior probability of a clone's being positive is estimated by simulating the Markov process and determining the fraction of the time that the clone is a member of the current state.
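The following Python sketch shows one way such a sampler could look. It is a plain Metropolis scheme with single-clone flips, not the authors' implementation. The model is an assumption made for illustration: each clone is positive independently with probability `prior`, and each pool result is subject to hypothetical `false_pos` and `false_neg` error rates. The abstract does not specify its model.

```python
import math
import random


def log_posterior(state, n_clones, pools, observed, prior, false_pos, false_neg):
    """Unnormalized log posterior of `state`, a set of positive clone ids.

    Assumed model: independent prior per clone, independent errors per pool.
    """
    k = len(state)
    logp = k * math.log(prior) + (n_clones - k) * math.log(1 - prior)
    for members, result in zip(pools, observed):
        truly_positive = not state.isdisjoint(members)
        p_obs_positive = 1 - false_neg if truly_positive else false_pos
        logp += math.log(p_obs_positive if result else 1 - p_obs_positive)
    return logp


def metropolis_marginals(n_clones, pools, observed, prior=0.02, false_pos=0.05,
                         false_neg=0.1, steps=20000, burn_in=2000, seed=0):
    """Estimate each clone's posterior probability of being positive."""
    rng = random.Random(seed)
    args = (n_clones, pools, observed, prior, false_pos, false_neg)
    state = set()
    current = log_posterior(state, *args)
    counts = [0] * n_clones
    for step in range(steps):
        # Adjacent state: add or remove one clone chosen uniformly at random.
        proposal = state ^ {rng.randrange(n_clones)}
        proposed = log_posterior(proposal, *args)
        # Symmetric proposal, so accept with probability min(1, ratio).
        if proposed >= current or rng.random() < math.exp(proposed - current):
            state, current = proposal, proposed
        if step >= burn_in:
            for clone in state:
                counts[clone] += 1
    kept = steps - burn_in
    return [n / kept for n in counts]


# Hypothetical 4 x 4 row-and-column design; clone id = row * 4 + column.
side = 4
pools = [{r * side + c for c in range(side)} for r in range(side)]
pools += [{r * side + c for r in range(side)} for c in range(side)]
# Observed results: row 1 and column 2 positive, all other pools negative.
observed = [False, True, False, False, False, False, True, False]

marginals = metropolis_marginals(side * side, pools, observed)
best = max(range(side * side), key=marginals.__getitem__)
```

In this toy case the estimated probability mass should concentrate on clone 6, the intersection of row 1 and column 2. A Gibbs sampler would differ only in the update step: it would resample each clone's membership from its conditional posterior instead of proposing and accepting a flip.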
We are integrating these algorithms into a general decoding system that we have developed and implemented in software. Currently this decoding system uses one of several heuristics to prerank the clones. The heuristics are based on the number of positive pools that a clone belongs to and on whether a positive pool is uniquely explained by a given clone's being positive.
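A preranking heuristic of this kind might be sketched in Python as follows. The exact heuristics used in the decoding system are not given in the abstract, so the function `prerank` and its reading of "uniquely explained" are assumptions: here a positive pool counts as uniquely explained by a clone when that clone is the pool's only member that appears in no negative pool.

```python
def prerank(n_clones, pools, observed):
    """Order clone ids from most to least likely positive (heuristic only).

    Sort key: positive pools uniquely explained by the clone, then the
    number of positive pools containing the clone, then clone id.
    """
    positive_count = [0] * n_clones
    in_negative_pool = [False] * n_clones
    for members, result in zip(pools, observed):
        for clone in members:
            if result:
                positive_count[clone] += 1
            else:
                in_negative_pool[clone] = True

    unique = [0] * n_clones
    for members, result in zip(pools, observed):
        if not result:
            continue
        # Members of this positive pool not excluded by any negative pool.
        unexcluded = [c for c in members if not in_negative_pool[c]]
        if len(unexcluded) == 1:
            unique[unexcluded[0]] += 1

    return sorted(
        range(n_clones),
        key=lambda c: (-unique[c], -positive_count[c], c),
    )


# Hypothetical 3 x 3 row-and-column design; clone id = row * 3 + column.
side = 3
pools = [{r * side + c for c in range(side)} for r in range(side)]
pools += [{r * side + c for r in range(side)} for c in range(side)]
# Row 1 and column 1 positive, all other pools negative.
observed = [False, True, False, False, True, False]

ranking = prerank(side * side, pools, observed)
```

Clone 4, at the intersection of row 1 and column 1, uniquely explains both positive pools and is ranked first. Treating negative pools as exclusions assumes error-free screening, which is why such a ranking serves only as a starting point for the probabilistic decoding.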