Genome Sequencing Section
DOE Human Genome Program Contractor-Grantee Workshop
7. The SaF Finishing
Matt P. Nolan, Jane E. Lamerdin,
Glenda G. Quan, and Anthony V. Carrano
Our modified shotgun sequencing effort has three phases. In the random phase we sequence a fixed number of plates, which results in 80% to 95% of the cosmid bases meeting our quality-based, double-stranded finish criteria (QbDsFc). During pre-finishing we resequence clones, attempting in one round of forward and reverse reads to meet the QbDsFc for 95% of the bases and to close most gaps. During directed closure we close any remaining gaps and complete double-stranding. To reduce finishing costs and shorten time to completion for our cosmid and BAC clone projects, we created software to automate the selection of finishing reads. We describe our SaF (Swedish and Finnish) software tools, developed to 1) facilitate the specification of clones for resequencing and 2) quantify the state of project contigs with respect to our QbDsFc.
We describe improvements to the SaF tools that helped us achieve a ten-fold increase in sequence produced over the past year. In our production sequencing we use the SaF tools to fully automate clone selection in the pre-finishing phase, and we require finishers to address each region identified during directed closure.
For a project assembly, our SaF tools identify bases not meeting the QbDsFc, then group these problem bases into problem regions using parameterized filtering and clustering algorithms. They produce reports listing each problem region, together with a contig summary.
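The filtering and clustering step described above can be illustrated with a minimal Python sketch. It is not the SaF implementation: the `Base` record, the quality threshold of 30, and the `max_gap` and `min_size` parameters are hypothetical stand-ins for whatever the QbDsFc and the tool parameters actually specify. The sketch flags bases that fall below a quality threshold or lack coverage on either strand, then merges flagged positions that lie within `max_gap` bases of each other into one region.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Base:
    """Per-base summary for one contig position (hypothetical record)."""
    quality: int        # consensus quality score
    forward_reads: int  # number of forward-strand reads covering the base
    reverse_reads: int  # number of reverse-strand reads covering the base


def fails_criteria(base: Base, min_quality: int = 30) -> bool:
    """Assumed stand-in for the finish criteria: quality plus both strands."""
    return (
        base.quality < min_quality
        or base.forward_reads == 0
        or base.reverse_reads == 0
    )


def problem_regions(
    contig: Sequence[Base],
    min_quality: int = 30,
    max_gap: int = 10,
    min_size: int = 1,
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of problem regions."""
    failing = [i for i, b in enumerate(contig) if fails_criteria(b, min_quality)]

    # Clustering: extend the current region while the next failing base
    # lies within max_gap positions of the region's last failing base.
    regions: List[List[int]] = []
    for pos in failing:
        if regions and pos - regions[-1][1] <= max_gap:
            regions[-1][1] = pos
        else:
            regions.append([pos, pos])

    # Filtering: drop regions whose span is shorter than min_size bases.
    return [(s, e) for s, e in regions if e - s + 1 >= min_size]
```

A report per contig could then list each returned region together with summary counts, such as the fraction of bases that pass the criteria.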
In pre-finishing we are attempting to identify candidate clones for the creation of shatter libraries. Some simple improvements to our algorithm have helped target potential false joins, resulting in fewer contigs coming out of the pre-finishing stage. We are targeting more reverse reads at internal problem areas with higher error rates. Also, with a greater emphasis on sequencing BAC clones, we hope to target regions of adjacent Alu repeats more strongly, as they are often the cause of gaps and false joins. Additionally, for the BACs we are trying to incorporate restriction enzyme map data to verify that sections of sequence are properly ordered, for the purposes of orienting contigs and identifying potential false joins.
New SaF tool features increase their usability in the directed closure phase. We have incorporated a feedback loop that identifies resequenced clones so that they are not ordered redundantly. This lets us use the automated clone selection in multiple passes and tells us when certain strategies have been played out. We describe our attempts to integrate the SaF tools more tightly with consed. A greater emphasis is being placed on increasing the cost effectiveness of clone selection. For instance, we identify short clones so that we do not suggest sequencing their opposite ends.
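The feedback loop and the short-clone check can be sketched as follows. This is an illustration under stated assumptions, not the SaF code: the `Candidate` record, the `(clone_id, direction)` order history, and the 500-base `read_length` cutoff for calling a clone "short" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

ReadKey = Tuple[str, str]  # (clone_id, direction)


@dataclass(frozen=True)
class Candidate:
    """A suggested finishing read (hypothetical record)."""
    clone_id: str
    direction: str      # "forward" or "reverse"
    insert_length: int  # estimated insert size in bases


def select_reads(
    candidates: Iterable[Candidate],
    already_ordered: Set[ReadKey],
    read_length: int = 500,
) -> Tuple[List[Candidate], Set[ReadKey]]:
    """Return the reads to order and the updated order history."""
    ordered = set(already_ordered)
    clones_with_reads = {clone_id for clone_id, _ in ordered}
    selected: List[Candidate] = []

    for cand in candidates:
        key = (cand.clone_id, cand.direction)
        # Feedback loop: never order the same read twice.
        if key in ordered:
            continue
        # Short clone: an existing read already spans the insert,
        # so a read from the opposite end would add little.
        if cand.insert_length <= read_length and cand.clone_id in clones_with_reads:
            continue
        selected.append(cand)
        ordered.add(key)
        clones_with_reads.add(cand.clone_id)

    return selected, ordered
```

Passing the returned history into the next call supports selection in multiple passes. In this sketch, a pass that returns no new reads is the signal that the strategy has been played out for that project.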
Work performed under the auspices of the US DOE by Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48