Smith-Waterman Transformable Array Search Engine Yields Supercomputer Performance on PCs and Unix Workstations

James W. Lindelien, Robert Farrington, Betty Tjandra

Time Logic, Inc., 11992 Challenger Ct., Moorpark, CA 93021; and 567 Knotty Pine Dr., Incline Village, NV 89451. Email to jiml@sierra.net

With government funding incentives to reduce the cost per sequenced base an order of magnitude by the year 2000, the informatics bottleneck must be addressed economically or further scale-up of massive sequencing projects will be significantly hindered. We have developed a transformable array accelerator based on Field-Programmable-Logic-Array (FPGA) technology for high-speed searches of massive nucleic and protein databases that is one to two orders of magnitude less expensive than present systems yet yields the world's fastest Smith-Waterman implementation. We hope to promote discovery by allowing more analysis per research dollar, by more biologists. The ability to retrieve search results in seconds permits a more creative "what-if" style of interactive database exploration, versus conventional but tedious "batch" style searches. System performance is scalable over 100x.

Multiple search processors are configured within an on-card FPGA array. Less computationally intensive algorithms run up to 10x faster than Smith-Waterman due to increased on-chip parallelism. Since the on-chip wiring of FPGAs can be set by software configuration h1 a fraction of a second, the array is readily transformed for a variety of search algorithms, or to first "pre-screen" large query sets with a higher speed (but less sensitive) algorithm to yield a higher daily throughput. The algorithms are:

ALGORITHM PARAMETERS PERFORMANCE/
card
Nucleotide-to-Nucleotide
comparisons, ungapped
IUB/GCG-16 Symbol Set; ktup=1. Dual scoring: k-tuple length, and max. % matches in user set "bin" size of 16 to 512 bases/bin (powers of 2). 1.28 Billion nt-nt
comparisons/sec
(ktup=1)
Amino Acid-to-Amino Acid
comparisons, ungapped
Using 5 bit 32x32 programmable matrix; ktup=1. Dual scoring: k-tuple length, and max. % matches in user set "bin" size of 16 to 512 residues/bin (powers of 2). 320 Million aa-aa
comparisons/sec
(ktup=1)
Smith-Waterman User Programmable similarity matrix; affine gap scoring to 16 bit resolution. 100 Million S-W matrix
cells/sec
Profile Search User programmable similarity matrix; affine gap scoring to 16 bit resolution. 100 Million matrix
cells/sec
Smith-Waterman
Frame-Shift Tolerant
User programmable similarity matrix; and affine gap scoring. Searches 3 nt frames vs. aa target simultaneously with user set frame shift and gap penalties. 100 Million S-W matrix
cells/sec
Profile Search
Frame-Shift Tolerant
User programmable similarity matrix; and affine gap scoring. Searches 3 nt frames vs. aa target simultaneously with user set frame shift and gap penalties. 100 Million matrix
cells/sec

For high-volume sequencing projects, the multiple query, single target design accepts in bulk the query sets produced by automated sequencers. Simultaneous processing of many short queries creates a larger "effective query size." This decouples the computation rate of the FPGA array from the disk read bandwidth, permitting very high performance without a disk bottleneck, nor the usual requirement to RAM-cache the target database. Typical Pentium-based computers with low-cost disk systems (EIDE. SCSI-2) support Smith-Waterman searches on up to 15 accelerator cards per PC, at an aggregate rate of 1.5 billion SW cells/second. Such a PC is about 110% the speed of the MasPar MP-2 16384 processor supercomputer but costs only 4% as much, and requires no annual support contract. Still higher performance is achieved by TCP/IP (NFS) network clustering. We offer individual cards and development assistance to academic researchers having other applications for the technology.

This project was privately funded by Time Logic, Inc.


Abstracts scanned from text submitted for January 1996 DOE Human Genome Program Contractor-Grantee Workshop.

Return to Table of Contents