ORNL’s Piranha & Raptor Text Mining Technology

UT-Battelle, LLC, acting under its Prime Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy (DOE) for the management and operation of the Oak Ridge National Laboratory (ORNL), is seeking a commercialization partner for the Piranha/Raptor text mining technologies.  The ORNL Technology Transfer Office will accept licensing applications through January 31, 2014.

ORNL’s Piranha and Raptor text mining technology addresses a challenge most users face: sifting through large amounts of data to find accurate and relevant information. This requires software that can quickly filter, relate, and show documents and relationships. Piranha is JavaScript search, analysis, storage, and retrieval software for uncertain, vague, or complex information retrieval from multiple sources such as the Internet. With the Piranha suite, researchers have pioneered an agent approach to text analysis that uses a large number of agents distributed over very large computer clusters. Because the agent architecture scales, Piranha is faster than conventional software and can cluster massive amounts of textual information relatively quickly.

While computers can analyze massive amounts of data, the sheer volume of data makes the most promising approaches impractical. Piranha works on hundreds of raw data formats and can process data extremely fast on typical computers. The technology enables advanced textual analysis to be accomplished with unprecedented accuracy on very large and dynamic data. For data already acquired, this design allows discovery of new opportunities or new areas of concern. Piranha has been vetted in the scientific community as well as in a number of real-world applications.

The Raptor technology enables Piranha to run on SharePoint and MS SQL servers and can also operate as a filter for Piranha to make processing more efficient for larger volumes of text.  The Raptor technology uses a set of documents as seed documents to recommend documents of interest from a large, target set of documents. The computer code provides results that show the recommended documents with the highest similarity to the seed documents.
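
The seed-document idea described above can be illustrated with a minimal sketch. This is not Raptor's actual code or algorithm, which is not described here; it assumes a simple bag-of-words representation and cosine similarity, and the function names (`term_vector`, `cosine`, `recommend`) and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words term counts (a deliberately simple tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(seed_docs, target_docs, top_n=3):
    """Rank target documents by similarity to the combined seed documents."""
    # Assumption: the seeds are merged into one term profile before scoring.
    seed_profile = Counter()
    for doc in seed_docs:
        seed_profile.update(term_vector(doc))
    scored = [(cosine(seed_profile, term_vector(doc)), doc) for doc in target_docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Hypothetical sample data, for illustration only.
seeds = ["mammography report shows a suspicious mass"]
targets = [
    "quarterly budget report for the facilities group",
    "radiology report: suspicious mass seen on mammography",
    "cafeteria menu for the week",
]
ranked = recommend(seeds, targets, top_n=2)
for score, doc in ranked:
    print(f"{score:.2f}  {doc}")
```

In this sketch the radiology document ranks first because it shares the most terms with the seed. A production system would likely use term weighting and dimensionality reduction rather than raw counts.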

For additional technology, please see DTHSTR

License applications will be evaluated based on prospective partners' ability and commitment to successfully commercialize the technology, with a preference for United States based businesses and small businesses.

For additional information and license application, contact David Sims, Commercialization Manager, Oak Ridge National Laboratory, 865-241-3808.

Intellectual Property


System/Method for Gathering and Summarizing Internet Information (ID-1031)
Inventors: T. Potok, M. Elmore, J. Reed, N. Samatova, J. Treadwell
US Patent #s 7,072,883, 7,315,858, 7,693,903 (issued July 4, 2006; January 1, 2008; and April 6, 2010, respectively)

Agent-based Method for Distributed Clustering of Textual Information (ID-1368)
Inventors: T. Potok, M. Elmore, J. Reed, J. Treadwell
US Patent #7,805,446 (issued September 28, 2010)

Dynamic Reduction of Dimensions of a Document Vector in a Document Search and Retrieval System (ID-1759)
Inventors: Y. Jiao, T. Potok
US Patent # 7,937,389 (issued May 3, 2011)

Method and System for Determining Precursors of Health Abnormalities from Processing Medical Records (IDs 2235/2377)
Inventors: B. Beckerman, R. Patton, T. Potok
US Patent Application # 13/033,756 (filed February 24, 2011)


PIRANHA: A Knowledge Discovery Engine (CR-50000004)
Authors: Brian Klump, Robert Patton, Tom Potok, Joel Reed, Jim Treadwell, Craig Cunic, Phillip Martin
US Copyright Registration # TXu 1-703-690 (July 12, 2010)

RAPTOR: An Enterprise Knowledge Discovery Engine, Version 2.0 (CR-50000045)
Authors: Robert Patton, Steven Young
US Copyright Registration # PENDING


Research group’s project page

Piranha-Raptor Fact Sheet

SPARK! presentation


R. M. Patton, B. G. Beckerman, T. E. Potok, G. Tourassi, "A Recommender System for Web-Based Discovery and Refinement of Information Radiologists Seek", Radiological Society of North America (RSNA), 2012 Annual Meeting, Nov. 2012, Chicago, IL, USA.

R. M. Patton, T. E. Potok, B. A. Worley, "Discovery & Refinement of Scientific Information via a Recommender System", The Second International Conference on Advanced Communications and Computation, Oct. 2012, Venice, Italy.

Steed, Chad A. (ORNL), Symons, Christopher T. (ORNL), DeNap, Frank (ORNL), Potok, Thomas E. (ORNL), “Guided Text Analysis Using Adaptive Visual Analytics,” Paper in Conf. Proceedings (book, CD), Visualization and Data Analysis 2012, Burlingame, California, January 23-25, 2012.

Patton, Robert M. (ORNL), McNair, Wade (ORNL), Symons, Christopher T. (ORNL), Treadwell, Jim N. (ORNL), Potok, Thomas E. (ORNL), “A Text Analysis Approach to Motivate Knowledge Sharing via Microsoft SharePoint,” Paper in Conf. Proceedings (book, CD), 45th Hawaii International Conference on System Sciences, Wailea, Hawaii, January 4, 2012.

Patton, Robert M. (ORNL), Rojas, Carlos C. (ORNL), Beckerman, Barbara G. (ORNL), Potok, Thomas E. (ORNL), “A Computational Framework for Search, Discovery, and Trending of Patient Health in Radiology Reports,” Paper in Conf. Proceedings (book, CD), 1st IEEE Conference on Healthcare Informatics, Imaging, and Systems Biology, San Jose, California, July 2011.

Patton, Robert M. (ORNL), Beckerman, Barbara G. (ORNL), Potok, Thomas E. (ORNL), Analysis and Classification of Mammography Reports Using Maximum Variation Sampling, Stephen L. Smith and Stefano Cagnoni (Eds.), Genetic and Evolutionary Computation: Medical Applications, pp. 113-131, Wiley Publishing, West Sussex, United Kingdom, January 2011.

Cui, Xiaohui (ORNL), Mueller, Frank (North Carolina State University), Zhang, Yongpeng (ORNL), Potok, Thomas E. (ORNL), “Data-Intensive Document Clustering on GPU Clusters,” Journal of Parallel and Distributed Computing, December 2010.

Patton, Robert M. (ORNL), Beckerman, Barbara G. (ORNL), Potok, Thomas E. (ORNL), Treadwell, Jim N. (ORNL), “Genetic Algorithm for Analysis of Abdominal Aortic Aneurysms in Radiology Reports,” Paper in Conf. Proceedings (book, CD), 2010 Genetic and Evolutionary Computation Conference, Portland, Oregon, July 2010.

Cui, Xiaohui (ORNL), Potok, Thomas E. (ORNL), Cavanagh, Joseph M. (ORNL), “Parallel Latent Semantic Analysis Using a Graphics Processing Unit,” Paper in Conf. Proceedings (book, CD), 2009 Genetic and Evolutionary Computation Conference, July 2009.

Patton, Robert M. (ORNL), Potok, Thomas E. (ORNL), Beckerman, Barbara G. (ORNL), Treadwell, Jim N. (ORNL), “A Genetic Algorithm for Learning Significant Phrase Patterns in Radiology Reports,” Paper in Conf. Proceedings (book, CD), Genetic and Evolutionary Computation Conference 2009, Montreal, Canada, July 2009.

X. Cui, J. M. Beaver, J. St. Charles, T. E. Potok, Dimensionality Reduction for High Dimensional Particle Swarm Clustering, Proceedings of the IEEE Swarm Intelligence Symposium, September, 2008, St. Louis, USA

Patton, Robert M. (ORNL), Potok, Thomas E. (ORNL), “Identifying Event Impacts by Monitoring the News Media,” Paper in Conf. Proceedings (book, CD), 12th International Conference on Information Visualization, London, UK, July 2008.

Patton, R.M., Cui, X., Jiao, Y., and Potok, T.E. (2008). Evolutionary computing. Intelligent Data Analysis: Developing New Methodologies through Pattern Discovery and Recovery, Idea Group Inc., Hershey, PA.

X. Cui and T. E. Potok, A Particle Swarm Social Model for Multi-Agent Based Insurgency Warfare Simulation, Proceedings of the IEEE Eighth International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, August, 2007, Busan, Korea

J. W. Reed, T. E. Potok, and R. M. Patton, "A multi-agent system for distributed cluster analysis," in Proceedings of Third International Workshop on Software Engineering for Large-Scale Multi-Agent Systems (SELMAS'04), W16L Workshop - 26th International Conference on Software Engineering, Edinburgh, Scotland, UK: IEE, 2004, pp. 152-5.

J. Reed, Y. Jiao, T. E. Potok, B. Klump, M. Elmore, and A. R. Hurson, "TF-ICF: A New Term Weighting Scheme for Clustering Dynamic Data Streams," in Proceedings of 5th International Conference on Machine Learning and Applications (ICMLA'06). vol. 0 ORLANDO, FL, 2006, pp. 258-263.

P. Yan, Y. Jiao, A. R. Hurson, and T. E. Potok, "Semantic-based information retrieval of biomedical data," in Proceedings of the 2006 ACM symposium on Applied computing Dijon, France: ACM Press, 2006.

T. E. Potok, M. T. Elmore, J. W. Reed, and N. F. Samatova, "An ontology-based HTML to XML conversion using intelligent agents," in Proceedings of the 35th Annual Hawaii International Conference on System Sciences Big Island, HI, USA: IEEE Comput. Soc, 2002, pp. 1220-9.

R. M. Patton and T. E. Potok, "Characterizing large text corpora using a maximum variation sampling genetic algorithm," in Proceedings of the 8th annual conference on Genetic and evolutionary computation Seattle, Washington, USA: ACM Press, 2006.

P. Palathingal, T. E. Potok, and R. M. Patton, "Agent based approach for searching, mining and managing enormous amounts of spatial image data," in Proceedings of the Eighteenth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2005 - Recent Advances in Artificial Intelligence, Clearwater Beach, FL, United States: American Association for Artificial Intelligence, Menlo Park, CA 94025-3496, United States, 2005, pp. 351-356.

M. T. Elmore, T. E. Potok, and F. T. Sheldon, "Dynamic data fusion using an ontology-based software agent system," in Proceedings of 7th World Multiconference on Systemics, Cybernetics and Informatics (SCI 2003), vol. 9, Orlando, FL, USA: IIIS, 2003, pp. 5-10.

X. Cui, T. E. Potok, and P. Palathingal, "Document clustering using particle swarm optimization," in Proceedings of 2005 IEEE Swarm Intelligence Symposium Pasadena, CA, USA: IEEE, 2005, pp. 185-91.

X. Cui and T. E. Potok, “A distributed agent implementation of multiple species Flocking model for document partitioning clustering,” in Lecture Notes in Computer Science. vol. 4149 NAI Edinburgh, United Kingdom: Springer Verlag, Heidelberg, D-69121, Germany, 2006, pp. 124-137.

X. Cui and T. E. Potok, "Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm," Journal of Computer Sciences, vol. Special Issue, pp. 27-33, 2005.


