Supercomputing and Computation



pbd: programming with big data

pbd ("programming with big data") is a set of high-level programming tools, written as R packages, for working with big data in R without having to micro-manage distributed data. The intent is to keep R's familiar serial syntax, which stays close to the mathematics of the data, while remaining mindful of how the data are distributed: "Old syntax with a new mindset." The packages let developers program sequences of analytics with minimal data movement among the distributed components. Our goal is to engage analytics developers in R who have a mathematical mindset and enable them to create the needed diversity in scalable analytics. Four packages are currently pending release: pbdMPI, a more intuitive and faster R interface to MPI; pbdSLAP, which connects scalable linear algebra libraries to R; pbdDMAC, intuitive distributed matrix algebra in R; and pbdBMTK, a benchmarking toolkit for pbd codes.
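To make "old syntax with a new mindset" concrete, here is a minimal sketch of a distributed mean. It assumes the interface of the publicly released pbdMPI package (init, comm.rank, comm.size, allreduce, comm.print, finalize), which may differ from the pre-release version described here. The data values and the script name are hypothetical, chosen only for illustration.

```r
# Hypothetical script name; launch with, e.g.:
#   mpirun -np 4 Rscript mean_spmd.R
# Every MPI rank runs this same script on its own slice of the data.
library(pbdMPI, quietly = TRUE)
init()

rank <- comm.rank()   # 0-based index of this process
size <- comm.size()   # total number of processes

# Illustrative local data: rank r holds the values 100*r + 1, ..., 100*r + 100.
# In practice each rank would read its own portion of a large data set.
x_local <- seq_len(100) + 100 * rank

# Ordinary serial R on the local piece...
local_sum <- sum(x_local)
local_n   <- length(x_local)

# ...then a single reduction combines the small summaries across ranks,
# so the raw data never move between processes.
global_sum  <- allreduce(local_sum, op = "sum")
global_n    <- allreduce(local_n,   op = "sum")
global_mean <- global_sum / global_n

comm.print(global_mean)   # printed once, by rank 0 by default

finalize()
```

The arithmetic is the same as in a serial script. The only distribution-aware steps are the two allreduce calls, which exchange two numbers per rank rather than the data themselves.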

