Supercomputing and Computation

Statistics


We focus on two areas of statistics that are important to scalable data science: (1) research on the development of scalable analytics and (2) research on understanding high-dimensional response functions.

(1) The development of scalable analytics lags far behind the simulation sciences in its use of high-performance computing resources. Those who develop statistical analytics prefer to work with high-level programming languages that are close to mathematics. They know what can be asynchronous in the mathematics but do not know the additional intricacies of developing and running codes on large computational platforms. As a result, large and diverse collections of serial analytical tools exist, but only a few of these analytics are scalable. We are developing high-level methods for programming with big data (pbd) to engage this community and enable it to prototype new scalable codes.

(2) Computational science codes often have large numbers of parameters that influence their output, and it is often difficult to understand which parameters and parameter interactions are important over an input region of interest. Statistical techniques for attributing variability to parameters exist, but they rely on designed sample spaces. Such designs are typically not available in simulation science collections, either because the parameter space is too large or simply because statistical design was not used in selecting parameter combinations. We use a combination of surrogate models and analysis of variance to provide techniques for variability attribution and parameter effect estimation.
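The idea in (2) can be illustrated with a minimal sketch. It is not the group's actual method: the `simulator` function, the quadratic surrogate, and the grid resolution are all assumptions chosen for illustration. The runs are sampled at random rather than by design, a surrogate is fitted to them by least squares, and the surrogate is then evaluated on a full factorial grid so that a standard analysis-of-variance decomposition can attribute output variance to each parameter's main effect.

```python
import itertools
import random


def simulator(x):
    # Hypothetical stand-in for an expensive simulation code (an assumption).
    x1, x2, x3 = x
    return 4.0 * x1 + x2 * x2 + 0.1 * x3


def features(x):
    # Quadratic surrogate basis: intercept, linear, squared, and pairwise terms.
    terms = [1.0] + list(x) + [xi * xi for xi in x]
    terms += [x[i] * x[j] for i, j in itertools.combinations(range(len(x)), 2)]
    return terms


def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fit_surrogate(xs, ys):
    # Ordinary least squares via the normal equations (X'X) beta = X'y.
    rows = [features(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    beta = solve(xtx, xty)
    return lambda x: sum(b * t for b, t in zip(beta, features(x)))


def main_effect_shares(model, dim, levels):
    # Evaluate the surrogate on a full factorial grid (a designed sample space),
    # then compute each parameter's main-effect share of the total variance.
    grid = list(itertools.product(levels, repeat=dim))
    values = [model(g) for g in grid]
    mean = sum(values) / len(values)
    total = sum((v - mean) ** 2 for v in values) / len(values)
    shares = []
    for k in range(dim):
        level_means = []
        for lev in levels:
            vals = [v for g, v in zip(grid, values) if g[k] == lev]
            level_means.append(sum(vals) / len(vals))
        var_k = sum((m - mean) ** 2 for m in level_means) / len(level_means)
        shares.append(var_k / total)
    return shares


random.seed(0)
# Undesigned runs: parameter combinations drawn at random from the unit cube.
samples = [[random.random() for _ in range(3)] for _ in range(200)]
outputs = [simulator(x) for x in samples]
surrogate = fit_surrogate(samples, outputs)
levels = [i / 10 for i in range(11)]
shares = main_effect_shares(surrogate, 3, levels)
print(shares)
```

Because the stand-in simulator is additive, the main-effect shares account for essentially all of the variance. With interacting parameters, the remainder would be attributed to interaction terms, which the same grid-based decomposition can be extended to estimate.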

Presentation: Applied Statistics for the Office of Science

Related Projects

pbd: programming with big data
"Programming with big data" (pbd) is a set of high-level programming tools, written as R packages, that enable programming with big data in R without the need to micro-manage distributed data.
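The programming style this targets can be sketched as follows. This is not pbd's API, and it is written in Python rather than R. The ranks are simulated inside a single process, and the helper names (`split_blocks`, `local_summary`, `allreduce_sum`) are hypothetical. Each rank holds one block of the data and computes a small local summary, a reduction combines the summaries, and the analyst-facing function `distributed_mean_var` never handles the distribution directly.

```python
def split_blocks(data, n_ranks):
    # Contiguous block distribution: each simulated rank gets one slice.
    base, extra = divmod(len(data), n_ranks)
    blocks, start = [], 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        blocks.append(data[start:start + size])
        start += size
    return blocks


def local_summary(block):
    # Sufficient statistics computed independently on each rank's block.
    return (len(block), sum(block), sum(v * v for v in block))


def allreduce_sum(summaries):
    # Stand-in for a collective sum reduction across ranks (simulated here).
    return tuple(sum(parts) for parts in zip(*summaries))


def distributed_mean_var(blocks):
    # Analyst-facing call: the data distribution is hidden behind the reduction.
    n, s, ss = allreduce_sum([local_summary(b) for b in blocks])
    mean = s / n
    var = (ss - n * mean * mean) / (n - 1)  # sample variance
    return mean, var


data = [float(i) for i in range(1, 101)]
blocks = split_blocks(data, 4)
mean, var = distributed_mean_var(blocks)
print(mean, var)
```

The sum-of-squares formula is used here only because it reduces with a single collective sum. A production code would likely prefer a numerically more stable pairwise update.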
