1. Data Collected
The data collected will be primarily experimental small-angle neutron scattering (SANS) data from the Bio-SANS instrument at the High Flux Isotope Reactor (HFIR). External users of Bio-SANS retain the rights to their data, which is tracked by the unique Integrated Proposal Tracking System (IPTS) number of their proposal. Upon completion of the experiment, users can access the SANS data for all measured samples through the ORNL Neutron Online Catalogue (OnCAT).
2. Data Standards
The data collected will comply with the best practices and standards of the small-angle scattering (SAS) community. The raw acquired data and metadata collected as part of HFIR and SNS experiments are stored on ORNL institutional clusters in the open, internationally recognized NeXus data format. Processed scattering data is stored in a four-column tabular format (x-axis, y-axis, error in y-axis, and uncertainty in x-axis) and will be accompanied by documentation and metadata in text-based formats. Processed SANS metadata is stored in compressed HDF5 data files. Additionally, our open-source reduction software, drtSANS, saves 1D and multidimensional reduced scattering data in the XML-based canSAS (Collective Action for Nomadic Small-Angle Scatterers) format, allowing both data and metadata to be preserved in a single file once the user’s experiment is finished.
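The four-column layout described above can be read with a few lines of standard Python. The following is a minimal sketch, not the drtSANS implementation: the function name read_four_column, the whitespace or comma delimiter, the "#" comment convention, the Q/I labels, and the sample values are all assumptions made for illustration.

```python
import io


def read_four_column(stream):
    """Parse rows of x, y, error in y, uncertainty in x (hypothetical helper)."""
    rows = []
    for line in stream:
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' carry no data.
        if not line or line.startswith("#"):
            continue
        # Assumption: columns are separated by whitespace or commas.
        fields = line.replace(",", " ").split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
        x, y, y_err, x_unc = (float(f) for f in fields)
        # For SANS, x is typically the momentum transfer Q and y the intensity I.
        rows.append({"q": x, "i": y, "di": y_err, "dq": x_unc})
    return rows


# Made-up sample values, for illustration only.
SAMPLE = """\
# Q (1/A)   I (1/cm)   dI (1/cm)   dQ (1/A)
0.010  12.5  0.40  0.002
0.020   8.1  0.30  0.003
0.040   3.2  0.15  0.004
"""

data = read_four_column(io.StringIO(SAMPLE))
for point in data:
    print(point["q"], point["i"], point["di"], point["dq"])
```

Rejecting rows that do not have exactly four fields keeps a truncated or mis-delimited file from being silently misread as valid data.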
3. Related Tools, Software and/or Code
The MANTID software package is required to process and visualize the raw NeXus data. MANTID is an open-source collaborative project between ORNL and several other neutron source facilities worldwide and is available through the ORNL institutional cluster. All users with accepted proposals are given an XCAMS account and can remotely access their data through analysis.sns.gov or jupyter.sns.gov. The drtSANS software is required to reduce the raw data to 1D and 2D data and is accessed through Jupyter Notebooks on the analysis cluster. Data can be transferred from the ORNL cluster to the user’s home computer directly from Jupyter Notebooks, using standard SFTP software, or directly from OnCAT once published.
4. Sharing and Preservation
Data obtained through the Bio-SANS general user program will be stored on password-protected ORNL institutional clusters. SANS data is cross-referenced to specific experiments via the IPTS number. Data storage procedures will be consistent with the SNS/HFIR institutional data management plan. Researchers can access only their own experiments, and data will not be shared or transferred without written permission from the principal investigator (PI) of the experiment concerned.
For long-term data preservation, SNS/HFIR will store the raw and processed research data in line with long-term preservation criteria. Currently, SNS and HFIR experiment data are stored on a file server managed by the SNS Linux Support Team. “Home” directories will be provided to the users. These directories persist across proposals and are currently preserved on the file system once created. To manage file system usage, older data is migrated to offline storage for archival. In addition to long-term storage of the research data outputs, users can generate a digital object identifier (DOI) for datasets that can be made openly available, using the OnCAT DOI workflow service to obtain a DOI through the Office of Scientific and Technical Information’s (OSTI’s) web services. With the DOI, the dataset can be referenced in publications and by other related datasets.
6. Data Protection: Integrity and Security
Data will be secured via password-protected workstations and data enclaves at ORNL. During software development, version control software such as Git will be used in conjunction with GitHub for backup, testing, and monitoring of software changes. No personally identifiable information, national security information, or business-confidential information is currently anticipated in the resource.