MOSAiC, the largest polar expedition of all time, will produce enormous quantities of data. ORNL staff in the field and the lab collect, store and process the data to share with collaborators around the world.
Oct 25, 2019—In the vast frozen whiteness of the central Arctic, the Polarstern, a German research vessel, has settled into the ice for a yearlong float.
When the ship arrived at a scrupulously chosen ice floe in early October, and the dark sea water lapping against its hull began to freeze—locking it into place—passengers on board celebrated by venturing onto the ice. Some took photos and even kicked a soccer ball around the location of their new home. For the better part of a year, a web of structures and instruments will sprawl out from the ship to form a research camp—the northernmost little city in the world.
Misha Krassovski, a computer scientist who works at the Department of Energy’s Oak Ridge National Laboratory, joined the festivities, but only for about 10 minutes.
Then he scrambled back on board, into a ribbed metal shipping container holding much of the ship’s network systems. Some of the equipment needed attention.
Krassovski is one of about 60 scientific personnel who embarked on the first leg of the largest polar expedition of all time, called the Multidisciplinary Drifting Observatory for the Study of Arctic Climate, or MOSAiC. During the yearlong expedition, the Polarstern will drift through the Arctic, frozen in ice, as around 600 experts from collaborating institutions around the world rotate on board to study the Arctic climate system, the most rapidly warming climate on the planet.
Krassovski represents one of those institutions—DOE’s Atmospheric Radiation Measurement (ARM) user facility. ARM is a resource managed by nine national laboratories that enables climate and atmospheric research through its permanent observatories and, in the case of MOSAiC, its mobile campaigns, instruments and data infrastructure.
His job was to set up ARM’s central computer, known as the “site data system,” and to make sure data stream to it flawlessly from the more than 50 instruments ARM has provided for the mission. Those data will be shipped periodically to ARM’s Data Center, located at ORNL, where they’ll be freely accessible to anyone.
“Data center in a can”
The instruments that ARM provided for MOSAiC will help create the most detailed record of the Arctic atmosphere ever. They’ll collect data on parameters such as aerosol concentrations, precipitation and humidity, to name a few.
“You’ve got your instruments in the field, and you need systems to communicate with those instruments to pull the data off. We’re responsible for making sure that those systems are online,” said ORNL’s Cory Stuart, who manages all the site data systems for ARM’s mobile campaigns. “I’ve heard people say we’ve got a data center in a can.”
During the course of the MOSAiC expedition, ARM’s instruments are expected to produce about 250 terabytes (TB) of data. For context, many newer laptops can store around one TB.
“It’s like 250 times your typical laptop,” ARM Data Center director Giri Prakash said. “It’s quite a bit of diverse data, and we are fully ready to handle it.”
This isn’t too unusual for ARM, which boasts 1.8 petabytes (around 1,800 times your laptop) of atmospheric data in its collection and regularly handles large sets of data from the field. The challenge, in this case, is that the treacherous environmental conditions in the Arctic will make it more difficult to transfer much of that information before the campaign is complete. While the plan does include shipping data back to the U.S. on disks throughout the deployment, ARM still must be prepared to store all 250 TB on the onsite system to minimize the chance of losing any data.
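The scale comparisons above come down to simple unit arithmetic. Here is a minimal Python sketch, assuming decimal units (1 PB = 1,000 TB) and the article's rough figure of 1 TB per laptop. The constant and function names are illustrative only and are not part of any ARM software.

```python
# Figures quoted in the article; decimal units (1 PB = 1,000 TB) are an assumption.
TB_PER_PB = 1_000
LAPTOP_TB = 1.0     # "many newer laptops can store around one TB"
MOSAIC_TB = 250.0   # expected output of ARM's instruments during MOSAiC
ARCHIVE_PB = 1.8    # size of ARM's existing collection


def laptops_equivalent(terabytes: float, laptop_tb: float = LAPTOP_TB) -> float:
    """Return how many typical laptops it would take to hold `terabytes`."""
    return terabytes / laptop_tb


mosaic_laptops = laptops_equivalent(MOSAIC_TB)
archive_laptops = laptops_equivalent(ARCHIVE_PB * TB_PER_PB)
mosaic_share_of_archive = MOSAIC_TB / (ARCHIVE_PB * TB_PER_PB)

print(f"MOSAiC: about {mosaic_laptops:.0f} laptops")
print(f"ARM archive: about {archive_laptops:.0f} laptops")
print(f"MOSAiC relative to the archive: {mosaic_share_of_archive:.1%}")
```

Under those assumptions, the MOSAiC campaign alone would amount to roughly 14 percent of the size of ARM's existing archive.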
The onsite system is a set of servers that occupies one rack, which stands about 6 feet tall by 2 feet wide and includes a storage array with 96 hard drives. Data from each instrument flow via network or serial communication channels to that instrument’s computer. These instrument computers are either physical systems, such as a laptop, or virtual machines, which are software-emulated computers running on servers. From there, the site data system pulls the data into the local storage system.
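As a rough sizing check on that storage array, the sketch below estimates the smallest drive size that would let 96 drives hold all 250 TB at once. The article does not state the actual drive capacity or the redundancy scheme, so the 25 percent overhead used here is purely hypothetical, as are the function and variable names.

```python
EXPECTED_DATA_TB = 250.0   # from the article
DRIVE_COUNT = 96           # from the article


def min_drive_size_tb(data_tb: float, drives: int,
                      redundancy_overhead: float = 0.0) -> float:
    """Smallest per-drive capacity (TB) that holds data_tb across `drives` drives.

    redundancy_overhead is the fraction of raw capacity lost to parity or
    mirroring (hypothetical; the article does not give the real scheme).
    """
    if not 0.0 <= redundancy_overhead < 1.0:
        raise ValueError("redundancy_overhead must be in [0, 1)")
    usable_fraction = 1.0 - redundancy_overhead
    return data_tb / (drives * usable_fraction)


no_redundancy = min_drive_size_tb(EXPECTED_DATA_TB, DRIVE_COUNT)
with_overhead = min_drive_size_tb(EXPECTED_DATA_TB, DRIVE_COUNT,
                                  redundancy_overhead=0.25)

print(f"No redundancy: at least {no_redundancy:.2f} TB per drive")
print(f"25% overhead (hypothetical): at least {with_overhead:.2f} TB per drive")
```

With no redundancy the minimum works out to about 2.6 TB per drive, and to about 3.5 TB per drive under the hypothetical 25 percent overhead.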
The ARM team did their homework to ensure a smooth setup. They tested the system at Los Alamos National Laboratory months before the expedition began and again, dockside, just before the Polarstern set sail in September. The goal was to eliminate any surprises.
“My hope for him [Krassovski] was that it would be a really cool experience, but that he’d be really bored,” Stuart said, smiling. “Because with those data systems the hope is that they come up, and they run, and you don’t have to do much.”
Krassovski didn’t have time to get bored. In addition to troubleshooting the network systems, he kept busy helping with cable support poles, shelters, tents, flags and other items that must be installed before scientists can set up equipment and start taking measurements. One day he helped build 250 supports for electrical cables that will spread over the ice.
“It all requires a lot of people, and volunteers are always appreciated,” Krassovski said.
That supportive attitude is what landed him a spot on the Polarstern in the first place: Krassovski normally does not work with ARM. He volunteered for MOSAiC when conflicting schedules prevented Stuart and other members of the ARM team from going. Though he’s done similar work for ORNL’s Environmental Sciences Division in other frigid locations, such as northern Minnesota and Alaska, jumping in with a different group meant learning an entirely new data system very quickly.
“This is a fantastic example of inter-program collaboration,” Stuart said. “Misha [Krassovski] is a rock star.”
Sharing with the world
Krassovski is currently aboard another research icebreaker headed back to port in Tromso, Norway. When he arrives at the end of October, the ARM Data Center’s involvement in MOSAiC will be far from over. Once he delivers the first USB hard drives to Oak Ridge, the goal is to have the data processed and accessible within a week.
“As soon as it gets here, we do all the processing and make it available as quickly as possible,” Prakash said. “We are ready for that, and we practiced it.”
While ARM data are readily accessible to scientists and other users worldwide, Prakash has been working with other international collaborators, such as the Alfred Wegener Institute, the German institution leading the expedition, to increase the visibility of the data to all MOSAiC participants.
“We are prepared and excited to do our job so the researchers can do their wonderful science,” Prakash said.
MOSAiC is supported by DOE’s Office of Science through ARM, a DOE Office of Science user facility, and partial direct funding for the MOSAiC campaign.
UT-Battelle manages ORNL for the DOE Office of Science. The single largest supporter of basic research in the physical sciences in the United States, the Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit www.energy.gov/science. — by Abby Bower