Abstract
As modern neuroscience tools acquire more details about the brain, the need to move towards biological-scale neural simulations continues to grow. However, effective simulations at scale remain a challenge. Beyond just the tooling required to enable parallel execution, there is also the unique structure of the synaptic interconnectivity, which is globally sparse but has relatively high connection density and non-local interactions per neuron. There are also various practicalities to consider in high performance computing applications, such as the need for serializing neural networks to support potentially long-running simulations that require checkpoint-restart. Although acceleration on neuromorphic hardware is also a possibility, development in this space can be difficult as hardware support tends to vary between platforms and software support for larger scale models also tends to be limited.
In this paper, we focus our attention on Simulation Tool for Asynchronous Cortical Streams (STACS), a spiking neural network simulator that leverages the Charm++ parallel programming framework, with the goal of supporting biological-scale simulations as well as interoperability between platforms. Central to these goals is the implementation of scalable data structures suitable for efficiently distributing a network across parallel partitions. Here, we discuss a straightforward extension of a parallel data format with a history of use in graph partitioners, which also serves as a portable intermediate representation for different neuromorphic backends.
We perform scaling studies on the Summit supercomputer, examining the capabilities of STACS in terms of network build and storage, partitioning, and execution. We highlight how a suitably partitioned, spatially dependent synaptic structure introduces a communication workload well-suited to the multicast communication supported by Charm++. We evaluate the strong and weak scaling behavior for networks on the order of millions of neurons and billions of synapses, and show that STACS achieves competitive levels of parallel efficiency.