Abstract
Cyber defenders are overwhelmed by the frequency and scale of attacks against their networks. This problem will only be exacerbated as attackers leverage AI to automate their workflows. Autonomous cyber defense capabilities could aid defenders by automating operations and adapting dynamically to novel threats. However, existing training environments fall short in areas such as generalization, explainability, scalability, and transferability, making it intractable to train agents that will be effective in real networks. In this paper, we take an important step towards creating autonomous cyber defense agents: we present a high-fidelity training environment called Cyberwheel that includes both simulation and emulation capabilities. Cyberwheel simplifies customization of the training network and makes it easy to redefine the agent’s reward function, observation space, and action space, supporting rapid experimentation with novel approaches to agent design. It also provides the visibility into agent behaviors necessary for agent evaluation, along with sufficient documentation and examples to lower the barrier to entry. As an example use case of Cyberwheel, we present initial results from training an autonomous agent to deploy cyber deception strategies in simulation.