D2U: Data Driven User Emulation for the Enhancement of Cyber Testing, Training, and Data Set Generation...

by Timothy S Oesch, Robert A Bridges, Kiren E Verma, Brian Weber, Oumar Diallo
Publication Type
Conference Paper
Book Title
14th Cyber Security Experimentation and Test Workshop
Publication Date
Page Numbers
1 to 9
Publisher Location
District of Columbia, United States of America
Conference Name
14th Cyber Security Experimentation and Test Workshop
Conference Location
Boston, Massachusetts, United States of America
Conference Sponsor
Conference Date

Whether testing intrusion detection systems, conducting training exercises, or creating data sets for the broader cybersecurity community, realistic user behavior is a critical component of a cyber range. Existing methods either rely on network-level data or replay recorded user actions to approximate real users in a network. Our work is the first to produce generative models trained on actual user data (sequences of application usage) collected on endpoints. Once trained on a user's behavioral data, these models can generate novel sequences of actions that appear to come from the same distribution as the training data. These sequences are then fed via configuration files to our custom software, which replicates those behaviors on end devices. Notably, our models are platform-agnostic and could generate behavior data for any emulation software package. In this paper we present our model generation process, software architecture, and an initial evaluation of the fidelity of our models. Our software is currently deployed in a cyber range to help evaluate the efficacy of defensive cyber technologies. We also suggest additional ways that the cyber community as a whole can benefit from more realistic user behavior emulation. The data used to train our model, as well as sample configuration files produced by the model, are available at [redacted].
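
To make the pipeline described in the abstract concrete, the following is a minimal sketch of the general idea: fit a generative model to recorded sequences of application usage, sample a novel sequence, and write it to a configuration file for an emulator to consume. It is not the authors' method. A first-order Markov chain is used here purely as a simple stand-in for whatever generative model the paper trains, and the application names, function names, and JSON schema are all hypothetical.

```python
import json
import random
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count app-to-app transitions across recorded sessions.

    Stand-in model (assumption): a first-order Markov chain, not the
    model used in the paper.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for current_app, next_app in zip(session, session[1:]):
            counts[current_app][next_app] += 1
    return counts


def generate_sequence(counts, start, length, rng):
    """Sample a new sequence by walking the fitted transition counts."""
    sequence = [start]
    while len(sequence) < length:
        followers = counts.get(sequence[-1])
        if not followers:  # no observed successor, so stop early
            break
        apps = list(followers)
        weights = [followers[app] for app in apps]
        sequence.append(rng.choices(apps, weights=weights, k=1)[0])
    return sequence


def to_config(sequence):
    """Serialize a sequence as JSON. The schema is hypothetical."""
    actions = [{"step": i, "app": app} for i, app in enumerate(sequence)]
    return json.dumps({"actions": actions}, indent=2)


# Toy training data (invented for illustration only).
sessions = [
    ["browser", "email", "editor"],
    ["email", "browser", "email"],
]
counts = fit_transitions(sessions)
seq = generate_sequence(counts, "browser", 5, random.Random(0))
print(to_config(seq))
```

Because the model output is only a list of application names, any emulation tool that can read the configuration file could replay it, which is the sense in which such a model is independent of the emulation platform.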