Effective treatment of Parkinson’s disease (PD) is a continual challenge for healthcare providers, who can benefit from leveraging emerging technologies to supplement traditional clinic care. We develop a data-driven reinforcement learning (RL) framework that uses wearable sensor data to optimize PD medication regimens. We leverage a data set of n = 26 PD patients who wore wrist-mounted movement trackers for two separate six-day periods. Using these data, we first build and validate a simulation model of how individual patients’ movement symptoms respond to medication administration. We then pair this simulation model with an on-policy RL algorithm that recommends optimal medication types, timing, and dosages during the day while incorporating human-in-the-loop considerations on medication administration. The results show that the RL-prescribed medication regimens outperform physicians’ medication regimens, even though physicians had access to the same data as the RL agent. To validate our results, we assess our wearable-based RL medication regimens using n = 399 PD patients from the Parkinson’s Progression Markers Initiative data set. We show that the wearable-based RL medication regimens would lead to significant symptom improvement for these patients, exceeding the improvement from RL policies trained directly on this data set. In doing so, we show that RL models trained on even small wearable data sets can offer novel, generalizable clinical insights and medication strategies, which may outperform those derived from larger data sets without wearable data.