Abstract
We examine the use of electronic healthcare reimbursement claims (EHRCs) for analyzing healthcare delivery and practice patterns across the United States (US). We show that EHRCs are correlated with disease incidence estimates published by the Centers for Disease Control and Prevention. Further, by analyzing over 1 billion EHRCs with sequential pattern mining algorithms, we track patterns of clinical procedures administered to patients with autism spectrum disorder (ASD), heart disease (HD), and breast cancer (BC). Our analyses reveal that, in contrast to HD and BC, the clinical procedures administered to ASD patients are highly varied both before and after the ASD diagnosis. The discovered clinical procedure sequences also reveal significant differences in the overall costs incurred across different parts of the US, indicating a lack of consensus among practitioners in treating ASD patients. We show that a data-driven approach to understanding clinical trajectories using EHRCs can provide quantitative insights into how to better manage and treat patients. Based on our experience, we also discuss emerging challenges in using EHRC datasets to gain insights into the state of contemporary healthcare delivery and practice in the US.