Abstract
Trucking remains the dominant mode of US freight transportation because of advantages such as flexible access to pickup and drop-off points and faster delivery. Given the massive freight volume moved by trucks, understanding how population and employment characteristics affect truck flows is critical for better transportation planning and investment decisions. The US Federal Highway Administration published a truck travel origin-destination data set as part of the Next Generation National Household Travel Survey program. This data set contains the total number of truck trips in 2020 within and between 583 predefined zones encompassing metropolitan and nonmetropolitan statistical areas within each state and Washington, DC. In this study, the origin-destination-level truck trip flow data were augmented with zone-level population and employment characteristics from the US Census Bureau, drawn from Census population and County Business Patterns data. The final data set was used to train an Extreme Gradient Boosting (XGBoost) machine learning model whose target variable is the total number of truck trips. SHapley Additive exPlanations (SHAP) was adopted to explain the model results. Results showed that the distance between zones was the most important variable and had a nonlinear relationship with truck flows.
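The augmentation step described above, attaching zone-level characteristics to each origin-destination record, can be illustrated with a minimal sketch. It is not the study's actual pipeline: the zone IDs, field names (`population`, `employment`, `truck_trips`), the helper `augment`, and all numeric values are hypothetical and chosen only for illustration.

```python
# Hypothetical zone attributes keyed by zone ID (values are made up, not study data).
zone_attrs = {
    "Z001": {"population": 1_200_000, "employment": 540_000},
    "Z002": {"population": 85_000, "employment": 31_000},
}

# Hypothetical origin-destination records: one row per zone pair,
# including within-zone flows where origin == destination.
od_rows = [
    {"origin": "Z001", "destination": "Z002", "truck_trips": 4200},
    {"origin": "Z002", "destination": "Z002", "truck_trips": 950},
]


def augment(od_rows, zone_attrs):
    """Attach origin- and destination-side zone attributes to each OD record."""
    out = []
    for row in od_rows:
        rec = dict(row)  # copy so the input rows are left unchanged
        for side in ("origin", "destination"):
            for name, value in zone_attrs[row[side]].items():
                rec[f"{side}_{name}"] = value
        out.append(rec)
    return out


augmented = augment(od_rows, zone_attrs)
```

Each augmented row then carries both origin-side and destination-side features next to the trip count, which is the shape a tree-based regressor would typically expect when the trip count is the target variable.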