Injury due to underbody loading is increasingly relevant to the safety of the modern warfighter. To accurately evaluate injury risk in this loading modality, a biofidelic anthropomorphic test device (ATD, i.e., dummy) is required. Finite element counterparts to the physical dummies are also useful tools for evaluating injury risk, but they require validated constitutive models of the materials used in the dummy. However, material model fitting can produce over-fit models: they match the data they were trained on but do not extrapolate well to new loading scenarios. In this study, we used a hierarchical approach. Material models created from coupon-level tests were evaluated at the component level and then verified using blinded component and whole-body (WB) tests to establish a material model of the ATD neck that was not over-fit. Additionally, a combined metric is introduced that incorporates the well-known correlation and analysis (CORA) score with peak characteristics to holistically evaluate material model performance. A Bergström-Boyce material model fit to one loop of combined compression and tension experimental data performed best on the training datasets. Its combined metric scores were 2.51 and 2.18 (maximum score of 3) in a constrained neck setup and a head-neck setup, respectively. In the blinded evaluation, which included flexed, extended, and WB simulations, similar combined scores of 2.44, 2.26, and 2.60, respectively, were observed. The agreement between the combined scores in the training and validation datasets indicated that the model was not over-fit and can be extrapolated to untested but similar loading scenarios.
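The over-fit check described above amounts to comparing combined scores on the training setups with those from the blinded evaluation. The following is a minimal sketch of that comparison using only the scores reported in the abstract. The function names `combined_score` and `generalization_gap` are hypothetical, and the equal-weight sum of three sub-scores is an assumption made purely for illustration: the abstract states only that the metric combines CORA with peak characteristics and has a maximum of 3.

```python
from statistics import mean

MAX_SCORE = 3.0  # maximum combined score stated in the abstract

# Combined scores as reported in the abstract.
training = {"constrained neck": 2.51, "head-neck": 2.18}
validation = {"flexed": 2.44, "extended": 2.26, "whole body": 2.60}


def combined_score(cora, peak_magnitude, peak_timing):
    """Hypothetical combined metric: sum of three sub-scores, each in [0, 1].

    ASSUMPTION: the choice of sub-scores and the equal weighting are
    illustrative only; the abstract does not give the actual formula.
    """
    parts = (cora, peak_magnitude, peak_timing)
    if not all(0.0 <= p <= 1.0 for p in parts):
        raise ValueError("each sub-score must lie in [0, 1]")
    return sum(parts)


def generalization_gap(train_scores, valid_scores):
    """Mean training score minus mean validation score.

    A large positive value would suggest over-fitting (the model does
    worse on data it was not fit to); a value near zero suggests it
    generalizes to the blinded cases.
    """
    return mean(list(train_scores)) - mean(list(valid_scores))


gap = generalization_gap(training.values(), validation.values())
print(f"training mean:   {mean(list(training.values())):.3f}")
print(f"validation mean: {mean(list(validation.values())):.3f}")
print(f"gap: {gap:+.3f} ({abs(gap) / MAX_SCORE:.1%} of max score)")
```

With the reported values, the mean scores differ by about 0.09 points, roughly 3% of the maximum score, with the validation mean slightly higher. This is consistent with the conclusion that the model was not over-fit. Any threshold for an acceptable gap would be a separate modeling decision that the abstract does not specify.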