A picture is worth a thousand words, and in design metric estimation, a word may be worth a thousand features. Pictures are awarded this worth because they encode a wealth of information. When evaluating designs, we aim to capture a range of information, including the usefulness, uniqueness, and novelty of a design. The subjective nature of these concepts makes them difficult to evaluate. Still, because design evaluation is integral to the creation of novel solutions, many metrics have been developed for this purpose. The most common evaluation methods are the consensual assessment technique (CAT) and the Shah, Vargas-Hernandez, and Smith (SVS) method. While CAT is accurate and often regarded as the “gold standard,” it relies on expert ratings, making it expensive and time-consuming. By comparison, SVS is less resource-demanding but is often criticized as lacking sensitivity and accuracy. We combine the complementary strengths of both methods through machine learning. This study investigates the potential of machine learning to predict expert creativity assessments from non-expert survey results. The SVS method yields a text-rich dataset about a design. We use these textual design representations, and the deep semantic relationships that natural language encodes, to predict more desirable design metrics, including CAT metrics. We demonstrate that machine learning models can predict design metrics from the design itself and SVS survey information. We show that incorporating natural language processing improves prediction results across design metrics, and that some metrics are clearly more predictable than others.