This paper describes the application of machine learning approaches for predictive modeling to improve the estimation of risks for complications of allogeneic hematopoietic cell transplantation (HCT), including relapse, graft-versus-host disease, and transplant-related mortality (TRM). Clinical disease and demographic factors known to impact the outcome of HCT include recipient and donor age, type of donor (related/unrelated), donor-recipient gender, diagnosis and disease status pre-HCT, and stem cell source (peripheral blood, marrow, and umbilical cord blood). However, biostatistical analysis of risk has only limited accuracy in estimating a given patient’s risks of serious post-HCT complications. We describe the application of standard support vector machine (SVM) classifiers for data-analytic modeling of TRM. The goal is to predict the binary output TRM (alive or dead) from a set of genetic, demographic, and clinical inputs. The classification decision rule is estimated using an SVM approach appropriate for sparse multivariate data. This study compares several feature selection techniques for modeling TRM and objectively evaluates the quality of feature selection via the prediction accuracy of the corresponding SVM classifiers. In addition, we discuss methods for interpretation of multivariate SVM models.
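
The evaluation idea described above, judging each feature selection technique by the prediction accuracy of an SVM trained on the features it selects, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the data are synthetic binary features rather than patient records, the two selectors (a class-mean-difference filter and a random baseline) are hypothetical stand-ins for the techniques actually compared, and the linear SVM is fitted with a simple Pegasos-style subgradient routine rather than a standard SVM solver.

```python
import random


def make_data(n=200, d=20, informative=(0, 1, 2), seed=0):
    """Synthetic stand-in for patient data: binary features, label in {-1, +1}."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [float(rng.random() < 0.3) for _ in range(d)]
        score = sum(x[j] for j in informative) - 0.9 + rng.gauss(0, 0.3)
        X.append(x)
        y.append(1 if score > 0 else -1)
    return X, y


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def rank_by_mean_difference(X, y, k):
    """Hypothetical filter: keep the k features whose class means differ most."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == -1]

    def gap(j):
        mean_pos = sum(x[j] for x in pos) / (len(pos) or 1)
        mean_neg = sum(x[j] for x in neg) / (len(neg) or 1)
        return abs(mean_pos - mean_neg)

    return sorted(range(len(X[0])), key=gap, reverse=True)[:k]


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * dot(w, X[i])
            w = [wj * (1.0 - eta * lam) for wj in w]  # shrink (regularization)
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def accuracy(w, X, y):
    hits = sum(1 for x, t in zip(X, y) if (1 if dot(w, x) > 0 else -1) == t)
    return hits / len(X)


def evaluate(X_tr, y_tr, X_te, y_te, features):
    """Train on the chosen features (plus a constant bias input) and score held-out data."""
    def project(X):
        return [[x[j] for j in features] + [1.0] for x in X]

    w = train_linear_svm(project(X_tr), y_tr)
    return accuracy(w, project(X_te), y_te)


X, y = make_data()
X_tr, y_tr, X_te, y_te = X[:140], y[:140], X[140:], y[140:]

# Selection uses the training split only, so held-out accuracy is not biased by it.
selectors = {
    "all_features": list(range(len(X[0]))),
    "mean_difference_top3": rank_by_mean_difference(X_tr, y_tr, 3),
    "random_3": random.Random(1).sample(range(len(X[0])), 3),
}
results = {name: evaluate(X_tr, y_tr, X_te, y_te, feats)
           for name, feats in selectors.items()}
for name, acc in results.items():
    print(f"{name}: held-out accuracy = {acc:.2f}")
```

A single train/test split is used here only to keep the sketch short; with small clinical cohorts, a resampling scheme such as cross-validation, with feature selection repeated inside each fold, would usually give a more reliable comparison.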
