Abstract

Breast cancer is one of the most common cancers in women, and some breast cancers are highly curable when diagnosed early. Classifying normal versus tumor breast tissue from microscopy images is an ideal use case for deep learning and could help diagnose breast cancer with higher reproducibility. Because data preprocessing and hyperparameter tuning affect breast cancer classification accuracy, training a classifier with appropriate data preprocessing and optimized hyperparameters could improve that accuracy. Validation accuracy was calculated on the BreAst Cancer Histology (BACH) dataset for 12 combinations of model architecture, data preprocessing, and hyperparameter configuration. DenseNet201, a non-specialized model architecture, achieved 98.61% validation accuracy with a transfer learning approach, compared to only 64.00% for the digital pathology (DP)-specialized model architecture. The combination of image data preprocessing and hyperparameters has a profound impact on the performance of deep neural networks for image classification. To identify a well-performing neural network for classifying tumor versus normal breast histology, researchers should not focus only on developing new models specifically for DP, since hyperparameter tuning of existing deep neural networks from the computer vision field could also achieve better prediction accuracy.
