Jayaraj, Sujitha. Digital pathology with deep learning for diagnosis of breast cancer in low-resource settings. Retrieved from https://doi.org/doi:10.7282/t3-q9vn-tj42
Description: Pathologic assessment of tissue sections is an important part of breast cancer diagnosis, with early diagnosis allowing early treatment and potentially better outcomes. Unfortunately, the shortage of pathologists in low-resource settings lengthens the turnaround time between specimen collection and diagnosis. Digital pathology with machine learning has been proposed as a possible way to reduce these critical delays. This thesis proposes using convolutional neural networks trained on open-source image datasets to predict breast cancer, as a means of providing a preliminary screen of histopathology specimens. Three open-source datasets, namely Breast Cancer Histopathology (BreaKHis), BreAst Cancer Histology (BACH), and Breast Cancer Histopathological Annotation and Diagnosis (BreCaHAD), were used as sources of images for training and testing the neural networks.
Two deep convolutional neural networks (CNNs), namely Inception-v3 and Residual Neural Network-101 (ResNet-101), were trained to classify histopathology images in binary, three-way, and multi-class classification systems. The hyperparameters that optimize each CNN's performance were identified by testing different values for learning rate, number of epochs, and momentum. The effects of pre-processing the input data with techniques including patch extraction and stain normalization were also tested, but these did not improve classification accuracy. Magnification-specific and magnification-independent binary classification into benign and malignant classes was performed using the Inception-v3 and ResNet-101 CNNs and images from BreaKHis. Magnification-independent classification resulted in higher accuracies, with the highest accuracy, 0.846, observed for ResNet-101.
This was followed by binary classification of images from the combined BreaKHis and BACH datasets, where Inception-v3 achieved the highest accuracy at 0.7584. Three-way classification of tissue into normal, benign, and malignant categories used a combination of images from BreaKHis, BACH, and BreCaHAD, along with data augmentation to offset the differences in dataset size. Accuracies of 0.35 and 0.56 were obtained for Inception-v3 and ResNet-101, respectively. Finally, multi-class classification into the eight subtypes of breast cancer was performed using the images from BreaKHis. Overall accuracy was 0.61 for Inception-v3 and 0.52 for ResNet-101.
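
The hyperparameter testing described above (learning rate, number of epochs, and momentum) can be illustrated with a minimal sketch of an exhaustive grid sweep. The abstract does not state which search strategy or which candidate values were used, so the grid values, the function names `grid_search` and `toy_train_and_evaluate`, and the scoring formula below are all hypothetical. In practice the stand-in scoring function would be replaced by a routine that trains a CNN and returns its test accuracy.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple


def grid_search(
    train_and_evaluate: Callable[[float, int, float], float],
    learning_rates: Iterable[float],
    epoch_counts: Iterable[int],
    momenta: Iterable[float],
) -> Tuple[dict, float, List[Tuple[dict, float]]]:
    """Evaluate every combination; return (best config, best accuracy, all results)."""
    results = []
    for lr, epochs, momentum in product(learning_rates, epoch_counts, momenta):
        accuracy = train_and_evaluate(lr, epochs, momentum)
        config = {"learning_rate": lr, "epochs": epochs, "momentum": momentum}
        results.append((config, accuracy))
    # Pick the configuration with the highest reported accuracy.
    best_config, best_accuracy = max(results, key=lambda item: item[1])
    return best_config, best_accuracy, results


def toy_train_and_evaluate(lr: float, epochs: int, momentum: float) -> float:
    # Hypothetical stand-in, NOT a real training run: a made-up score that
    # peaks at lr=0.01, epochs=20, momentum=0.9, used only to exercise the loop.
    return 1.0 - abs(lr - 0.01) * 10 - abs(momentum - 0.9) - abs(epochs - 20) / 100


# Assumed candidate values, chosen for illustration only.
best_config, best_accuracy, results = grid_search(
    toy_train_and_evaluate,
    learning_rates=[0.001, 0.01, 0.1],
    epoch_counts=[10, 20, 30],
    momenta=[0.5, 0.9, 0.99],
)
print(best_config, best_accuracy)
```

A full grid grows multiplicatively with the number of candidate values (here 3 × 3 × 3 = 27 runs), so with networks as large as Inception-v3 or ResNet-101 a coarser grid or a one-parameter-at-a-time sweep may be the more practical choice.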