Artificial Intelligence System Reduces False-Positive Findings in the Interpretation of Breast Ultrasound Exams

Linda Moy, MD, of the Center for Advanced Imaging Innovation and Research (CAI2R) at NYU Langone Health, reported the results of a clinical study that used artificial intelligence (AI) to reduce false positives in breast ultrasound during RSNA 2021. Moy, a leader in radiology AI, is also a professor of radiology at NYU Grossman School of Medicine and a member of the Perlmutter Cancer Center.

Led by researchers from NYU Langone Health’s Department of Radiology and its Laura and Isaac Perlmutter Cancer Center, the team’s AI analysis is believed to be the largest of its kind.

In addition to Moy, who served as study co-investigator, the study was conducted by the following team: Senior Investigator Krzysztof J. Geras, PhD, Co-Lead Investigators Yiqiu “Artie” Shen, Farah Shamout, and Jamie Oliver; and co-investigators Jan Witowski, Kawshik Kannan, Jungkyu Park, Nan Wu, Connor Huddleston, Stacey Wolfson, Alexandra Millet, Robin Ehrenpreis, Divya Awal, Cathy Tyma, Naziya Samreen, Yiming Gao, Chloe Chhor, Stacey Gandhi, Cindy Lee, Sheila Kumari-Subaiya, Cindy Leonard, Reyhan Mohammed, Christopher Moczulski, Jaime Altabet, James Babb, Alana Lewin, Beatriu Reig, and Laura Heacock.

The study, published in the journal Nature Communications (09/24/2021), and supported by the US National Science Foundation (NSF), provided this overview:

Researchers working on an initiative supported by the US National Science Foundation trained AI to detect breast cancer using data from previously performed ultrasound scans. The AI tool greatly improved diagnostic accuracy.

“If our efforts to use machine learning as a triage tool for ultrasound studies prove successful, ultrasound could become a more effective tool in breast cancer screening, particularly as an alternative to mammography and for women with dense breasts,” Moy said. “Its future impact on improving women’s breast health could be profound,” she added. The summary of the study is presented here.

Breast ultrasound images show cancer (as a dark spot in the center on the left and in red on the right, as highlighted by a computer). Image courtesy of Nature Communications

Abstract:

Ultrasound is an important imaging technique used to detect and characterize breast cancer. Although it has been consistently shown to detect mammographically occult cancers, breast ultrasound has been found to have high false-positive rates.

In this work, an AI system is presented that achieves radiologist-level accuracy in detecting breast cancer in ultrasound images.

Developed on 288,767 exams consisting of 5,442,907 B-mode and color Doppler images, the AI achieves an area under the receiver operating characteristic curve (AUROC) of 0.976 on a test set consisting of 44,755 exams. In a retrospective reader study, the AI achieves a higher AUROC than the average of ten board-certified breast radiologists (AUROC: 0.962 AI, 0.924 ± 0.02 radiologists). With the help of AI, radiologists reduce their false-positive rates by 37.3% and reduce requested biopsies by 27.8% while maintaining the same sensitivity. This underscores the potential of AI in improving the accuracy, consistency and efficiency of breast ultrasound diagnostics.
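To make the AUROC figures above concrete, the following minimal sketch computes AUROC as the probability that a randomly chosen malignant exam receives a higher score than a randomly chosen benign one. The function name, labels and scores are invented for illustration and are not the study's data or code.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: 1 for malignant, 0 for benign (toy labels, not study data).
    scores: model outputs, higher meaning more suspicious.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign exam")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: three malignant and four benign exams.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

An AUROC of 1.0 would mean every malignant exam outscores every benign exam, while 0.5 corresponds to chance-level ranking.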

Materials and Methods – Model

• Development of an AI system using a deep convolutional neural network (DCNN) based on a Globally-Aware Multiple Instance Classifier

• Weakly supervised model that automatically identifies malignant and benign lesions without requiring manual annotations from radiologists

• Pathology was used as a reference standard

• Details on data preprocessing, labeling, annotation and ground truth

• Dataset was split at the patient level into training (60%), validation (10%) and test (30%) sets.
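A patient-level split, as described in the last bullet above, assigns every exam from a given patient to the same subset so that no patient appears in both training and test data. The sketch below illustrates the idea under stated assumptions: the function name, the exam-to-patient mapping, the seed and the rounding rule are hypothetical and are not taken from the study.

```python
import random


def split_by_patient(exam_to_patient, fractions=(0.6, 0.1, 0.3), seed=0):
    """Assign exams to train/val/test so that each patient lands in one subset.

    exam_to_patient: dict mapping exam ID -> patient ID (hypothetical IDs).
    fractions: share of patients for train, validation and test.
    """
    patients = sorted(set(exam_to_patient.values()))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(patients)

    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    subset_of = {}
    for i, patient in enumerate(patients):
        if i < n_train:
            subset_of[patient] = "train"
        elif i < n_train + n_val:
            subset_of[patient] = "val"
        else:
            subset_of[patient] = "test"

    splits = {"train": [], "val": [], "test": []}
    for exam, patient in exam_to_patient.items():
        splits[subset_of[patient]].append(exam)
    return splits


# Toy example: 20 patients with two exams each.
toy_exams = {f"exam{p}_{k}": f"patient{p}" for p in range(20) for k in range(2)}
toy_splits = split_by_patient(toy_exams)
print({name: len(exams) for name, exams in toy_splits.items()})
```

Splitting by patient rather than by exam avoids leakage, where images of the same lesion from repeat visits would otherwise appear on both sides of the split.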

NYU Breast Ultrasound Dataset

• The AI system was trained on an internal dataset of 288,767 ultrasound exams with a total of 5,442,907 images acquired between 2012 and 2019 from 143,203 patients

• 20 imaging centers performing screening and diagnostic ultrasound examinations

• 28,914 of these exams involved at least one biopsy procedure

• 5,593 of these had biopsies with malignant findings.

Results

• On a test set of 44,755 exams, the AI system achieved an AUC of 0.976 for exam-level identification of malignancy

• Among the 663 reader study exams, the AI system had an AUC of 0.962, exceeding the average of ten radiologists (0.924 ± 0.02; p<0.001)

• At the radiologists’ average sensitivity (90.1%), the AI system had higher specificity (85.6% vs. 80.7%, p<0.001)

• The AI system recommended fewer biopsies (19.8% vs. 24.3%, p<0.001).

Reader study – hybrid model

• The hybrid models improved the radiologists’ AUC from 0.929 to 0.960

• At the radiologists’ sensitivity levels, the hybrid models offer:

• Increase in radiologist mean specificity from 80.7% to 88.4% (p<0.001)

• Increase in radiologist PPV from 27.1% to 39.2% (p<0.001)

• The hybrid models reduced the mean biopsy rate from 24.3% to 17.2% (p<0.001)

• The reduction in biopsies using the hybrid models accounted for 29.4% of all recommended biopsies.
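The relationship between the absolute and relative changes listed above can be checked with simple arithmetic. The sketch below uses only the rounded mean values quoted in this section; the helper name is hypothetical. Because it works from rounded group means, its output need not match the reported relative figures exactly, which may have been computed differently (for example, per reader before averaging; that is an assumption, not something stated here).

```python
def relative_reduction(before, after):
    """Relative reduction, in percent, when a rate falls from `before` to `after`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return 100.0 * (before - after) / before


# Mean biopsy rate: 24.3% for radiologists alone, 17.2% for the hybrid models.
biopsy_drop = relative_reduction(24.3, 17.2)

# False-positive rate is 100% minus specificity (80.7% -> 88.4%).
fpr_before = 100.0 - 80.7
fpr_after = 100.0 - 88.4
fpr_drop = relative_reduction(fpr_before, fpr_after)

print(f"relative biopsy reduction: {biopsy_drop:.1f}%")
print(f"relative false-positive reduction: {fpr_drop:.1f}%")
```

With these rounded inputs the biopsy reduction comes out at about 29.2% and the false-positive reduction at about 39.9%, in the same range as the 29.4% and 37.3% quoted in the article.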

Conclusion

• The AI system detected and diagnosed cancer on breast ultrasound with an accuracy that surpasses that of experienced board-certified radiologists

• AI decision support reduced unnecessary biopsies

• The hybrid decision-making models can potentially improve the performance of breast imagers without incurring the additional cost of a second human reader

• The system could be used to support decision-making when there is a shortage of radiologists.

The study’s conclusion offered the researchers’ perspective on future clinical applications and the impact of artificial intelligence on efforts to improve breast cancer imaging accuracy. The authors summarized their findings as follows:

“Finally, we examined the potential of AI in the assessment of US exams. We showed in a reader study that deep learning models trained with a large enough amount of data are able to make diagnoses as accurate as experienced radiologists. We have further shown that collaboration between AI and radiologists can significantly improve their specificity and avoid 27.8% of requested biopsies. We believe this research could complement future approaches to breast cancer diagnosis. Furthermore, the general approach of our work, mainly the framework for weakly supervised classification and localization, may allow the use of deep learning in similar medical image analysis tasks.”

Artificial intelligence system for automated triage of breast ultrasound exams

Following is a clinical snapshot of a second study presented by Linda Moy, MD, during the 2021 RSNA session: Breast Imaging: Advanced Breast Ultrasound.

Authors included Jamie Oliver, BA, Beatriu Reig, MD, MPH, Yiming Gao, MD, Alana Lewin, MD, Linda Moy, MD, and Laura Heacock, MD.

Hypothesis: A deep learning (DL) model trained to triage breast ultrasound exams as cancer-free can improve radiologists’ efficiency and specificity without compromising sensitivity.

Purpose: To train an AI system for breast ultrasound exam triage so that radiologists’ time can be reallocated to exams with high suspicion of malignancy.

Materials and Methods – Dataset

• The AI system was trained on an internal dataset of 288,767 ultrasound exams with a total of 5,442,907 images acquired between 2012 and 2019 from 143,203 patients

• 20 imaging centers performing screening and diagnostic ultrasound examinations

• 28,914 of these exams involved at least one biopsy procedure

• 5,593 of these had biopsies with malignant findings

Results

• On a test set of 44,755 exams, the AI system achieved an AUC of 0.96 in identifying exams with malignant lesions

• When the triage system evaluated 3,553 exams originally rated as BI-RADS 3, it classified 60%, 70%, and 80% of the exams with the lowest CI values as benign without missing a malignancy

• For such exams, the AI system could eliminate the need for follow-up imaging

Discussion

• With a high sensitivity threshold, our DL model can function as a standalone system

• Triage of 60-80% of breast ultrasounds from the radiologist’s work list with a false-negative rate of 0.008-0.03%

• Using a high sensitivity threshold, our DL model placed 978 (2.2%) exams in an improved scoring workflow with a high PPV of 69.6%.
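The triage idea discussed above, removing the lowest-scoring share of exams from the radiologist's work list and counting any cancers that would be missed, can be sketched as follows. This is an illustration under stated assumptions, not the study's method: the function name, toy scores and labels are invented, and expressing missed cancers as a share of all exams is an assumed convention, since the article does not define the denominator of its false-negative rate.

```python
def triage_lowest_scores(labels, scores, fraction):
    """Mark the lowest-scoring `fraction` of exams as benign and count misses.

    labels: 1 for malignant, 0 for benign (toy labels).
    scores: model outputs, higher meaning more suspicious.
    Returns the number of triaged exams, the number of malignant exams among
    them, and missed cancers as a share of all exams (assumed convention).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_triaged = round(fraction * len(scores))
    triaged = order[:n_triaged]
    missed = sum(labels[i] for i in triaged)
    return {
        "n_triaged": n_triaged,
        "missed": missed,
        "missed_per_exam": missed / len(scores),
    }


# Hypothetical toy work list of ten exams, two of them malignant.
toy_labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
toy_scores = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.15, 0.12, 0.40, 0.90]
result_60 = triage_lowest_scores(toy_labels, toy_scores, 0.6)
result_80 = triage_lowest_scores(toy_labels, toy_scores, 0.8)
print(result_60)
print(result_80)
```

In this toy list, triaging 60% of exams misses no cancer, while triaging 80% removes one low-scoring malignant exam, which shows the trade-off between workload reduction and false negatives.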

Clinical Relevance

• AI decision support reduced unnecessary biopsies and follow-up visits

• The system could be used to support decision-making when there is a shortage of radiologists.