
AI Risk Stratification for Breast Cancer Screening Still Lacks Operational Alignment

November 3, 2025
Photo 90972316 | Health © Elnur | Dreamstime.com

Mark Hait, Contributing Editor

As clinical AI tools continue to expand their footprint across imaging workflows, new evidence suggests that predictive algorithms may hold meaningful value in identifying women at risk for interval breast cancers. A recent study published in Radiology, the journal of the Radiological Society of North America (RSNA), evaluated the performance of a mammography-based deep learning model, Mirai, against a retrospective dataset of more than 130,000 mammograms from the United Kingdom's triennial screening program. The results point to a potentially significant improvement in how supplemental imaging is targeted.

But while algorithmic stratification of cancer risk offers a compelling path toward more personalized screening intervals, its real-world implementation remains constrained by unresolved clinical, operational, and economic factors. Capacity planning, data governance, and equity must all be addressed before predictive modeling can shift from academic potential to system-wide policy.

Interval Cancers Reflect a Persistent Diagnostic Gap

Interval breast cancers, those detected between regularly scheduled mammograms, account for a disproportionate share of poor clinical outcomes. These tumors are often more aggressive, less responsive to treatment, and more likely to emerge between standard imaging intervals. The study, led by researchers at the University of Cambridge, found that Mirai could identify 42.4 percent of interval cancers by flagging the top 20 percent of women with the highest AI-generated risk scores.
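
The mechanics of that claim are simple to illustrate. The sketch below, a minimal toy example in Python using synthetic NumPy data rather than anything from the study, ranks women by risk score, flags the top 20 percent, and measures what share of interval cancers falls inside the flagged group:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Synthetic data: one risk score per screened woman, plus a label for
# whether she later developed an interval cancer. Higher scores are made
# slightly more likely to precede a cancer, purely for illustration.
scores = rng.random(n)
cancer = rng.random(n) < 0.001 + 0.004 * scores

# Flag the top 20 percent of risk scores, the operating point the
# study evaluated.
threshold = np.quantile(scores, 0.80)
flagged = scores >= threshold

# Capture rate: what fraction of interval cancers the flag catches.
captured = cancer[flagged].sum() / cancer.sum()
print(f"Flagged: {flagged.mean():.0%} of women")
print(f"Interval cancers captured: {captured:.1%}")
```

The reported 42.4 percent figure is exactly this kind of capture rate, computed on real screening outcomes rather than synthetic scores.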

This performance suggests a pathway to increased early detection through targeted supplemental screening, such as contrast-enhanced mammography or MRI. While traditional risk stratification has often relied on breast density or family history, AI models can assess a broader set of image-derived features to calculate near-term cancer risk.

However, identifying elevated risk is only half the equation. Health systems must also be able to respond to that identification with timely, high-quality, and accessible follow-up imaging. That requirement introduces multiple logistical and clinical challenges.

Infrastructure Constraints Undermine Model Utility

According to the study’s authors, applying Mirai’s top-20-percent threshold to the UK’s national screening program would result in approximately 440,000 women requiring follow-up imaging each year. This represents a tenfold increase over the current volume of advanced imaging and would likely exceed available system capacity. Even in large health systems, the workforce and equipment needed to deliver supplemental MRI or contrast-enhanced exams at this scale remain limited.
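
The arithmetic behind those figures is worth making explicit. Working backward from the numbers above, a 20 percent flag rate producing 440,000 follow-ups implies roughly 2.2 million screens per year, and a tenfold increase implies current advanced-imaging volume of roughly 44,000 exams. The snippet below restates that derivation, with every input taken from or implied by the reported figures:

```python
# Back-of-envelope capacity math derived from the reported figures.
flag_rate = 0.20            # top-20% risk threshold
flagged_per_year = 440_000  # follow-up exams implied by that threshold

# Annual screening volume implied by the flag rate.
annual_screens = flagged_per_year / flag_rate    # 2,200,000

# A "tenfold increase" implies current volume is one tenth of demand.
current_capacity = flagged_per_year / 10         # 44,000
shortfall = flagged_per_year - current_capacity  # 396,000

print(f"Implied annual screens:   {annual_screens:,.0f}")
print(f"Implied current capacity: {current_capacity:,.0f}")
print(f"Annual shortfall:         {shortfall:,.0f}")
```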

This mirrors constraints observed in U.S. deployments of AI-assisted diagnostics. A 2023 JAMA Internal Medicine study on AI in breast imaging noted that risk prediction often outpaces operational readiness. In other words, the algorithm can identify who might benefit from additional imaging, but the health system may not be able to provide it in a timely or equitable manner.

There is also the question of economic sustainability. Supplemental imaging, especially when repeated at higher frequency, introduces new costs to payers and patients. Without integrated cost-benefit modeling and reimbursement alignment, large-scale risk-based recalls could increase financial strain without corresponding improvements in outcomes.
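
One way to frame that sustainability question is cost per additional early detection. The sketch below shows the shape of such a calculation; the per-exam cost and detection count are hypothetical placeholders, not figures from the study or any payer:

```python
# Illustrative cost-effectiveness framing for risk-based recall.
# Both inputs below are hypothetical placeholders, not real data.
flagged_per_year = 440_000
cost_per_exam = 400.0                # hypothetical supplemental exam cost
additional_early_detections = 1_000  # hypothetical yield of the program

total_cost = flagged_per_year * cost_per_exam
cost_per_detection = total_cost / additional_early_detections
print(f"Total supplemental imaging cost: ${total_cost:,.0f}")
print(f"Cost per additional early detection: ${cost_per_detection:,.0f}")
```

Whether a number like that clears a payer's threshold is precisely the kind of modeling the article argues must precede large-scale risk-based recalls.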

Ethical and Equity Considerations

Stratification by AI also carries implications for health equity. The Mirai algorithm showed reduced predictive performance in women with extremely dense breast tissue, a population already at higher risk of missed diagnoses under standard screening protocols. Without adjustment, models that underperform for specific subgroups risk reinforcing existing disparities.
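
Surfacing that kind of subgroup gap is a routine validation step, not a research novelty. Extending the earlier toy example, the sketch below stratifies the same capture-rate metric by a hypothetical breast-density category so that underperformance in any group is visible rather than averaged away:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Synthetic scores, outcomes, and a BI-RADS-style density category
# (A through D); all values are illustrative, not study data.
scores = rng.random(n)
cancer = rng.random(n) < 0.001 + 0.004 * scores
density = rng.choice(["A", "B", "C", "D"], size=n)

threshold = np.quantile(scores, 0.80)
flagged = scores >= threshold

# Capture rate per density subgroup: the share of each group's
# interval cancers that the top-20% flag actually caught.
for group in ["A", "B", "C", "D"]:
    mask = density == group
    total = int(cancer[mask].sum())
    caught = int((cancer & flagged & mask).sum())
    print(f"Density {group}: {caught / total:.1%} of {total} cancers captured")
```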

Current algorithms are also trained on retrospective data that may reflect institutional or regional biases. As adoption scales, institutions must ensure that predictive models are externally validated across diverse populations. The National Institutes of Health has emphasized the importance of representative datasets in medical AI to avoid systemic underperformance for racial and ethnic minorities.

Moreover, informed consent practices have not yet caught up to the use of algorithmically derived risk scores. Patients flagged for follow-up may not understand how their risk was calculated or how the tool performed in similar populations. This adds another layer of complexity to already sensitive clinical communications.

Moving From Model Accuracy to System Readiness

While the study’s retrospective design limits its immediate clinical application, it does underscore a pressing need for AI governance in population health settings. As tools like Mirai continue to demonstrate predictive accuracy, system leaders must begin addressing the practical conditions required for responsible deployment.

These include defining thresholds for supplemental imaging based on capacity, integrating AI outputs into existing clinical pathways, and building transparency into how predictions are generated and used. Organizations such as the American College of Radiology (ACR) and the Office of the National Coordinator for Health Information Technology (ONC) have called for clearer guidance on clinical decision support integration, including how to handle AI-generated risk scores within EHR workflows.
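
What integrating an AI output into a clinical pathway might look like at the code level is sketched below: a small routing function that turns a raw risk score into a structured, auditable recommendation. Every field name, threshold, and recommendation category here is an assumption for illustration, not part of any ACR or ONC specification:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class RiskRecommendation:
    """A structured, auditable record an EHR integration could store."""
    patient_id: str
    risk_score: float
    model_version: str     # provenance: which model produced the score
    threshold_used: float  # provenance: the operating point applied
    recommendation: str
    generated_at: str

def route_risk_score(patient_id: str, risk_score: float,
                     threshold: float = 0.85,
                     model_version: str = "mirai-demo-0.1") -> RiskRecommendation:
    # In practice the threshold would be set from system capacity,
    # not hard-coded; it is a placeholder here.
    action = ("refer-for-supplemental-imaging" if risk_score >= threshold
              else "routine-screening-interval")
    return RiskRecommendation(
        patient_id=patient_id,
        risk_score=risk_score,
        model_version=model_version,
        threshold_used=threshold,
        recommendation=action,
        generated_at=datetime.now(timezone.utc).isoformat(),
    )

print(asdict(route_risk_score("patient-0001", 0.91)))
```

Recording the model version and operating point alongside each recommendation is one concrete way to build the transparency described above into the workflow itself.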

Until such frameworks are standardized, there is a risk that model results will either be underused, because of clinician skepticism or alert fatigue, or over-relied upon without a full understanding of their limitations.

The Future of Screening Is Not Prediction Alone

The promise of AI in screening is not about prediction alone. It is about conditional action. Stratification tools must be embedded within systems that can respond dynamically, where flagged patients are contacted promptly, imaging is available without long delays, and follow-up care is coordinated across specialties.

This will require more than technical validation. It will demand investment in scheduling systems, care navigation infrastructure, and digital literacy among both providers and patients. The question is not whether AI can identify risk. It is whether health systems are prepared to act on that information in ways that are safe, equitable, and sustainable.

The Radiology study is a strong signal that predictive modeling in breast cancer screening is moving from theory to operational testing. But the infrastructure to deliver on its potential still lags. Without structural reform to match algorithmic progress, AI in screening risks becoming another source of clinical insight that systems are not yet prepared to use effectively.