
AI Trial Screening Shifts the Bottlenecks

December 15, 2025

Mark Hait, Contributing Editor

The announcement that Mass General Brigham has spun out AIwithCare to commercialize a retrieval-augmented screening platform signals something more consequential than another hospital-built analytics tool. Clinical trial enrollment remains one of the most stubborn constraints on biomedical progress, and it has stayed stubborn despite decades of site networks, CRO scale, and patient-facing marketing. When a health system treats prescreening as software infrastructure instead of coordinator labor, the bottleneck does not disappear. It relocates into governance, data quality, and accountability.

That shift matters because enrollment is not simply an operational headache. Recruitment shortfalls can compromise statistical power, extend timelines, increase sponsor spend, and delay patient access to therapies that only exist inside protocols. In the literature, under-recruitment and retention problems are often described as pervasive, with some analyses reporting that a substantial share of trials miss their recruitment targets or timelines across therapeutic areas. (PMC)

The recruitment problem that refuses to scale

The industry’s default response to enrollment friction has been more staffing, more sites, and more outreach. Those levers can help, but they treat the symptoms while leaving the underlying mechanics intact. Eligibility criteria keep getting more complex, study teams keep relying on fragmented documentation, and research coordinators still spend long hours performing manual chart review to find the right patient at the right time.

Against that backdrop, the most valuable promise of AI-assisted screening is not speed for its own sake. It is the possibility of converting unstructured clinical reality into something that behaves like an operational signal. That conversion has financial implications because coordinator time is not only scarce, it is expensive, and it scales poorly when protocols proliferate across service lines.

RAG changes the economics of unstructured eligibility

Tools built on retrieval-augmented generation attempt to ground a model’s output in the actual source record, especially notes, reports, and narrative fields that rarely fit into discrete database queries. In a paper in NEJM AI, researchers describing the RECTIFIER approach reported performance advantages for trial prescreening in a heart failure context, a setting where eligibility often depends on nuance buried in documentation. (NEJM AI)
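
To make the mechanics concrete, here is a minimal sketch of the RAG pattern described above, not the RECTIFIER pipeline itself: a toy lexical retriever pulls the note passages most relevant to one criterion, and a stubbed model call is asked to answer from those excerpts only. Every name, threshold, and the `call_llm` placeholder are illustrative assumptions.

```python
# Minimal RAG-style prescreening sketch: retrieve the note passages most
# relevant to one eligibility criterion, then ask a model to answer using
# only those passages. Retrieval here is a toy bag-of-words scorer and
# call_llm is a stub; this is not the published RECTIFIER pipeline.

from collections import Counter

def score(passage: str, query: str) -> float:
    """Crude lexical overlap between a note passage and the criterion text."""
    p, q = Counter(passage.lower().split()), Counter(query.lower().split())
    return sum((p & q).values()) / (1 + len(q))

def retrieve(notes: list[str], criterion: str, k: int = 3) -> list[str]:
    """Return the k passages most relevant to the criterion."""
    return sorted(notes, key=lambda n: score(n, criterion), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    """Stand-in for a grounded model call; replace with a real client."""
    return "UNCERTAIN"  # a real deployment returns YES/NO plus cited evidence

def prescreen(notes: list[str], criterion: str) -> dict:
    evidence = retrieve(notes, criterion)
    prompt = (
        "Answer YES, NO, or UNCERTAIN using ONLY the excerpts below.\n"
        f"Criterion: {criterion}\n"
        + "\n".join(f"[{i}] {e}" for i, e in enumerate(evidence))
    )
    return {"criterion": criterion, "evidence": evidence, "answer": call_llm(prompt)}

if __name__ == "__main__":
    notes = [
        "Echo 03/2024: LVEF 35%, NYHA class III symptoms on exertion.",
        "Medication list reviewed; patient tolerating beta blocker.",
        "Dermatology follow-up for eczema, no new concerns.",
    ]
    print(prescreen(notes, "LVEF less than or equal to 40 percent"))
```

The design point is the prompt construction: the model is constrained to the retrieved excerpts, which is what makes the output auditable against the source record.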

A subsequent randomized clinical trial published in JAMA compared manual prescreening with AI-assisted screening inside the Mass General Brigham system and reported higher efficiency and improved enrollment-related outcomes in the AI-assisted arm, with the trial running from late May through late September 2024 and involving nearly 4,500 patients. (JAMA Network) Those specifics matter because they frame the technology as something tested inside real workflows, not merely benchmarked on retrospective datasets.

The operational lesson is straightforward. If the AI layer can reliably surface likely-eligible candidates faster than manual review, the cost curve of prescreening changes. But the strategic lesson is more complicated. Once prescreening becomes software, the new limiting factors become documentation standards, auditability, and the ability to demonstrate that the system’s outputs are trustworthy enough to drive real recruitment decisions.
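
A back-of-envelope model makes the cost-curve point tangible. All figures below are illustrative assumptions, not results from the JAMA trial: manual review touches every chart, while AI-assisted triage confines coordinator time to flagged candidates plus a platform fee.

```python
# Illustrative prescreening cost comparison (every number is an assumption):
# manual review reads every chart; AI triage reads only flagged charts.

def manual_cost(charts: int, min_per_chart: float, hourly: float) -> float:
    return charts * min_per_chart / 60 * hourly

def assisted_cost(charts: int, flag_rate: float, min_per_flag: float,
                  hourly: float, platform_fee: float) -> float:
    return charts * flag_rate * min_per_flag / 60 * hourly + platform_fee

charts, hourly = 4500, 40.0          # screened patients, coordinator $/hour
print(f"manual:   ${manual_cost(charts, 15, hourly):,.0f}")     # ~$45,000
print(f"assisted: ${assisted_cost(charts, 0.2, 10, hourly, 5000):,.0f}")  # ~$11,000
```

The takeaway is structural rather than the specific dollar figures: coordinator cost shifts from scaling with charts screened to scaling with candidates flagged.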

New failure modes arrive with deployment

Moving from an internal tool to a commercial platform introduces risks that are less visible in a single health system pilot. Model behavior can drift as note templates change, as clinicians adopt new documentation shortcuts, or as coding practices shift under payer pressure. Trial criteria also evolve, sometimes midstream, and an eligibility engine that is not tightly versioned against protocol amendments can quietly degrade without anyone noticing until enrollment or screen-failure metrics worsen.
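
One way to guard against that silent degradation is to pin every screening result to a fingerprint of the exact criteria text the model saw, as in the sketch below. The naming and hashing scheme are assumptions for illustration, not AIwithCare's design.

```python
# Sketch of version-pinning eligibility logic to the protocol text it was
# built from: a result that does not match the current amendment's
# fingerprint is flagged for re-review rather than silently trusted.

import hashlib
from dataclasses import dataclass

def fingerprint(criteria_text: str) -> str:
    return hashlib.sha256(criteria_text.encode()).hexdigest()[:12]

@dataclass
class ScreeningResult:
    patient_id: str
    eligible: bool
    criteria_version: str  # fingerprint of the criteria the model saw

def stale(result: ScreeningResult, current_criteria: str) -> bool:
    return result.criteria_version != fingerprint(current_criteria)

v1 = "Inclusion: LVEF <= 40%. Exclusion: eGFR < 30."
v2 = "Inclusion: LVEF <= 35%. Exclusion: eGFR < 30."  # mid-study amendment

r = ScreeningResult("pt-001", eligible=True, criteria_version=fingerprint(v1))
print("needs re-screen after amendment:", stale(r, v2))  # True
```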

Regulatory and compliance expectations already signal where scrutiny tends to land. The FDA has issued guidance on the use of electronic health record data in FDA-regulated clinical investigations, emphasizing reliability, provenance, and the need to understand how data are captured and modified across systems. (U.S. Food and Drug Administration) Even when an AI tool is not itself a regulated medical device, its outputs can shape decisions that feed regulated research processes. That reality pulls information security, validation, and documentation disciplines into what used to be treated as a research operations problem.
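
In practice, that discipline tends to look like a provenance record attached to every screening output. The field set below is an assumption meant to illustrate the principle the FDA guidance points toward: each decision should link back to the source documents, model version, and criteria version that produced it.

```python
# Sketch of a provenance record a prescreening decision might carry so it
# can survive an audit. The field names are illustrative assumptions.

import json, hashlib
from datetime import datetime, timezone

def audit_record(patient_id: str, decision: str, note_ids: list[str],
                 model_version: str, criteria_version: str) -> str:
    entry = {
        "patient_id": patient_id,
        "decision": decision,
        "source_note_ids": note_ids,           # provenance: which documents
        "model_version": model_version,        # which model produced it
        "criteria_version": criteria_version,  # which protocol text it used
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    entry["checksum"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    return json.dumps(entry)

print(audit_record("pt-001", "likely-eligible", ["note-88", "note-91"],
                   "screener-2.3.1", "a1b2c3d4e5f6"))
```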

Privacy governance becomes part of the product

Clinical trial prescreening sits at an awkward intersection of care delivery, research, and population health operations. The data involved are often highly sensitive, and prescreening can involve contacting patients who did not initiate research interest. Under HIPAA, disclosures for research can occur under specific pathways, including authorizations or waivers approved by an IRB or privacy board, but satisfying those pathways is not a trivial checkbox in a multi-site deployment model. The HHS Office for Civil Rights lays out the conditions under which protected health information may be used or disclosed for research, including documentation requirements for waivers. (HHS)

For a platform intended to scale across hospitals and clinics, privacy posture cannot be treated as a local policy nuance. It becomes a core feature. Access controls, logging, data minimization, and clear lines of responsibility for patient outreach all influence whether the technology reduces burden or creates new friction with compliance teams and institutional review boards.
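
Data minimization, in particular, can be enforced mechanically before any text leaves the EHR boundary. The sketch below is deliberately simplistic, a few illustrative regexes rather than a real de-identification pass, but it shows the shape of the control: strip direct identifiers the eligibility question never needs.

```python
# Illustrative data-minimization pass applied to note text before it reaches
# a model. These patterns are nowhere near a complete de-identification
# pipeline; they only demonstrate the control point.

import re

REDACTIONS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE), "[MRN]"),
]

def minimize(note: str) -> str:
    for pattern, token in REDACTIONS:
        note = pattern.sub(token, note)
    return note

print(minimize("MRN: 4471829. Call 617-555-0123 re: LVEF 35% on echo."))
# -> "[MRN]. Call [PHONE] re: LVEF 35% on echo."
```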

Equity can improve, but it can also quietly regress

AI-assisted screening is sometimes described as a pathway to fairer recruitment because it can reduce human inconsistency and widen the funnel beyond the patients who happen to be seen by research-aware clinicians. That benefit is plausible, especially when the screening logic explicitly searches for underrepresented populations aligned to disease prevalence. Yet equity gains are not automatic. If documentation quality varies by site, specialty, or patient population, the AI will inherit those differences. If the model relies heavily on text that is more complete for patients with greater continuity of care, it can reinforce existing access gaps.

The policy context is tightening. The FDA has advanced expectations around diversity planning, including draft guidance describing Diversity Action Plans intended to improve enrollment of underrepresented populations in clinical studies. (U.S. Food and Drug Administration) Under that kind of regime, “no measured bias detected” is not the end of the story. Ongoing monitoring, transparent methods, and a documented approach to correcting disparities become part of defensible trial operations.
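
What that monitoring might look like operationally: a stratified comparison of AI flag rates across sites or subgroups, with an alert when the spread exceeds a threshold. The grouping and the 1.5x ratio below are illustrative assumptions, not a validated fairness criterion.

```python
# Sketch of stratified flag-rate monitoring so an equity regression surfaces
# in operations dashboards rather than a post-hoc analysis. Thresholds and
# groupings are illustrative assumptions.

from collections import defaultdict

def flag_rates(screens: list[dict]) -> dict[str, float]:
    """screens: [{'group': ..., 'flagged': bool}, ...] -> flag rate per group."""
    totals, flagged = defaultdict(int), defaultdict(int)
    for s in screens:
        totals[s["group"]] += 1
        flagged[s["group"]] += s["flagged"]
    return {g: flagged[g] / totals[g] for g in totals}

def disparity_alert(rates: dict[str, float], max_ratio: float = 1.5) -> bool:
    hi, lo = max(rates.values()), min(rates.values())
    return lo > 0 and hi / lo > max_ratio

screens = (
    [{"group": "site_A", "flagged": True}] * 30
    + [{"group": "site_A", "flagged": False}] * 70
    + [{"group": "site_B", "flagged": True}] * 12
    + [{"group": "site_B", "flagged": False}] * 88
)
rates = flag_rates(screens)
print(rates, "alert:", disparity_alert(rates))  # 0.30 vs 0.12 -> alert True
```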

Commercialization forces an organizational reckoning

Health systems have long built tools that work locally and fail at export. The decision to spin out a company is an admission that scaling requires dedicated product discipline, security posture, customer support, and a business model that can survive procurement scrutiny. That is why the involvement of Mass General Brigham Innovation is not a footnote. The moment commercialization begins, buyers start asking about integration burden, liability, model updates, and who carries the risk when the tool misses a candidate or incorrectly flags one.

There is also an operational identity question embedded in this shift. Research teams often want autonomy and speed, while enterprise IT wants standardization and control. A screening platform that touches the EHR, research registries, and outreach workflows will force those groups into a shared operating model. That can be healthy, but it requires governance that many organizations have not built.

What determines whether this works at scale

Success will not be defined by a single accuracy metric or a single published trial. The decisive metrics will be operational and longitudinal: sustained reductions in coordinator screening time, stable or improved eligibility yield over time, lower screen-failure rates that do not come at the expense of diversity goals, and patient contact processes that respect privacy and consent expectations. Financially, the question is whether the platform reduces the marginal cost of enrolling each additional eligible participant, particularly in complex studies where unstructured criteria dominate.
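
As a sketch of how those longitudinal metrics might be computed from per-patient screening logs; the field names and formulas below are assumptions for illustration, not a standard reporting schema.

```python
# Illustrative monthly rollup of the operational metrics named above:
# eligibility yield, screen-failure rate, coordinator minutes per enrollment.

def monthly_metrics(rows: list[dict]) -> dict:
    screened = len(rows)
    flagged = [r for r in rows if r["flagged"]]
    consented = [r for r in rows if r["consented"]]
    enrolled = [r for r in rows if r["enrolled"]]
    minutes = sum(r["coordinator_min"] for r in rows)
    return {
        "eligibility_yield": len(flagged) / screened,
        "screen_failure_rate": 1 - len(enrolled) / max(len(consented), 1),
        "coord_min_per_enrollment": minutes / max(len(enrolled), 1),
    }

rows = [
    {"flagged": True,  "consented": True,  "enrolled": True,  "coordinator_min": 25},
    {"flagged": True,  "consented": True,  "enrolled": False, "coordinator_min": 40},
    {"flagged": False, "consented": False, "enrolled": False, "coordinator_min": 5},
]
print(monthly_metrics(rows))
```

Tracked month over month, and stratified the same way as the equity metrics above, numbers like these are what would distinguish durable infrastructure from a promising pilot.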

If that bar is met, AI-assisted screening can become infrastructure, not novelty. The most important outcome would be a research enterprise that spends less time searching for patients and more time supporting them through protocols. The harder truth is that the technology only accelerates what the surrounding governance can safely absorb, and governance is now the real scaling constraint.