Agentic AI Will Replace Undocumented Science
![Image](/wp-content/uploads/c48b8eab-7e76-491c-9636-ed91699eedbd.jpg)

The question raised at Cedars-Sinai, whether artificial intelligence will replace human scientists, is too narrow. The more important issue is whether agentic AI will make traditional research governance obsolete before institutions are ready to replace it with something stronger. In a recent Cedars-Sinai discussion, Jason Moore, PhD, chair of computational biomedicine, described agentic AI as a form of in silico team science capable of coordinating specialized AI agents to support biomedical research workflows. The concept is also examined in a Nature Biotechnology paper on agentic AI and in silico team science in biomedical research, which frames these systems as computational collaborators that can assist with literature review, hypothesis generation, data analysis, interpretation, software engineering, and scientific writing.
That does not mean human scientists are about to disappear from the laboratory. It means the boundaries of scientific labor are becoming harder to see. A graduate student may once have performed a literature review, written code, selected a statistical method, interpreted a dataset, and drafted a manuscript section. In an agentic workflow, those tasks may be distributed across multiple AI systems that operate quickly, iteratively, and partly outside the ordinary visibility of principal investigators, reviewers, sponsors, and institutional oversight bodies.
The replacement risk, therefore, is not primarily about headcount. It is about traceability. Agentic AI will not simply replace scientists. It will replace undocumented science with automated decisions that either become auditable or become dangerous.
Efficiency Is Not the Same as Discovery
The most compelling case for agentic AI is productivity. Biomedical research is constrained by time, labor, cost, data complexity, and administrative burden. A system that can rapidly test code, structure datasets, compare literature, generate analytic plans, and surface patterns across high-dimensional data could help laboratories pursue more questions with fewer bottlenecks. In an era of tight funding, rising labor costs, and pressure to translate discoveries faster, that efficiency has obvious appeal.
But efficiency should not be confused with scientific discovery. Science is not only the production of outputs. It is the disciplined narrowing of uncertainty through methods that others can inspect, challenge, reproduce, and refine. A faster workflow is valuable only if the underlying reasoning remains visible. Without that visibility, agentic AI risks creating research that appears more productive while becoming less accountable.
This distinction matters because biomedical research often feeds directly into clinical decision-making, drug development, diagnostics, medical devices, population health strategies, and reimbursement policy. A flawed analysis in another sector may produce a bad forecast. A flawed biomedical analysis can shape patient selection, safety assumptions, trial design, clinical guidelines, or investment decisions. Speed has to be earned through stronger controls, not accepted as its own proof of progress.
The New Labor Model Needs a Chain of Custody
Agentic AI introduces a governance problem that resembles chain of custody in clinical evidence. When multiple AI agents contribute to a research workflow, institutions need to know which agent performed which task, what data it accessed, what instructions it received, what assumptions it made, what intermediate outputs it generated, and where a human accepted, modified, or rejected its work.
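What that chain of custody could look like in practice is not mysterious. Below is a minimal sketch, in Python, of an append-only custody log in which every agent action commits to the record before it. The field names (agent_id, task, human_decision) and both classes are illustrative assumptions, not an established standard.

```python
# A minimal sketch of a chain-of-custody record for agentic research
# workflows. All names here are illustrative, not a standard schema.
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

@dataclass
class AgentActionRecord:
    agent_id: str            # which agent performed the task
    task: str                # e.g. "literature_review", "code_generation"
    datasets: list[str]      # data the agent accessed
    instructions: str        # the prompt or directive it received
    output_ref: str          # pointer to the intermediate output artifact
    human_decision: str      # "accepted", "modified", or "rejected"
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    prev_hash: str = ""      # links each record to the one before it

    def digest(self) -> str:
        # Hash the full record so later tampering is detectable.
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

class CustodyLog:
    """Append-only log: each record commits to its predecessor's hash."""
    def __init__(self) -> None:
        self.records: list[AgentActionRecord] = []

    def append(self, record: AgentActionRecord) -> None:
        record.prev_hash = self.records[-1].digest() if self.records else ""
        self.records.append(record)
```

The hash chain is the point of the design: a custody log that can be silently rewritten after the fact is not a custody log.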
This is not a clerical concern. It goes directly to reproducibility and scientific integrity. If a model proposes a hypothesis, another writes code, a third interprets results, and a fourth drafts a manuscript, the final publication may conceal the actual path of reasoning. Traditional authorship statements and methods sections were not designed for teams that include autonomous computational actors. Journals, universities, sponsors, and health systems will need more explicit disclosure standards for AI-assisted research.
Through its policy work on artificial intelligence in research, the National Institutes of Health has already recognized that AI in biomedical research requires attention to responsible use, AI-ready datasets, transparency, privacy, and equity. That framing is essential, but agentic AI adds a further question: whether institutions can document enough of the AI workflow to make research findings credible when no single human performed every analytic step.
The right standard should not be whether AI was used. That question will soon be too broad to be useful. The better standard is whether the institution can reconstruct the scientific decision pathway.
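Continuing the hypothetical custody log sketched above, reconstructing that pathway becomes a mechanical operation: replay the records, verify the chain is intact, and surface where humans intervened.

```python
# Continues the CustodyLog sketch above. Reconstruct the decision pathway
# by replaying the log and verifying no record was altered or removed.
def reconstruct_pathway(log: CustodyLog) -> list[str]:
    pathway: list[str] = []
    expected_prev = ""
    for rec in log.records:
        if rec.prev_hash != expected_prev:
            raise ValueError(f"Custody break before '{rec.task}' by {rec.agent_id}")
        pathway.append(
            f"{rec.timestamp}: {rec.agent_id} performed {rec.task}; "
            f"human {rec.human_decision}"
        )
        expected_prev = rec.digest()
    return pathway
```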
Regulation Will Follow the Evidence Trail
Agentic AI also changes the regulatory stakes. The U.S. Food and Drug Administration has issued guidance on using artificial intelligence to support regulatory decision-making for drug and biological products, emphasizing a risk-based credibility assessment framework for AI models used to generate information supporting safety, effectiveness, or quality decisions. The FDA has also addressed AI-enabled device software functions through lifecycle management and marketing submission recommendations, including documentation that supports evaluation of safety and effectiveness.
Those regulatory developments point to a broader reality. AI-generated research will not be judged only by whether it is novel. It will be judged by whether its outputs are credible in a defined context of use. For laboratories working near translational medicine, that context may eventually include regulatory submissions, clinical trial protocols, biomarker strategies, diagnostic validation, or real-world evidence studies.
This creates financial and operational implications for research institutions. Laboratories that use agentic AI without documentation standards may produce faster papers but weaker evidence packages. Sponsors and health systems may later discover that AI-assisted findings cannot support regulatory claims, payer discussions, or clinical adoption because the model inputs, analytic choices, validation methods, or human review steps were not adequately preserved.
Human Oversight Must Be Specific
Calls for “human in the loop” oversight are no longer sufficient. In agentic AI, the human role must be defined with precision. A scientist who merely approves an AI-generated output after the fact is not providing the same oversight as a scientist who defines the research question, reviews the data provenance, evaluates the analytic plan, stress-tests the assumptions, and interprets results against biological plausibility.
The National Institute of Standards and Technology describes its AI Risk Management Framework as a tool for managing risks to individuals, organizations, and society associated with AI systems. In biomedical science, that risk management lens should be applied at the level of the research workflow, not only the AI model. Institutions need governance for when AI can generate hypotheses, when it can select methods, when it can access identifiable or controlled datasets, when it can write code, when it can draft interpretations, and when human review must occur before the next step begins.
The weakest governance model is passive review. The strongest model assigns human responsibility at critical decision points: study design, data selection, model selection, statistical inference, interpretation, subject protection, and publication. That structure protects both patients and researchers. It also prevents accountability from dissolving into the machinery of automation.
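As a sketch of what assigning responsibility at those decision points could mean operationally, assume a hypothetical workflow engine with a single run_step entry point; the step names below mirror the list above, while the function and its signature are illustrative.

```python
# Illustrative sketch: an agent workflow cannot pass a critical step
# until a named human has signed off. CRITICAL_STEPS mirrors the
# decision points named in the text; run_step is hypothetical.
CRITICAL_STEPS = {
    "study_design", "data_selection", "model_selection",
    "statistical_inference", "interpretation",
    "subject_protection", "publication",
}

def run_step(step: str, agent_output: dict, reviewer: str | None = None) -> dict:
    """Refuse to pass a critical step without a named human sign-off."""
    if step in CRITICAL_STEPS:
        if reviewer is None:
            raise PermissionError(
                f"'{step}' requires sign-off by a named scientist "
                "before the next agent task begins"
            )
        agent_output["approved_by"] = reviewer  # accountability assigned, not implied
    return agent_output
```

The design choice is that progression, not review, is what the human controls: an agent cannot start the next task until a named scientist has signed the current one.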
Human Subjects Protection Cannot Be Automated Away
Agentic AI also complicates human subjects research. AI systems trained or deployed on clinical, genomic, imaging, or behavioral data can create privacy, consent, and secondary-use risks that are not always obvious at the beginning of a project. The Office for Human Research Protections has addressed institutional review board considerations for AI in human subjects research, including risks to individuals affected by AI research even when they may not meet traditional definitions of human subjects.
That issue becomes more serious when AI agents perform tasks across a research pipeline. An agent may recommend linking datasets, deriving phenotypes, extracting variables, generating synthetic data, or testing associations that were not described clearly in the original protocol. Without explicit boundaries, agentic systems could push research activity into areas that require additional review, consent analysis, privacy safeguards, or data-use limitations.
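Enforcing those boundaries can be as blunt as refusing to execute out-of-protocol actions. In the sketch below, the action names and the scope check are illustrative assumptions rather than any regulatory requirement.

```python
# Illustrative assumption: each proposed agent action is checked against
# the approved protocol scope before it is allowed to execute.
REQUIRES_REVIEW = {
    "link_datasets",           # potential re-identification risk
    "derive_phenotypes",
    "generate_synthetic_data",
    "test_new_association",    # hypothesis not in the original protocol
}

def authorize(action: str, protocol_scope: set[str]) -> bool:
    """Permit in-scope actions; escalate sensitive out-of-scope ones."""
    if action in protocol_scope:
        return True
    if action in REQUIRES_REVIEW:
        # Out of scope and sensitive: route to IRB/privacy review, do not run.
        raise RuntimeError(f"'{action}' exceeds protocol scope; additional review required")
    return False  # out of scope but not sensitive: decline quietly
```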
This is where institutional review boards, privacy officers, data governance committees, and research leadership need a shared operating model. AI governance cannot sit only with computational experts. The affected interests include patients, research participants, clinicians, sponsors, regulators, and communities whose data may be underrepresented or misinterpreted.
The Scientist Becomes More Accountable
The future of agentic AI in biomedical science should not be framed as a contest between humans and machines. The more likely outcome is a redistribution of work. AI systems will absorb more routine programming, screening, summarization, data transformation, and exploratory analysis. Human scientists will remain responsible for judgment, biological plausibility, ethical boundaries, methodological rigor, team leadership, and institutional accountability.
That shift may make scientists more valuable, not less. It will also make weak science harder to excuse. If agentic AI can accelerate analysis and documentation, then sloppiness, unverifiable methods, unsupported claims, and careless citation practices become less defensible. A 2026 Nature analysis warning that AI-generated invalid references may be entering the scientific literature illustrates the risk of automation without verification. The problem is not that AI can hallucinate. The problem is that humans may allow hallucinated outputs to pass into the scientific record.
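Some of that verification can itself be automated. The sketch below checks that each cited DOI resolves in the public Crossref registry, which catches fabricated identifiers but not misattributed or irrelevant citations; the example DOIs are illustrative, and human review remains the final gate.

```python
# Minimal sketch of automated reference checking before submission:
# verify each cited DOI against the public Crossref registry.
import requests

def doi_exists(doi: str) -> bool:
    """Return True if the Crossref registry has a record for this DOI."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

# Illustrative references: the first DOI is real, the second is fabricated.
for doi in ["10.1038/s41586-020-2649-2", "10.0000/fabricated-reference"]:
    if not doi_exists(doi):
        print(f"Unverifiable reference, hold for human review: {doi}")
```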
Biomedical institutions should treat agentic AI as a research infrastructure issue. That means audit trails, model-use policies, disclosure norms, validation requirements, cybersecurity controls, data-use restrictions, IRB coordination, authorship standards, and post-publication accountability. It also means training scientists to supervise AI systems with the same seriousness used to supervise personnel, protocols, and patient-facing research.
Agentic AI may allow one scientist to produce the work of many. That will not eliminate the need for human scientists. It will raise the standard for what human scientific responsibility means. The laboratories that benefit most will not be those that automate the most tasks. They will be those that can prove where automation helped, where human judgment intervened, and why the final evidence deserves trust.