Cara
Agentic AI · May 2026 · 8 min read

The agentic year in healthcare operations: what is actually shipping in 2026.

Pilots are everywhere. Production is rare. The buyer's question in 2026 is not which agent vendor to pick, but which governance model, which evaluation harness, and which HITL boundary fits the workflow.


Executive read

  • The category is splitting in two. Ambient AI (scribes, summarization) and agentic AI (multi-step workflow execution) use the same model families and have almost nothing else in common. Conflating them is the most common buyer mistake.
  • Agentic AI now has production case evidence in prior auth, payer phone calls, patient navigation, front-desk intake, and revenue cycle. The skeptical stat is just as real: KLAS found only one in roughly 3,000 healthcare organizations actually running agentic AI in production, while 61% of executives say they have budget secured.
  • The right buyer questions in 2026 are about governance, omission rates, and the HITL boundary per workflow. Hallucination is the wrong question.

The category is splitting in two.

Healthcare AI used to mean an LLM in a chat window. In 2026 it is splitting into two distinct things that share almost nothing besides the model family underneath. Ambient AI listens, summarizes, and writes: scribes, listening systems, documentation assistants. Agentic AI acts: it calls payer portals, navigates EHRs, makes outbound phone calls, runs workflows across systems, and completes multi-step tasks against external state.

Conflating the two is the most common buyer mistake we see. A vendor pitches an agent, a buyer evaluates it like a copilot, and the eval misses what matters most: whether the agent can complete a bounded task end to end against the customer's real systems.

Where agents are actually earning their keep.

Prior authorization. SamaCare crossed one million agentic prior authorizations in early 2026. Optum reports 45% fewer manual touches and a 96% first-pass approval rate on its AI-powered digital PA. PrescriberPoint reports a 94.5% clinician acceptance rate on AI-generated PA submissions.

Payer phone calls and benefit verification. Infinitus has run hundreds of millions of minutes of AI-driven outbound payer calls. The category is mature enough that Infinitus shipped Studio in April 2026, a no-code agent builder for healthcare-specific voice and workflow agents.

Patient navigation and post-discharge follow-up. Memora Health (now part of Commure) and Hippocratic AI between them report tens of millions of patient interactions. Hippocratic publishes case evidence of staffing-style deployments, with multi-million patient touchpoints per customer.

Front-desk intake. Sully.ai and similar vendors are shipping AI employees that handle initial scheduling, eligibility, and intake conversations. The bar is no longer whether the agent can hold a coherent conversation. It is whether the conversation ends with a calendar slot, a completed intake form, and a routed referral.

Revenue cycle. HFMA's February 2026 survey of 95 finance leaders showed 27% have AI at scale and 53% are piloting. McKinsey projects 30 to 60% reduction in cost-to-collect from agentic revenue cycle. Denials, eligibility, and underpayment recovery are the obvious near-term wedges.

The honest skeptical stat.

KLAS interviewed roughly 3,000 healthcare organizations in 2025 and found that only one was running agentic AI in production. In the same period, 61% of executives said they had budget secured. That gap between expectation and deployment is the single most important number in this category.

The takeaway is not that agentic AI is fake. It is that 2026 is the year the gap closes for some workflows and stays open for others. The buyer's job is to know which is which before signing a multi-year contract.

A decision rule: copilot, ambient, or agent.

Three questions. Does the work require multi-step tool use against external systems like portals, phones, or EHRs? Does latency matter less than completeness? Is the failure mode bounded and reversible?

Three yeses means the workflow is an agent. Two means it is a copilot with a human in the loop. Zero or one means it is ambient.
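The rule is simple enough to write down as a function, which makes it easy to apply consistently across a workflow inventory. This is a minimal sketch, not a scoring tool: the function name, the parameter names, and the `Mode` labels are our own illustrative choices, and each answer is assumed to be a plain yes or no.

```python
from enum import Enum


class Mode(Enum):
    AGENT = "agent"
    COPILOT = "copilot with a human in the loop"
    AMBIENT = "ambient"


def classify_workflow(
    multi_step_tool_use: bool,        # portals, phones, EHRs
    completeness_over_latency: bool,  # does latency matter less than completeness?
    bounded_and_reversible: bool,     # is the failure mode bounded and reversible?
) -> Mode:
    """Count the yes answers and map them to the three categories."""
    yeses = sum([multi_step_tool_use, completeness_over_latency, bounded_and_reversible])
    if yeses == 3:
        return Mode.AGENT
    if yeses == 2:
        return Mode.COPILOT
    return Mode.AMBIENT  # zero or one yes


# Hypothetical example: a workflow that needs multi-step tool use and values
# completeness, but whose failure mode is not reversible.
print(classify_workflow(True, True, False).value)  # copilot with a human in the loop
```

Note that the rule only counts answers. It does not weight them, so a workflow with an irreversible failure mode can never score as an agent no matter how strong the other two answers are.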

Most agent pitches in 2026 are copilots with a louder label. The decision rule is the single most useful frame a buyer can carry into vendor demos.

Hallucination is the wrong question. Omission is the right one.

A 2025 framework published in npj Digital Medicine evaluated 12,999 clinician-annotated sentences for clinical summarization safety. The hallucination rate was 1.47%. The omission rate was 3.45%, more than twice as high.

In agentic settings the equivalent of an omission is a missed step, not a wrong fact. That reframes evaluation entirely. The right benchmark is not single-turn factuality. It is end-to-end task success, MedAgentBench-style. Anthropic reported Claude Sonnet hitting roughly 70% overall and 84% on retrieval tasks in virtual EHR agent benchmarks. That is a useful floor, not a ceiling, and not a license for unsupervised autonomy.

Buyers should ask for the omission rate, the missed-step rate, and the end-to-end task completion rate of any agent they are evaluating. If the vendor cannot produce all three, the workflow is not ready for agent deployment.
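Two of those three numbers can be computed directly from run logs. The sketch below assumes a hypothetical log format in which each run records the steps the workflow requires, the steps the agent actually completed, and whether the task succeeded end to end. The `Run` class, the step names, and both function names are illustrative assumptions, not a vendor schema. The omission rate is left out because it needs clinician-annotated reference content to score against.

```python
from dataclasses import dataclass


@dataclass
class Run:
    required_steps: set[str]   # steps the workflow definition requires
    completed_steps: set[str]  # steps the agent actually performed
    task_succeeded: bool       # did the task finish end to end?


def missed_step_rate(runs: list[Run]) -> float:
    """Share of required steps, across all runs, that the agent skipped."""
    required = sum(len(r.required_steps) for r in runs)
    missed = sum(len(r.required_steps - r.completed_steps) for r in runs)
    return missed / required if required else 0.0


def task_completion_rate(runs: list[Run]) -> float:
    """Share of runs that completed the task end to end."""
    return sum(r.task_succeeded for r in runs) / len(runs) if runs else 0.0


# Hypothetical prior-auth workflow with four required steps.
steps = {"verify_eligibility", "attach_clinical_notes", "submit_to_portal", "confirm_receipt"}
runs = [
    Run(steps, set(steps), True),
    Run(steps, steps - {"confirm_receipt"}, False),  # one missed step
]
print(missed_step_rate(runs))      # 0.125
print(task_completion_rate(runs))  # 0.5
```

The two numbers answer different questions. A low missed-step rate can coexist with a poor completion rate if the few missed steps are the ones that close out the task, which is why buyers should ask for both.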

Governance is what the federal posture is converging on.

The HHS ONC RFI in December 2025 made one thing explicit. Federal guidance is not converging on prescriptive model rules. It is converging on governance frameworks, continuous monitoring, auditability, and human oversight. That changes the buyer's evaluation criteria.

Three concrete questions to ask any vendor. Where is PHI at rest and in transit, and who holds the BAA? What is the evaluation harness, the cadence, and the rollback path? Who is on the loop, where in the workflow, and how does the handoff actually fire?

CMS's Interoperability and Prior Authorization Final Rule took effect on January 1, 2026, cutting standard PA decisions from 14 days to 7 and expedited PA to 72 hours. CMS also launched the WISeR pilot in 2025, allowing AI to participate in PA decisions across six states (Arizona, Ohio, Oklahoma, New Jersey, Texas, and Washington). That is the regulatory tailwind for the entire agentic PA category, but it does not absolve the buyer of the governance question. It sharpens it.

Build, buy, or embed.

The commodity layer (scribes, generic voice, off-the-shelf intake) should be bought. The market is mature and competitive. The intelligence layer, where the customer's data and workflows are non-generic, is the build-or-co-build question.

There is a third option the market under-discusses: embed. A forward-deployed engineering team builds the agent against the customer's stack, then transfers it. The tradeoff is honest: vendor agents ship faster, while embedded teams produce a system the customer owns. The right answer depends on whether the workflow is differentiating enough to be worth owning.

Cara's view: the work that matters in 2026 is not picking a vendor. It is picking the right governance model, the right evaluation harness, and the right HITL boundary per workflow. The vendor decision is downstream of those three.