Can AI agents be HIPAA compliant?

Yes. The requirements are a BAA at every hop PHI touches (including the model provider's zero-retention API tier), minimum-necessary data exposure, immutable audit logging of every agent action, human review checkpoints on consequential outputs, and eval-gated deployments. Compliance has to be designed in from day one, not retrofitted.

Do OpenAI and Anthropic sign BAAs?

The major model providers offer BAA-covered, zero-data-retention API tiers suitable for PHI workflows — distinct from their consumer apps, which are not covered. Your implementation partner, cloud, and any retrieval infrastructure that touches PHI need BAAs as well.

What healthcare workflows should AI agents automate first?

High-volume, bounded-risk administrative workflows: patient intake, front-desk triage, appointment management, and clinical documentation drafting with clinician sign-off. Diagnostic decision-making is the wrong first workflow — the goal is recovering staff hours with auditable, reviewable automation.

HIPAA-compliant AI agents: what healthcare teams need to know

AI agents can absolutely run PHI-touching workflows — patient intake, documentation, front-desk triage — if compliance is an architecture decision made on day one, not a review stage bolted on at the end. The teams that get stuck aren't blocked by HIPAA; they're blocked by having built a demo first and asked the compliance question second.

The foundation is contractual: a BAA at every hop PHI touches. That includes the model provider: the major vendors now offer BAA-covered, zero-retention API tiers. Their consumer apps are not covered, and using them on PHI is how organizations end up in breach reporting. It also includes your cloud, your vector store if retrieval touches PHI, and your implementation partner. We sign BAAs for healthcare engagements as table stakes.

Architecture follows the minimum-necessary principle. The agent sees only the fields the workflow requires: an intake agent needs demographics and insurance details, not the full chart. Where the task allows, PHI is masked or tokenized before it reaches any general-purpose model. Retention is zero at the model layer, and processing stays inside your BAA-covered infrastructure. On-prem or VPC-isolated inference is the right call for the most sensitive flows, and it's an option we design for explicitly.
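As a rough illustration of the minimum-necessary idea, the sketch below filters a patient record down to an allowlist and swaps one identifier for a keyed token before anything is sent to a model. The field names, the allowlist, and the helper names (`minimum_necessary_view`, `tokenize`) are hypothetical, chosen only for this example. A keyed hash is a pseudonymization technique and is not, on its own, a HIPAA de-identification method.

```python
import hashlib
import hmac

# Hypothetical allowlist: fields an intake workflow is assumed to need.
INTAKE_ALLOWED_FIELDS = {
    "first_name",
    "last_name",
    "date_of_birth",
    "insurance_member_id",
}

# Hypothetical subset of allowed fields that are replaced with tokens.
TOKENIZED_FIELDS = {"insurance_member_id"}


def tokenize(value: str, secret_key: bytes) -> str:
    """Return a deterministic keyed token (HMAC-SHA256, truncated) for a value."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def minimum_necessary_view(record: dict, secret_key: bytes) -> dict:
    """Keep only allowlisted fields and tokenize the sensitive ones."""
    view = {}
    for field in sorted(INTAKE_ALLOWED_FIELDS):
        if field not in record:
            continue
        value = record[field]
        if field in TOKENIZED_FIELDS:
            value = tokenize(str(value), secret_key)
        view[field] = value
    return view


if __name__ == "__main__":
    # Synthetic record for illustration only; not real patient data.
    full_record = {
        "first_name": "Ana",
        "last_name": "Diaz",
        "date_of_birth": "1980-01-01",
        "insurance_member_id": "M123",
        "diagnosis_codes": ["E11.9"],
        "clinical_notes": "free-text chart notes",
    }
    # The key is a placeholder; a real key would come from a secrets manager.
    print(minimum_necessary_view(full_record, secret_key=b"example-key"))
```

The chart fields (`diagnosis_codes`, `clinical_notes`) never appear in the returned view, so they cannot leak into a prompt by accident. The mapping from token back to the real identifier would live in a separate, BAA-covered store.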

Then comes the part reviewers actually probe: auditability. Every agent action is logged immutably: what data it accessed, what it produced, who reviewed it, and when. Human-in-the-loop checkpoints sit on anything consequential: an intake summary is drafted by the agent and confirmed by staff; a documentation note is proposed, not auto-filed. This maps directly onto the observability layer of the production agent stack, because compliance and operability turn out to want the same instrumentation.
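One common way to make an audit trail tamper-evident is to chain entries by hash, so that altering any earlier entry invalidates everything after it. The sketch below assumes an in-memory list and hypothetical field and function names (`append_entry`, `verify_chain`). It shows the chaining idea only; actual immutability would also depend on write-once storage and access controls, which are outside this example.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(body: dict) -> str:
    """Hash a canonical JSON encoding of the entry body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(log, action, data_accessed, output_summary, reviewed_by=None):
    """Append an entry whose hash covers its content and the previous hash."""
    body = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "data_accessed": data_accessed,
        "output_summary": output_summary,
        "reviewed_by": reviewed_by,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry = dict(body, hash=_entry_hash(body))
    log.append(entry)
    return entry


def verify_chain(log) -> bool:
    """Return True only if every entry is intact and correctly linked."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or _entry_hash(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    # The agent drafts; the staff confirmation is recorded as its own entry.
    append_entry(log, "draft_intake_summary", ["demographics", "insurance"], "summary drafted")
    append_entry(log, "staff_confirmation", [], "summary confirmed", reviewed_by="staff-001")
    print(verify_chain(log))
```

Recording the human review as a separate entry, instead of editing the draft entry after the fact, keeps the log append-only and preserves the order of who did what.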

Evals carry extra weight in clinical-adjacent settings, because the failure mode isn't just a wrong answer; it's a *confidently* wrong answer entering a record. In practice that means test sets built from de-identified historical cases, accuracy thresholds that gate deploys, and drift monitoring after every vendor model change. This is the operating discipline our Managed AI Operations retainer runs monthly, and in healthcare it doubles as your evidence file.
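A deploy gate of this kind can be very small. The sketch below assumes exact-match scoring against expected outputs, which is a simplification; real clinical-adjacent evals would likely need task-specific scoring. The names (`EvalCase`, `deploy_allowed`, `has_drifted`) and the thresholds used in the demo are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    case_id: str   # identifier of a de-identified historical case
    expected: str  # reference output agreed by reviewers
    actual: str    # output produced by the agent under test


def accuracy(cases) -> float:
    """Fraction of cases where the agent output matches the reference exactly."""
    if not cases:
        raise ValueError("eval set is empty")
    correct = sum(1 for case in cases if case.actual == case.expected)
    return correct / len(cases)


def deploy_allowed(cases, threshold: float) -> bool:
    """Gate a deployment: allow it only if accuracy meets the threshold."""
    return accuracy(cases) >= threshold


def has_drifted(baseline_accuracy: float, current_accuracy: float, tolerance: float) -> bool:
    """Flag drift when accuracy falls more than `tolerance` below the baseline."""
    return (baseline_accuracy - current_accuracy) > tolerance


if __name__ == "__main__":
    # Synthetic cases for illustration only.
    cases = [
        EvalCase("case-1", "routine", "routine"),
        EvalCase("case-2", "urgent", "urgent"),
        EvalCase("case-3", "routine", "routine"),
        EvalCase("case-4", "urgent", "routine"),
    ]
    print(accuracy(cases))
    print(deploy_allowed(cases, threshold=0.95))
    print(has_drifted(baseline_accuracy=0.90, current_accuracy=accuracy(cases), tolerance=0.05))
```

Rerunning the same eval set after a vendor model change and comparing against the stored baseline is what turns the gate into drift monitoring; the stored results are also the kind of artifact that can go into an evidence file.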

Where to start: pick a workflow with high volume and bounded risk, such as intake and front-desk load, not diagnosis. In one deployment, a regional provider network cut intake and admin time by 41% with exactly this approach. The two-week Sprint scopes the workflow, the data boundaries, and the BAA chain before any build begins. See the AI for Healthcare page for the full compliance posture, or start with the readiness assessment to see where your foundations stand.

