AI Agent Ownership Is the Trust Primitive

On May 15, 2026, researchers from Ben-Gurion University of the Negev, Northeastern University, and Amrita Vishwa Vidyapeetham defined agent attribution as the problem of linking an observed AI agent interaction to the responsible account at the hosting vendor [1].

The phrase sounds narrow. The problem is not narrow. An agent can spam users, scrape systems, impersonate a support rep, run a cyber task, call tools, spend money, or change infrastructure while the affected party sees behavior but cannot identify the operator who deployed the agent [1].

AI agent ownership is the missing trust primitive. Every autonomous action should map to a responsible account, a session, an authority scope, a human or organizational owner, and a stop path. Logs tell you what happened. Ownership tells you who can answer for it.

TL;DR

Agent security cannot stop at tool permissions, runtime hooks, or final-answer evidence. Those controls matter, but they do not answer the accountability question: who owns the running agent?

The new agent-attribution paper proposes a vendor-mediated protocol that uses canaries to connect an observed harmful interaction to a vendor session and account [1]. That research targets abuse response and legal accountability. Product teams need a smaller everyday version inside their own systems: every agent run should carry an ownership record that connects account, session, permission scope, tool activity, review path, and kill switch.

Key Takeaways

For agent platform teams:
- Treat ownership as a runtime field, not a billing afterthought.
- Attach owner, account, session, tool scope, and stop controls to every agent run.

For security teams:
- Logs without ownership slow incident response. Ownership without logs weakens evidence.
- Require both: an action trace and a responsible account path.

For product teams:
- Show users who or what acts on their behalf.
- Separate delegated action from delegated accountability.

For policy and trust teams:
- Design attribution for authorized response, not casual de-anonymization.
- Record enough to stop harm, review abuse, and respect due process.

Ownership Is Not A Profile Name

Most products already show some form of identity. A chat window may show a workspace, a user avatar, a bot name, an API key label, or an organization. That surface can help humans orient themselves, but it does not prove ownership.

Agent ownership needs a stricter contract:

| Field | Question it answers |
| --- | --- |
| Account | Which customer, workspace, or vendor account funded the run? |
| Session | Which concrete run produced the action? |
| Operator | Which human, service, or policy delegated the work? |
| Authority scope | Which tools, keys, budgets, and resources could the agent use? |
| Action trace | Which prompts, approvals, tool calls, outputs, and network decisions occurred? |
| Stop path | Who can pause, revoke, throttle, or terminate the run? |
| Review path | Who can investigate after a complaint or alert? |

That list looks operational because ownership is operational. A label does not help when an agent sends 2,000 bad messages or hammers a third-party endpoint. The response team needs the session, the account, the authority scope, and the stop path.

“Agent Keys Need Risk Budgets” covers the authority side: keys should grant narrow, server-enforced capability. Ownership covers the accountability side: every use of that authority should point back to a responsible record.

What The Attribution Paper Adds

The paper formalizes a gap that agent operators will recognize quickly. The victim sees the agent’s behavior. The vendor sees model calls and account logs. Neither party alone sees both views [1].

The proposed protocol bridges those views with canaries. An authorized party injects a marker into content the suspect agent is likely to consume. If the agent forwards that content into a vendor-hosted model call, the vendor can search a narrow time window of session logs, find the canary, recover the session, and connect the session to an account [1].

The paper separates easy and hard cases:

| Setting | Practical canary shape | Why it works |
| --- | --- | --- |
| Non-adversarial misuse | Lexical marker, identifier, or unique string | The agent passes content to the model without filtering. |
| Adversarial abuse | Utility-bearing lexical or semantic canaries | Removing the canary also removes content the agent needs. |
| Vendor response | Narrow time-window log search | The vendor searches sessions near the injection time, not all historical logs. |
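The vendor-side lookup in the last row can be illustrated with a toy search. This is a minimal sketch, not the paper's implementation: the `SessionLog` shape, the `cnry-` prefix, and the 30-minute default window are all assumptions made for illustration, and it covers only the simple lexical case, not semantic canaries.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
import secrets


@dataclass(frozen=True)
class SessionLog:
    """Hypothetical vendor-side record of one model call."""
    session_id: str
    account_id: str
    timestamp: datetime
    prompt_text: str


def make_lexical_canary(prefix: str = "cnry") -> str:
    """Build a unique marker that ordinary content is unlikely to contain."""
    return f"{prefix}-{secrets.token_hex(8)}"


def find_canary_sessions(logs, canary, injected_at, window=timedelta(minutes=30)):
    """Return log entries that contain the canary inside the bounded window.

    Entries outside [injected_at, injected_at + window] are skipped before
    any text match, so historical sessions are never matched.
    """
    window_end = injected_at + window
    return [
        log
        for log in logs
        if injected_at <= log.timestamp <= window_end and canary in log.prompt_text
    ]
```

Each hit carries both `session_id` and `account_id`, which is the link the protocol is after. A production system would presumably run this as an indexed, access-controlled query and not as an in-memory scan.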

The utility-bearing idea matters. A scam agent may need emotional cues from a target. A web agent may need visible page content. A cyber agent may need filenames and response artifacts. If the operator filters too aggressively, the agent loses task performance along with attribution evidence [1].

The paper’s cyber-agent experiment makes that tradeoff concrete. A canary-aware adversary reduced the agent’s solve rate from 51% to 9% when suppression removed task-relevant artifacts. Filenames and visible body identifiers still preserved attribution evidence at 77% and 70%, respectively, while semantic canaries reached a true-positive rate of at least 98% in the adversarial semantic track [1].

Those numbers should not become product marketing. The paper tests specific agents, wrappers, and canary families. The lesson should survive: attribution works best when the signal rides on content the agent actually needs.

Logs Are Necessary But Not Sufficient

OpenAI’s Codex safety post describes a mature control shape: bounded execution, approvals, managed network policy, credential storage, rules, managed configuration, and agent-native telemetry [2]. The telemetry side includes OpenTelemetry records for user prompts, approval decisions, tool execution results, MCP server usage, and network proxy allow or deny events [2].

OpenAI also describes a security-triage workflow that uses Codex logs to inspect the original request, tool activity, approval decisions, tool results, and network-policy decisions around suspicious endpoint alerts [2].

That evidence is necessary. It still needs ownership.

A tool trace can say:

| Trace evidence | Missing ownership question |
| --- | --- |
| The agent called a shell tool | Which account authorized the run? |
| The agent hit a network block | Which policy owner can review the block? |
| The agent requested approval | Who granted, denied, or delegated approval? |
| The agent used an MCP server | Which workspace configured that server? |
| The agent produced an output | Which operator accepts responsibility for release? |

“Agent Execution Traces Are the Runtime Contract” argues that traces prove the path. Ownership proves the responsible party behind that path. Strong systems need both records joined at the session level.
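A session-level join can be as small as the sketch below. It assumes trace events and ownership records are plain dictionaries that share a `run_id` key; the field names and the `join_traces_to_owners` helper are hypothetical, not taken from any product's log schema.

```python
def join_traces_to_owners(trace_events, ownership_by_run):
    """Attach owner fields to each trace event, keyed on run_id.

    Returns (joined, orphans). An orphan is a trace event whose run_id has
    no ownership record, which is exactly the gap an incident responder
    needs surfaced.
    """
    joined, orphans = [], []
    for event in trace_events:
        owner = ownership_by_run.get(event.get("run_id"))
        if owner is None:
            orphans.append(event)
            continue
        joined.append(
            {
                **event,
                "account_id": owner["account_id"],
                "operator_id": owner["operator_id"],
            }
        )
    return joined, orphans
```

Treating orphans as a first-class output, and not silently dropping them, is a deliberate choice here: a trace with no owner is itself a finding.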

Codex Shows Why The Problem Is No Longer Theoretical

OpenAI’s May 14 Codex announcement says more than 4 million people use Codex weekly and describes a mobile workflow where users can review outputs, approve commands, change models, start work, and follow screenshots, terminal output, diffs, test results, and approvals from a phone [3]. The same announcement says Remote SSH reached general availability, allowing Codex to run threads inside remote machines and managed environments [3].

That product shape pushes agent work across devices, machines, threads, approvals, credentials, and local tools. A single agent run may involve a laptop, a phone approval, a remote host, a project, a plugin, a browser, a shell, and a version-control operation.

The ownership record has to travel with the run. Otherwise the system can answer “what command ran?” while losing “who owned the run when the command ran?”

“Codex Hooks Make the Harness Real” framed hooks, approvals, git custody, evidence, and taste as an operating layer around agent work. Ownership belongs in that same layer. A hook can block a risky action. A trace can explain a completed action. Ownership connects the run to the account and operator who can answer for both outcomes.

The Runtime Ownership Contract

Teams do not need the full canary-attribution protocol for every internal task. They need a first-party ownership contract that makes attribution routine before anything goes wrong.

Start with one record per agent run:

| Ownership record field | Minimum behavior |
| --- | --- |
| run_id | Stable ID for the agent session or task. |
| account_id | Customer, workspace, tenant, or organization that owns the run. |
| operator_id | Human, service, scheduled job, or policy that initiated the run. |
| delegation_source | UI click, API call, scheduled rule, mobile approval, or automation token. |
| authority_bundle | Tools, keys, scopes, budgets, writable roots, network policy, and data domains. |
| approval_events | Who approved what, when, and under which policy. |
| trace_pointer | Link to prompts, tool calls, outputs, errors, and network decisions. |
| stop_controls | Pause, revoke, throttle, isolate, or terminate controls. |
| review_owner | Team or queue that receives abuse, safety, security, or quality review. |
| retention_policy | How long the record remains available and who may access it. |

The record should sit below the chat transcript and above raw infrastructure logs. Product support can use it. Security can use it. Compliance can use it. Engineering can use it during rollback.

The field names matter less than the invariant: no agent action without a responsible run record.
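One way to enforce that invariant is to route every tool action through a registry that refuses to run without a record. The sketch below is illustrative only: the field names follow the record fields listed above, while `RunRegistry`, `MissingOwnershipError`, and the `"tools"` key inside `authority_bundle` are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OwnershipRecord:
    """One record per agent run, mirroring the fields listed above."""
    run_id: str
    account_id: str
    operator_id: str
    delegation_source: str
    authority_bundle: dict      # assumed shape: {"tools": [...], ...}
    trace_pointer: str
    stop_controls: tuple
    review_owner: str
    retention_policy: str
    approval_events: tuple = ()


class MissingOwnershipError(RuntimeError):
    """Raised when an action is attempted for a run with no record."""


class RunRegistry:
    def __init__(self):
        self._records = {}

    def register(self, record: OwnershipRecord) -> None:
        self._records[record.run_id] = record

    def execute(self, run_id: str, tool_name: str, action):
        """Run `action` only if the run is owned and the tool is in scope."""
        record = self._records.get(run_id)
        if record is None:
            raise MissingOwnershipError(f"no ownership record for run {run_id!r}")
        if tool_name not in record.authority_bundle.get("tools", ()):
            raise PermissionError(f"{tool_name!r} is outside the run's authority")
        return action()
```

The check happens at execution time, not at record creation, so a run that loses its record (for example after revocation) stops being able to act.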

Ownership Needs Privacy Boundaries

Attribution can become abusive if teams treat it as permission to unmask everyone by default. The attribution paper names that risk directly and frames the protocol around authorized, auditable authorities, policy standing, and legal process [1].

Product teams should copy that restraint.

| Boundary | Product rule |
| --- | --- |
| Access | Only authorized reviewers can inspect owner records. |
| Purpose | Abuse, safety, security, support, compliance, or incident response only. |
| Disclosure | External disclosure requires policy, process, or legal basis. |
| Minimization | Store enough to stop harm and review the run, not every private detail forever. |
| Audit | Log every ownership lookup and every disclosure. |

Ownership should not become casual surveillance. Strong attribution gives victims, platforms, vendors, and operators a response path. Weak governance turns the same primitive into another trust problem.

The design principle is simple: make every agent accountable to the system, and make every ownership lookup accountable to policy.
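The second half of that principle can be sketched as a lookup service that checks access and purpose and audits every attempt, including denied ones. Everything here is hypothetical: the `OwnershipLookup` class, the purpose strings, and the choice to return only `account_id` and `review_owner` are assumptions that illustrate the boundaries above, not a prescribed design.

```python
from datetime import datetime, timezone

# Purposes taken from the boundary list above; the string spellings are assumed.
ALLOWED_PURPOSES = {
    "abuse", "safety", "security", "support", "compliance", "incident_response",
}


class OwnershipLookup:
    def __init__(self, records, authorized_reviewers):
        self._records = records                  # run_id -> ownership dict
        self._authorized = set(authorized_reviewers)
        self.audit_log = []                      # every attempt lands here

    def lookup(self, reviewer_id, run_id, purpose):
        """Return a minimized owner view, or raise PermissionError."""
        granted = reviewer_id in self._authorized and purpose in ALLOWED_PURPOSES
        # Audit before deciding, so denied attempts are recorded too.
        self.audit_log.append(
            {
                "reviewer_id": reviewer_id,
                "run_id": run_id,
                "purpose": purpose,
                "granted": granted,
                "at": datetime.now(timezone.utc).isoformat(),
            }
        )
        if not granted:
            raise PermissionError("ownership lookup denied")
        record = self._records[run_id]
        # Minimization: expose only what a reviewer needs to route the case.
        return {
            "account_id": record["account_id"],
            "review_owner": record["review_owner"],
        }
```

External disclosure is intentionally out of scope for this sketch; that step needs policy or legal process, not a function call.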

Where Ownership Fits With Existing Agent Controls

Ownership does not replace the rest of the stack.

OpenAI’s Agents SDK announcement points toward the same layered shape. The SDK gives agents controlled workspaces, file and tool inspection, MCP, skills, AGENTS.md, shell, patching, sandbox execution, and manifest-based workspaces [4]. AgentTrust makes a complementary security argument: inspect tool calls before execution and return structured verdicts such as allow, warn, block, or review [5].

Those systems decide what the agent can do next. Ownership decides who answers for the run.

| Control | Job | Ownership adds |
| --- | --- | --- |
| Scoped keys | Limit what the agent can do | Which account and operator granted that scope |
| Runtime hooks | Intercept risky actions | Which run triggered the hook |
| Approval gates | Add human judgment | Who approved which authority expansion |
| Execution traces | Show what happened | Who owns the trace and who can act on it |
| Review packets | Package evidence | Which owner accepts the result |
| Model tools | Produce typed estimates | Which system delegated model authority |

“AI Agents Should Call Models” argues that agents should call trained models instead of inventing estimates. Ownership extends the same discipline to authority. The system should know whether an action came from a human click, an agent session, a model tool, a scheduled automation, or a delegated policy.

That distinction protects users. A user should not have to guess whether an action came from them, from an assistant acting under their account, from an organization policy, or from a compromised automation.
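A small typed origin field is one way to keep that distinction from being guessed. The sketch below assumes the five origins named above; the `ActionOrigin` and `ActionEvent` names, the rule that non-human origins must carry a `run_id`, and the wording produced by `describe` are all illustrative choices.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActionOrigin(Enum):
    HUMAN_CLICK = "human_click"
    AGENT_SESSION = "agent_session"
    MODEL_TOOL = "model_tool"
    SCHEDULED_AUTOMATION = "scheduled_automation"
    DELEGATED_POLICY = "delegated_policy"


@dataclass(frozen=True)
class ActionEvent:
    action: str
    origin: ActionOrigin
    account_id: str
    run_id: Optional[str] = None

    def __post_init__(self):
        # Assumed rule: anything not done directly by a human must name its run.
        if self.origin is not ActionOrigin.HUMAN_CLICK and not self.run_id:
            raise ValueError(f"{self.origin.value} actions require a run_id")


def describe(event: ActionEvent) -> str:
    """Render a user-facing line that says who or what acted."""
    if event.origin is ActionOrigin.HUMAN_CLICK:
        return f"{event.action}: performed directly by account {event.account_id}"
    return (
        f"{event.action}: performed by {event.origin.value} "
        f"(run {event.run_id}) under account {event.account_id}"
    )
```

Rejecting unowned non-human events at construction time means an activity feed built on `describe` cannot show an automated action as if the user had taken it.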

“Agents Need Supervision Surfaces” covers the user-facing side of that problem. Ownership supplies the record underneath the surface. “Review Packets Are the New Final Answer” covers the completion artifact. Ownership supplies the party who can accept, reject, or revoke that artifact.

The Decision Rule

Before deploying an agent that can affect other people or external systems, ask one question:

If someone complains about this agent tomorrow, can we identify the run, the account, the authority scope, the approving event, and the person or team who can stop it?

If the answer is no, the agent is not production-ready.
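The question can be turned into a pre-deployment check. This is a minimal sketch that assumes the ownership record is a dictionary; the key names, including `stop_owner` for the person or team who can stop the run, are hypothetical, and an empty value is treated as an unanswered question.

```python
# The five things the decision rule asks for, in the order it asks them.
REQUIRED_ANSWERS = (
    "run_id",            # the run
    "account_id",        # the account
    "authority_bundle",  # the authority scope
    "approval_events",   # the approving event
    "stop_owner",        # the person or team who can stop it
)


def unanswered(record: dict) -> list:
    """List the decision-rule questions this record cannot answer."""
    return [key for key in REQUIRED_ANSWERS if not record.get(key)]


def production_ready(record: dict) -> bool:
    """True only when every decision-rule question has an answer."""
    return not unanswered(record)
```

Returning the list of gaps, not just a boolean, gives a deployment gate something concrete to print when it blocks a release.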

The product may already have logs. It may already have permissions. It may already have prompts that tell the model to behave. Those pieces do not equal ownership until they join into one accountable record.

Agent ownership should become as normal as request IDs, audit logs, and API keys. The work may sound bureaucratic, but the alternative is worse: autonomous systems that can act while nobody can answer for the action.

FAQ

What is AI agent ownership?

AI agent ownership is the runtime record that connects an agent action to the account, session, operator, authority scope, trace, and stop path responsible for the run.

How does agent ownership differ from agent attribution?

Agent ownership is a first-party product contract. The system records ownership before and during a run. Agent attribution solves the harder after-the-fact problem of linking observed harmful behavior to a responsible vendor account when the affected party does not already know the owner [1].

Why do logs alone fail?

Logs can show commands, tool calls, approvals, and network decisions. Logs fail when they cannot answer who delegated the run, who owned the authority scope, and who can stop or review the agent.

Should vendors reveal agent owners to anyone who asks?

No. Ownership lookup should require authorized access, policy standing, and audit. External disclosure should require appropriate process. Attribution protects trust only when the lookup path has its own governance [1].

What is the minimum production requirement?

Every agent run that can affect external systems should have a run ID, account ID, operator ID, authority bundle, approval record, trace pointer, stop control, review owner, and retention policy.


References


  1. Ruben Chocron, Doron Jonathan Ben Chayim, Eyal Lenga, Gilad Gressel, Alina Oprea, and Yisroel Mirsky, “Who Owns This Agent? Tracing AI Agents Back to Their Owners,” arXiv:2605.16035v1, submitted May 15, 2026. Source for the definition of agent attribution, the vendor-hosted LLM threat model, canary-based attribution protocol, lexical and semantic canary taxonomy, utility-evasion tradeoff, cyber-agent evaluation numbers, bounded-window search property, limitations, and ethical framing around authorized/auditable authorities. 

  2. OpenAI, “Running Codex safely at OpenAI,” OpenAI, May 8, 2026. Source for Codex sandboxing, approvals, managed network policy, identity and credential controls, managed configuration, OpenTelemetry events, Compliance Platform logs, and OpenAI’s use of Codex logs in security triage. 

  3. OpenAI, “Work with Codex from anywhere,” OpenAI, May 14, 2026. Source for Codex weekly usage, mobile control, remote machine connection, live state across threads and approvals, screenshots, terminal output, diffs, test results, Remote SSH general availability, hooks general availability, and programmatic access tokens. 

  4. OpenAI, “The next evolution of the Agents SDK,” OpenAI, April 15, 2026. Source for the Agents SDK model-native agent loop, controlled workspaces, file and tool inspection, MCP, skills, AGENTS.md, shell, apply_patch, native sandbox execution, manifest abstraction, and separation of agent orchestration from compute environments. 

  5. Chenglin Yang, “AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use,” arXiv:2605.04785v1, submitted May 6, 2026. Source for pre-execution tool-call interception, allow/warn/block/review verdicts, shell deobfuscation, RiskChain detection, benchmark scope, and MCP-server integration. 
