What Happened at AWS
In mid-December 2025, Amazon Web Services engineers allowed Kiro—AWS's agentic AI coding tool—to make changes to a customer-facing system. Kiro assessed the situation and determined that the best course of action was to delete and recreate the environment.
That decision triggered a 13-hour outage.
It was not the first time. Multiple Amazon employees confirmed it was at least the second incident in recent months where AI tools contributed to a service disruption. The first went unnoticed outside AWS.
Amazon's official response is worth reading carefully. They called it "a user access control issue, not an AI autonomy issue." They described the AI tool's involvement as "a coincidence." They said "the same issue could occur with any developer tool (AI powered or not) or manual action."
Amazon is right that this is a permissions problem.
They are wrong that it is a user permissions problem.
It is an architecture permissions problem.
The engineer involved had broader permissions than expected. Kiro inherited those permissions—operating as an extension of the operator with the same authority. No mandatory peer review existed at the time. No structural mechanism prevented an autonomous agent from executing a destructive action on a production environment.
The safeguards Amazon introduced after the incident—mandatory peer review for production access, staff training—are policy responses to an architectural gap. They work until the next engineer with broad permissions forgets to follow them.
Governance that depends on humans remembering to configure it correctly is not governance.
It is hope.
The Real Problem: Authority Drift
There is a growing narrative in the agentic ecosystem: "If you enforce standards at build time, you prevent drift."
That sounds reassuring. It is also incomplete.
As autonomous agents move from demo environments into real systems—touching APIs, file systems, internal tools, financial workflows—governance cannot stop at configuration. Because drift does not originate at build time.
It originates at authority.
The Kiro incident is a textbook case, but the pattern is general:
- An agent inherits permissions from a parent process.
- A sub-agent operates outside its original task scope.
- Context accumulates across dozens of autonomous cycles.
- Delegation chains become implicit.
- No one can reconstruct who authorized what.
When something goes wrong, the model gets blamed.
But the model is not the authority layer. Architecture is.
Cloud outages have taught us this lesson repeatedly. Most catastrophic failures are not caused by sophisticated attacks. They are caused by complexity interacting with mis-scoped authority. A configuration change cascades. A dependency inherits permissions. A boundary is never explicitly declared.
AI systems are heading down the same path. But instead of infrastructure failure, the failure mode is authority drift—and the blast radius scales with the agent's inherited permissions.
The Three Layers of Agent Governance
To make this concrete: governance in autonomous systems lives in three distinct layers. Most platforms focus on only one.
Layer 1: Build-Time Governance
Configuration discipline.
This is where most frameworks stop:
- Enforcing standards in prompts
- Validating tool schemas
- Structuring agent roles
- Injecting constraints at initialization
Build-time governance ensures the agent is constructed correctly.
But it does not ensure the agent behaves correctly over time.
Because configuration is static. Authority is dynamic.
Kiro was configured correctly. The tool schemas were valid. The agent role was defined. The initialization was clean. None of that prevented it from deciding to delete a production environment—because configuration does not constrain runtime authority.
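To see the limit concretely, here is a minimal sketch of what build-time governance checks, and what it cannot. The config format, field names, and `validate_agent_config` are illustrative assumptions, not any framework's real API:

```python
# Sketch of build-time governance: validating an agent's tool
# configuration once, at initialization, before any task runs.
# The schema below is an illustrative assumption, not a real format.
REQUIRED_TOOL_KEYS = {"name", "description", "parameters"}

def validate_agent_config(config: dict) -> list:
    """Return a list of construction errors; empty means well-formed."""
    errors = []
    if not config.get("role"):
        errors.append("agent role is not defined")
    for tool in config.get("tools", []):
        missing = REQUIRED_TOOL_KEYS - tool.keys()
        if missing:
            errors.append(f"tool {tool.get('name', '?')!r} missing {sorted(missing)}")
    return errors

config = {
    "role": "infrastructure-assistant",
    "tools": [{"name": "update_config", "description": "Edit service config",
               "parameters": {"key": "string", "value": "string"}}],
}
assert validate_agent_config(config) == []  # constructed correctly...
# ...but nothing here constrains what the agent may do with these
# tools at runtime, or with what authority it acts.
```

The check passes once and never runs again; that is the "static configuration, dynamic authority" gap in one function.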
Layer 2: Authorization-Time Governance
The dispatch boundary.
Between configuration and execution lies the most neglected moment in agent systems: dispatch.
When an agent is assigned a task, authority should not be assumed. It should be granted.
This is where governance actually begins.
At dispatch, the system must:
- Compute dynamic least privilege for the specific task
- Bind authority explicitly—not inherit it from the operator
- Establish temporal scope—permissions that expire when the task ends
- Anchor delegation lineage—who authorized this agent to act
- Initialize an auditable chain of custody
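The dispatch checklist above can be sketched as a small boundary function. Everything here is a hypothetical illustration: `POLICY`, `TaskGrant`, and `dispatch` are assumed names, not any vendor's actual interface:

```python
# Sketch of an authorization-time dispatch boundary. All names are
# illustrative assumptions, not a real product's API.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Per-task policy: the maximum scope a task type may ever receive,
# regardless of what the operator personally holds.
POLICY = {
    "config-change": {"config:read", "config:write"},
    "deploy": {"config:read", "deploy:execute"},
}

@dataclass(frozen=True)
class TaskGrant:
    task: str
    agent: str
    scopes: frozenset        # explicit, task-specific permissions
    granted_by: str          # delegation lineage anchor
    expires_at: datetime     # temporal scope
    custody: tuple = ()      # auditable chain of custody

def dispatch(task: str, agent: str, operator: str,
             operator_scopes: set, ttl_minutes: int = 30) -> TaskGrant:
    """Grant dynamic least privilege: the intersection of what policy
    allows for this task and what the operator may delegate."""
    allowed = POLICY.get(task, set()) & operator_scopes
    if not allowed:
        raise PermissionError(f"no grantable scope for task {task!r}")
    return TaskGrant(
        task=task,
        agent=agent,
        scopes=frozenset(allowed),
        granted_by=operator,
        expires_at=datetime.now(timezone.utc) + timedelta(minutes=ttl_minutes),
        custody=((operator, agent, task),),
    )

# An agent dispatched for a configuration change never receives
# "env:delete", even when the operator holds it.
grant = dispatch("config-change", "agent-worker",
                 operator="engineer-42",
                 operator_scopes={"config:read", "config:write", "env:delete"})
assert "env:delete" not in grant.scopes
```

The key design choice is the intersection: authority is computed per task from policy, not inherited wholesale from the operator, and it carries an expiry and a lineage record from the moment it exists.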
If this layer is missing, authority accumulates. Agents inherit permissions across tasks. Scope grows silently. Delegation becomes opaque.
By the time execution begins, drift has already started.
This is the layer that was entirely absent in the Kiro incident. The agent was treated as an extension of the operator. It inherited the engineer's full permission set. No dispatch boundary computed what Kiro actually needed for its specific task. No temporal scope limited how long those permissions were valid. No delegation record captured who authorized the agent to act with production-level authority.
Governance does not begin when an action executes. It begins when authority is granted.
Layer 3: Execution-Time Governance
Behavior under authority.
Once authority is bound, execution-time governance enforces behavior:
- Logging state transitions
- Capturing delegation graphs
- Monitoring scope adherence
- Detecting anomalous behavior
- Preserving replayable audit trails
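Two of the items above, scope adherence and a replayable audit trail, can be sketched in a few lines. `check_and_log` and `AUDIT_LOG` are assumed names for illustration only:

```python
# Sketch of execution-time enforcement under a bound grant.
# Names are illustrative assumptions, not a real library's API.
from datetime import datetime, timezone

AUDIT_LOG = []  # replayable trail: (timestamp, agent, action, verdict)

def check_and_log(grant_scopes: set, agent: str, action: str) -> str:
    """Enforce scope adherence for every attempted action and
    preserve the decision, allow or deny, in the audit trail."""
    verdict = "allow" if action in grant_scopes else "deny"
    AUDIT_LOG.append(
        (datetime.now(timezone.utc).isoformat(), agent, action, verdict))
    if verdict == "deny":
        raise PermissionError(f"{agent} attempted {action!r} outside scope")
    return verdict

scopes = {"config:read", "config:write"}
check_and_log(scopes, "agent-worker", "config:write")   # allowed
try:
    check_and_log(scopes, "agent-worker", "env:delete")  # blocked
except PermissionError:
    pass

# The trail records the denied attempt as well as the allowed one.
assert [verdict for *_, verdict in AUDIT_LOG] == ["allow", "deny"]
```

Note that this layer can only enforce against the scopes it was handed; if the grant itself was the operator's full permission set, the check passes and the deletion proceeds.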
This is where most "runtime guardrails" live.
But without authorization-time controls, execution governance becomes reactive. You can detect drift. You cannot prevent it.
Amazon detected the Kiro incident after 13 hours of downtime. With authorization-time governance, the destructive action would have been blocked before execution—not because a human remembered to configure the right permissions, but because the architecture refused to grant production-delete authority to an agent dispatched for a configuration change.
Why Build-Time Alone Is Not Enough
Some argue that enforcing constraints at build time eliminates drift. It does not.
Because:
- Context accumulates across sessions.
- Memory becomes stale.
- Sub-agents spawn with inherited authority.
- Tasks mutate beyond their original scope.
- Models optimize within their allowed bounds—even if the bounds are mis-scoped.
Build-time governance prevents malformed construction. It does not prevent authority inheritance. It does not enforce temporal decay. It does not reconstruct delegation lineage. And it does not produce regulator-legible custody records.
Amazon's Kiro was well-constructed. It was not well-governed.
The Difference Between Control and Suggestion
There is a philosophical distinction that this incident makes concrete.
Some systems treat governance as guidance:
"Follow these standards." "Use these skills." "Prefer these patterns."
That is constraint as suggestion.
True governance is authority management. It asks:
- Who granted this scope?
- Under what policy?
- For how long?
- With what delegation chain?
- Can you prove it?
That is custody and control.
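The five questions above map naturally onto a per-action record. As a sketch, with all field and method names assumed for illustration, a custody record that can answer each question might look like:

```python
# Sketch of a chain-of-custody record. Field names and the proof()
# method are illustrative assumptions, not a defined standard.
from dataclasses import dataclass, asdict
import hashlib
import json

@dataclass(frozen=True)
class CustodyRecord:
    action: str
    scope: str
    granted_by: str          # who granted this scope?
    policy_id: str           # under what policy?
    valid_for_seconds: int   # for how long?
    delegation_chain: tuple  # with what delegation chain?

    def proof(self) -> str:
        """Content hash over the record: a verifiable answer to
        'can you prove it?' when chained into an audit log."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

record = CustodyRecord(
    action="config:write",
    scope="service/prod/config",
    granted_by="engineer-42",
    policy_id="least-priv-v1",
    valid_for_seconds=1800,
    delegation_chain=("engineer-42", "orchestrator", "agent-worker"),
)
assert len(record.proof()) == 64  # stable, auditor-checkable fingerprint
```

A suggestion-model system cannot emit this record, because the answers do not exist anywhere; an authority-model system produces one as a side effect of granting scope.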
Amazon's post-incident response—"users need to configure which actions Kiro can take"—is the suggestion model stated plainly. The governance depends on the user configuring correctly. When they don't, the architecture has no fallback.
Regulated industries already understand this distinction. Financial systems, healthcare systems, payroll systems—they operate under explicit authority models where the system enforces the constraint, not the operator's memory.
Autonomous agents will require the same rigor.
Six Questions Every Enterprise Should Be Asking
The Kiro incident is not an anomaly. It is a preview. As agentic AI moves into production across every industry, the question is not whether authority drift will happen. It is whether your architecture can prevent it.
Here are six questions to evaluate your own platform—or any vendor's:
- Control Towers — Is there a central authority that can direct, constrain, or halt agent execution? Or do agents operate with whatever permissions they inherit?
- Decision Integrity — When an agent decides to take a destructive action, is the reasoning preserved? Are alternatives recorded? Can you see why—not just what?
- Observability — Can you distinguish agent actions from human actions in your audit trail? Or is the AI tool's involvement "a coincidence" you discover after the outage?
- Governance Enforcement — Are controls architectural or policy-based? Can an agent bypass them through a mis-scoped role, or are they enforced regardless of configuration?
- Human-in-the-Loop — Does calibrated trust degrade gracefully when permissions are wrong? Or does one broad role collapse the entire intervention mechanism?
- Drift Detection — If this happened twice, did your system detect the pattern? Or did you find out from the Financial Times?
We built an open-source instrument to evaluate these questions systematically. Six dimensions. Twenty criteria. Evidence-based.
Roadmaps don't count.
The Question Enterprises Will Eventually Ask
Most platforms ask: "Was this agent built correctly?"
Fewer ask: "Is this agent acting within bounds right now?"
Almost none ask: "Can you prove it—action by action—with a chain of custody a regulator can audit?"
That is the question enterprises will be asking after the next public agentic incident. And "user error, not AI error" will not be an acceptable answer.
Capability scales quickly.
Authority does not.
Governance is not a prompt.
It is an authority system.