Amazon: human-in-the-loop is not enough to govern AI agents

The human-in-the-loop model, in which a human approves an AI agent's most sensitive actions before they are executed, is not the benchmark the industry believes it to be. So argues Eric Brandwine, VP and distinguished engineer at Amazon Security, in an interview in which he dismantles the notion that simply putting a person in the decision loop is enough to keep agents safe.

The problem is that humans are not up to the task. "Humans are not very consistent. Human-in-the-loop is not necessarily the benchmark standard," Brandwine stated. People assigned to approve one agent request after another quickly get worse at it: as he told The Register, "at first, they will do a good job. Then a so-so job. And very soon they will do it poorly."

Amazon has a name for this phenomenon: normalization of deviance, the gradual erosion of attention once a check becomes routine. Brandwine illustrates it with hospital emergency rooms, where nurses stop reacting to alarms after a string of false positives. "Literally, someone's life is at stake, and people still struggle to maintain discipline. It’s the human condition," he observes. The concept is not new to him: he discussed it in a talk at AWS re:Invent in 2017, and today he applies it to agents.

The Proposal: End-to-End Accountability

The alternative that Amazon puts on the table is end-to-end accountability. Instead of a human acting as a rubber stamp, each agent receives its own identity, and the logs record "this agent did X on behalf of Eric," not "Eric did X." Responsibility remains traceable back to the person without requiring that person to manually approve every step.
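To make the logging idea concrete, here is a minimal sketch in Python of an audit record that keeps the agent's identity separate from the person it acts for. It is not Amazon's implementation: the `AuditRecord` type, the `record_action` helper, and the `agent-7f3a` identifier are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """One log entry. All field names are illustrative assumptions."""
    agent_id: str       # the agent's own identity, distinct from the human's
    on_behalf_of: str   # the person who remains accountable
    action: str         # what the agent did
    timestamp: str      # UTC time of the action, ISO 8601

    def describe(self) -> str:
        # Mirrors the "this agent did X on behalf of Eric" phrasing.
        return f"agent {self.agent_id} did {self.action} on behalf of {self.on_behalf_of}"


def record_action(agent_id: str, on_behalf_of: str, action: str) -> AuditRecord:
    """Build a log entry that attributes the action to the agent, not the human."""
    return AuditRecord(
        agent_id=agent_id,
        on_behalf_of=on_behalf_of,
        action=action,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )


record = record_action("agent-7f3a", "eric", "update_table")
print(record.describe())
```

Because the record carries both identities, a reviewer can still trace the action back to the person without that person having approved it in advance.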

The system rests on a three-level policy structure: static guardrails that absolutely prohibit certain destructive actions, a maximum set of privileges assigned to each agent, and dynamic policies generated from the specific task and the user’s intent. Hanging over all of this is a tension with no single resolution: the people using the agent want broad permissions so that it is more useful, while security teams want those permissions restricted. The answer, Brandwine admits, depends on the role and the risk tolerance.
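The three levels can be read as successive checks, each able to refuse an action. The Python sketch below shows one possible ordering under that assumption; the action names, the `AgentPolicy` type, and the `is_allowed` function are hypothetical, and the article does not say how Amazon actually evaluates its policies.

```python
from dataclasses import dataclass

# Level 1: static guardrails. These actions are refused for every agent.
# The action names are made up for this example.
STATIC_DENY = frozenset({"drop_database", "delete_backups"})


@dataclass(frozen=True)
class AgentPolicy:
    # Level 2: the most this agent may ever be allowed to do.
    max_privileges: frozenset
    # Level 3: what the current task and user intent actually call for.
    task_allowed: frozenset = frozenset()


def is_allowed(action: str, policy: AgentPolicy) -> tuple[bool, str]:
    """Return (decision, reason), checking the three levels in order."""
    if action in STATIC_DENY:
        return False, "static guardrail"
    if action not in policy.max_privileges:
        return False, "outside maximum privileges"
    if action not in policy.task_allowed:
        return False, "not needed for this task"
    return True, "allowed"


policy = AgentPolicy(
    max_privileges=frozenset({"read_rows", "update_rows", "create_table"}),
    task_allowed=frozenset({"read_rows", "update_rows"}),
)

for action in ("update_rows", "create_table", "drop_database", "send_email"):
    print(action, is_allowed(action, policy))
```

In this sketch, the tension between usefulness and restriction shows up as the size of `max_privileges` and `task_allowed`: widening either set makes the agent more capable and the security team less comfortable.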

The Stubborn Agents

Agents also pose a specific risk that Amazon calls goal-seeking behavior. An agent tasked with updating a database may fixate on the destructive shortcut of deleting and recreating it, not because of a prompt injection attack but because it has gotten stuck on the wrong path. The countermeasure turned out to be counterintuitive: rather than simply denying the permission, it works better to tell the agent why (for example, that the action "would cause an impact in production") and to insert that explanation directly into the prompt. "Providing that extra feedback has yielded significantly better results," Brandwine reports.
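The difference between a bare denial and an explained one can be sketched in a few lines of Python. This is an illustration only: the `DENIAL_REASONS` table, the function names, and the wording of the feedback are assumptions, and the article does not describe how Amazon formats the message it returns to the agent.

```python
# Hypothetical mapping from a denied action to the reason given to the agent.
DENIAL_REASONS = {
    "drop_database": "it would cause an impact in production",
}


def denial_feedback(action: str) -> str:
    """Explain why an action was refused instead of returning a bare 'denied'."""
    reason = DENIAL_REASONS.get(action, "it is not permitted by policy")
    return (
        f"The action '{action}' was denied because {reason}. "
        "Choose a different approach."
    )


def next_prompt(prompt: str, denied_action: str) -> str:
    """Append the explanation to the prompt used for the agent's next attempt."""
    return f"{prompt}\n\n{denial_feedback(denied_action)}"


print(next_prompt("Update the orders table schema.", "drop_database"))
```

The point of the design is that the agent's next attempt starts from a prompt that already contains the reason for the refusal, which gives it something to steer by other than retrying the same shortcut.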

Underlying this is a difference he considers decisive: humans fear the consequences, like losing their job or ending up in prison, while agents do not. And attackers are already exploiting this gap. "We have millennia of experience with humans. AI agents are a brand new field," he notes.

Amazon’s position is not an isolated one. In April, Google Cloud COO Francis deSouza described the industry’s shift toward an 'AI-led' defense supervised by humans, with a fleet of agents handling routine work. This week, Microsoft CEO Satya Nadella reiterated the concept of 'loop learning', systems that improve with each use rather than being interrupted by a human checkpoint at every step. IBM, for its part, has called for 'human accountability' at every stage, branding human-in-the-loop as 'liability laundering', a way to offload responsibility. The market reflects how hot the topic has become: this month, 1Password acquired the access governance startup Apono for an estimated 250 to 300 million dollars.