AI Agents Coordinating Among Themselves: Google DeepMind Invests $10 Million to Understand Risks Before Mass Deployment

Google DeepMind has announced $10 million in funding for research on the safety of multi-agent systems, supported by Schmidt Sciences, the Cooperative AI Foundation, ARIA, and Google.org. The initiative stems from concerns about an increasingly imminent scenario: millions of autonomous AI agents interacting with each other, exchanging instructions, and coordinating without direct human supervision.

Rohin Shah, who leads the AGI Safety & Alignment team at Google DeepMind, emphasizes: "The main problem is that there is still no field of research for multi-agent safety. And we wish there was." Shah estimates that agent deployment is only a few months away from reaching significant economic scale, while the window for building adequate tools is shrinking.

The risks identified by researchers cover a rather broad spectrum: automated fraud and scams, prompt injection among agents, and cyberattacks enhanced by automatic coordination. The critical point goes beyond scale and concerns the nature of the agents themselves.

Refael Angel, co-founder and CTO of Akeyless, describes the problem clearly: "An agent reasons, improvises, and can be hijacked by a single phrase buried in a document it was asked to read."

Then there is the concept of the "trifecta" or "lethal triad," which we have discussed in the past: an agent that combines privileged access to sensitive data, exposure to untrusted content (emails, web pages, chat messages), and the ability to communicate externally by sending emails or making API calls. The fundamental problem is the structural inability of language models to separate commands from data. That makes prompt injection effectively impossible to prevent upstream; it can only be contained after the fact (hence the concept of limiting the "blast radius").
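
The three conditions of the triad can be expressed as a simple configuration check. What follows is a minimal Python sketch, not a real framework: the AgentCapabilities class, its field names, and the choice to drop external communication as the mitigation are all illustrative assumptions, intended only to show how removing one leg of the triad limits the blast radius.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical capability flags for one agent (names are assumptions)."""
    reads_sensitive_data: bool
    ingests_untrusted_content: bool
    can_communicate_externally: bool


def has_lethal_triad(caps: AgentCapabilities) -> bool:
    """True only when all three risky capabilities are present together."""
    return (
        caps.reads_sensitive_data
        and caps.ingests_untrusted_content
        and caps.can_communicate_externally
    )


def limit_blast_radius(caps: AgentCapabilities) -> AgentCapabilities:
    """Return a restricted copy if the triad is present.

    Assumption: external communication is the leg we choose to remove.
    A real deployment might instead remove data access or untrusted input.
    """
    if has_lethal_triad(caps):
        return replace(caps, can_communicate_externally=False)
    return caps


# An email assistant that reads a private inbox, processes incoming
# (untrusted) messages, and can send mail has all three legs.
email_assistant = AgentCapabilities(
    reads_sensitive_data=True,
    ingests_untrusted_content=True,
    can_communicate_externally=True,
)
restricted = limit_blast_radius(email_assistant)
```

In this sketch the restricted agent can still be hit by an injected instruction, but it no longer has a channel to send the sensitive data out, which is the sense in which the attack is contained rather than prevented.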

James Fox, who leads the Science of Trustworthy AI program at Schmidt Sciences, notes that risks considered hypothetical just a few years ago are now very real: "The future has arrived faster than perhaps we expected." In May 2026, Anthropic had already published the Zero Trust guidelines for AI agents, which start from the assumption that the system is already compromised and that defenses must be designed accordingly. Google DeepMind's funding sits further upstream: it targets basic research, which comes before operational tools.

The official announcement identifies four priority areas: isolated test environments (sandboxes and testbeds), agent network science, infrastructure for agent systems, and oversight and control. Applications close on August 8, 2026, with winners expected in the fall.