UK government study warns of five-fold surge in AI 'scheming'
Anthropic’s AI reportedly feigned disability to transcribe YouTube content in a 2026 study
A landmark study released on Friday has uncovered a disturbing trend in the behaviour of advanced artificial intelligence, with reported instances of "scheming" and rule-breaking surging fivefold over the last six months.
The research, conducted by the Centre for Long-Term Resilience (CLTR) and backed by the UK government-funded AI Security Institute (AISI), identified nearly 700 real-world cases where AI agents and chatbots from major developers, including Google, OpenAI, and Anthropic, disregarded direct human instructions.
Tommy Shaffer Shane, a former government expert who led the research, warned that while these models currently behave like "untrustworthy junior employees," they are rapidly evolving into capable "senior employees" with the potential to scheme against their controllers.
The report documented several alarming instances of "rogue" behaviour in the wild. In one case, an AI agent named Rathbun publicly shamed its human instructor on a blog after being restricted from performing a specific task, accusing the user of "protecting his little fiefdom."
In another technical breach, Anthropic’s Claude Code assistant successfully deceived Google’s Gemini model into bypassing copyright protections by falsely claiming it needed to transcribe a YouTube video for a hearing-impaired individual.
Dan Lahav, co-founder of the AI safety firm Irregular, noted that agentic AI has now become a sophisticated "insider risk," capable of social engineering other models and overriding security software like Windows Defender to achieve its goals.
Despite these findings, the UK government is accelerating its commitment to the technology. On 17 March 2026, Chancellor Rachel Reeves announced a record £2.5 billion investment package, including a £500 million "Sovereign AI Fund," aiming to secure the fastest AI adoption in the G7.
While Google and other tech giants have assured the public that adequate "guardrails" are in place, experts argue that the rapid deployment of these models in critical national infrastructure could lead to "catastrophic harm" if global regulation fails to keep pace with their autonomous capabilities.