OpenAI seeks a safety researcher to prepare for AI that can train itself independently

OpenAI has posted a safety research role paying up to $445,000 to prepare for self-improving AI risks

OpenAI has set the goal of building an AI tool capable of researching its own improvements — and is now preparing for the risks that such a development could bring.

The prospect of AI systems achieving so-called "recursive self-improvement" has become a pressing concern for AI leaders, following significant advances in coding tools from OpenAI and Anthropic over the past six months.

This week, Google DeepMind chief executive Demis Hassabis added to the urgency by stating that humanity now stands at the "foothills of the singularity" — the point at which AI begins to improve itself and ultimately surpasses human intelligence.

A new safety role at a significant salary

OpenAI, which is aiming to go public this year, recently posted a job listing seeking a safety researcher to address what happens when an AI can train better versions of itself.

The posting, which appeared this month on job aggregator sites, is for a position on OpenAI's Preparedness safety team and lists a salary range of $295,000 to $445,000.

The role calls for "strong technical executors to support preparations for recursive self-improvement."

"This work relies on reasoning about problems that might exist in the future, but might not exist now," the listing states. "So it's especially important that people in this role are tasteful and strategic."

A race to build self-training models

AI models from leading laboratories have advanced at a remarkable pace, as measured by the complexity of tasks they can handle.

Researchers at METR, a laboratory that studies model capabilities, wrote in March that the length of tasks frontier AI models can complete, measured by how long the same work would take human professionals, doubles approximately every seven months.

The implication, METR noted, is that AI agents will be capable of handling "a large fraction" of the software work that currently takes human coders days or weeks to complete.
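For a sense of what a seven-month doubling time implies, the arithmetic can be sketched in a few lines of Python. This is an illustration only: it assumes the doubling continues at a steady rate, the function names are made up for this example, and the one-hour starting task is a hypothetical figure rather than a METR measurement.

```python
import math

# Doubling time reported by METR, as cited in the article.
DOUBLING_MONTHS = 7.0


def growth_factor(months: float, doubling_months: float = DOUBLING_MONTHS) -> float:
    """Multiplier on completable task length after `months` of steady doubling."""
    return 2.0 ** (months / doubling_months)


def months_to_reach(start_hours: float, target_hours: float,
                    doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months needed for task length to grow from start_hours to target_hours."""
    return doubling_months * math.log2(target_hours / start_hours)


if __name__ == "__main__":
    start_hours = 1.0  # hypothetical starting task length, not a METR figure
    for months in (7, 14, 24):
        hours = start_hours * growth_factor(months)
        print(f"after {months} months: about {hours:.1f} hours")
```

Under these assumptions, task length doubles after seven months, quadruples after 14, and grows roughly elevenfold over two years. Whether the trend holds is an open question.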

OpenAI is pursuing that kind of automation itself. Its Codex coding tool has become a significant revenue driver, and the company is also seeking to automate its own research operations.

Chief executive Sam Altman said in October that OpenAI had set a goal of running an "automated AI research intern" across hundreds of thousands of chips by this coming September, with a "true automated AI researcher" targeted by March 2028.

"We may totally fail at this goal," Altman wrote on X, "but given the extraordinary potential impacts we think it is in the public interest to be transparent about this."

In April, Anthropic published research on using AI models to oversee more powerful AI models, with promising but limited results.

In May, the company's co-founder and policy head Jack Clark wrote that he believes there is approximately a 60 per cent chance of seeing AI research and development conducted without human involvement by the end of 2028.

Preparing for the risks of self-improving AI

If AI models become capable of training themselves, the theoretical risks are severe — a scenario in which capabilities accelerate rapidly, containment fails and widespread harm follows.

METR chief executive Elizabeth Barnes wrote on Friday that, in her view, "any 'reasonable' civilization would clearly be taking things much more slowly and carefully with AI."

OpenAI's job posting offers a window into how the company is actively preparing for such a future. The role could involve defending OpenAI's models against data poisoning — attempts to corrupt an AI model through its training dataset — as well as developing tools to interpret models' reasoning and experimenting with models to assess their safety and potential dangers.

The posting also notes the researcher could "track progress toward automation of technical staff," including monitoring the use of AI coding tools.

OpenAI's Preparedness team is broadly responsible for preventing severe harms from AI. Other roles listed on the team include positions focused on automated red-teaming to test cybersecurity, as well as addressing biological and chemical risks and agentic AI threats.

"This is urgent, fast-paced work that has far-reaching implications for the company and for society," the Preparedness postings state.