AI Safety & Governance

Fellows Program, AI Safety

Anthropic

Posted: Apr 10, 2026

Location: Remote (US)

Type: Full-time

Compensation: Up to $200,200

Mission

What you will drive

  • Join a 4-month empirical AI safety research fellowship with direct mentorship from Anthropic researchers.
  • Conduct research in AI safety areas including scalable oversight, adversarial robustness, and mechanistic interpretability.
  • Collaborate with your mentor to select and execute an empirical research project aligned with Anthropic's priorities.
  • Leverage external infrastructure, such as open-source models and public APIs, for your research.
  • Produce a public research output, such as a paper or technical publication.


About

Inside Anthropic

Anthropic is a frontier AI research and product company with teams working on alignment, policy, and security. We post specific opportunities at Anthropic that we think may be high impact. We do not necessarily recommend other positions at Anthropic. You can read about concerns regarding doing harm by working at a frontier AI company in our career review on the topic.