Fellows Program, AI Safety
Anthropic
Posted
Apr 10, 2026
Location
Remote (US)
Type
Full-time
Compensation
Up to $200,200
Mission
What you will drive
- This is a 4-month empirical AI safety research fellowship with direct mentorship from Anthropic researchers.
- Research AI safety areas including scalable oversight, adversarial robustness, and mechanistic interpretability.
- Collaborate with your mentor to select and execute an empirical research project aligned with Anthropic's priorities.
- Leverage external infrastructure like open-source models and public APIs for your research work.
- Produce a public research output such as a paper or technical publication.
About
Inside Anthropic
Anthropic is a frontier AI research and product company, with teams working on alignment, policy, and security. We post specific opportunities at Anthropic that we think may be high impact. We do not necessarily recommend other positions at Anthropic. You can read about the risks of doing harm by working at a frontier AI company in our career review on the topic.