Impact Careers Full-time

Research Engineer, Interpretability

Anthropic

Posted

Jan 20, 2026

Location

USA

Type

Full-time

Compensation

$315000 - $560000

Mission

What you will drive

  • Implement and analyze research experiments, both quickly in toy scenarios and at scale in large models
  • Set up and optimize research workflows to run efficiently and reliably at large scale
  • Build tools and abstractions to support rapid pace of research experimentation
  • Develop and improve tools and infrastructure to support other teams in using Interpretability's work to improve model safety

Impact

The difference you'll make

This role contributes to making AI systems safe and beneficial by developing mechanistic understanding of neural networks, which is essential for building reliable, interpretable, and steerable AI that can be trusted by users and society.

Profile

What makes you a great fit

  • 5-10+ years of experience building software
  • Highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive with Python
  • Some experience contributing to empirical AI research projects
  • Strong ability to prioritize and direct effort toward the most impactful work and comfortable operating with ambiguity and questioning assumptions

Benefits

What's in it for you

Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space for collaboration. Annual salary range: $315,000—$560,000 USD.

About

Inside Anthropic

Visit site →

Anthropic is a frontier AI research and product company, with teams working on alignment, policy, and security. We post specific opportunities at Anthropic that we think may be high impact. We do not necessarily recommend working at other positions at Anthropic. You can read concerns about doing harm by working at a frontier AI company in our career review on the topic.