AI Safety & Governance Full-time

Staff Red Team Engineer, Safeguards

Anthropic

Location

USA

Type

Full-time

Posted

Jan 25, 2026

Compensation

USD 300,000 – 320,000

Mission

What you will drive

Core responsibilities:

  • Conduct comprehensive adversarial testing across Anthropic's product surfaces, developing creative attack scenarios that combine multiple exploitation techniques
  • Research and implement novel testing approaches for emerging capabilities, including agent systems, tool use, and new interaction paradigms
  • Design and execute 'full kill chain' attacks that emulate real-world threat actors attempting to achieve specific malicious objectives
  • Build and maintain systematic testing methodologies and automated testing frameworks to enable continuous assessment at scale

Impact

The difference you'll make

This role helps ensure the safety of Anthropic's deployed AI systems and products by taking an adversarial approach to uncovering vulnerabilities before malicious actors can exploit them. That work contributes to the creation of reliable, interpretable, and steerable AI systems that are safe and beneficial for society.

Profile

What makes you a great fit

Required qualifications:

  • Demonstrated experience in penetration testing, red teaming, or application security
  • Strong technical skills in web application security, including hands-on expertise with security testing tools (Burp Suite, Metasploit, custom scripting frameworks, etc.)
  • A track record of discovering novel attack vectors and chaining vulnerabilities in creative ways
  • A public body of work such as CVEs, blog posts, or disclosed bug bounty reports
  • Experience with security testing tools and the ability to build custom automation
  • Strong written and verbal communication skills, with the ability to explain technical concepts to varied audiences

Benefits

What's in it for you

Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues. Annual salary range: $300,000 to $320,000 USD.

About

Inside Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society as a whole. The team consists of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.