Policy Manager, Harmful Persuasion
Anthropic
Location
USA
Type
Full-time
Posted
Jan 25, 2026
Compensation
USD 200000 – 200000
Mission
What you will drive
Core responsibilities:
- Develop and maintain comprehensive policy frameworks for harmful persuasion risks, especially in the context of election integrity, influence operations, and fraud
- Design clear, enforceable policy language that can be consistently applied by enforcement teams and translated into technical detection requirements
- Design and oversee execution of evaluations to assess the model's capability to leverage, produce and execute deceptive and harmful persuasive techniques
- Write and refine external-facing Usage Policy language that clearly communicates policy violations and restrictions to users and external stakeholders
Impact
The difference you'll make
This role helps ensure AI systems are not weaponized to undermine civic processes, exploit vulnerable populations, or degrade information ecosystems, directly contributing to the creation of safe and beneficial AI for society.
Profile
What makes you a great fit
Required qualifications:
- 5+ years of experience in policy development, trust & safety policy, or platform policy with working experience across election integrity, fraud/scams, coordinated inauthentic behavior, influence operations, or misinformation
- General knowledge of the global regulatory landscape around election integrity, platform regulation, and digital services accountability
- Strong policy writing skills with the ability to translate complex risk frameworks into clear, enforceable guidelines
- Experience designing policies and workflows that enable both clear human enforcement decision-making and technical implementation in ML classifiers and detection pipelines
Benefits
What's in it for you
Competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space for collaboration. Annual salary range: $245,000—$330,000 USD.
About
Inside Anthropic
Anthropic's mission is to create reliable, interpretable, and steerable AI systems, aiming for AI to be safe and beneficial for users and society as a whole.