AI Safety & Governance · Full-time

Applied Researcher (Product)

Apollo Research

Posted

Dec 17, 2025

Location

UK

Type

Full-time

Compensation

$135,000

Mission

What you will drive

Core responsibilities:

  • Systematically collect and catalog coding agent failure modes from real-world instances, public examples, research literature, and theoretical predictions
  • Design and conduct experiments to test monitor effectiveness across different failure modes and agent behaviors
  • Build and maintain evaluation frameworks to measure progress on monitoring capabilities
  • Develop a comprehensive library of monitoring prompts tailored to specific failure modes (e.g., security vulnerabilities, goal misalignment, deceptive behaviors)

Impact

The difference you'll make

This role turns complex AI research into practical tools that reduce risks from AI, making agent safety accessible to customers at scale and directly improving real-world AI safety.

Profile

What makes you a great fit

Required skills and qualifications:

  • Passion for using empirical research to make AI systems safer in practice
  • Ability to translate theoretical AI risks into concrete detection mechanisms
  • Experience with rapid iteration and learning from data
  • Knowledge of AI safety, agent failures, and detection methodologies

Benefits

What's in it for you

Beyond the compensation listed above, no specific benefits information is mentioned in the posting.

About

Inside Apollo Research

Apollo Research focuses primarily on risks from loss of control in AI systems, particularly deceptive alignment (scheming). The team works on detecting, understanding, and mitigating these risks, and develops tools to prevent harms from widely deployed AI systems.