AI Safety & Governance

Request for Proposals, AI Interpretability (2026)

Schmidt Sciences

Posted

Mar 18, 2026

Location

Remote

Type

Full-time

Compensation

Up to $9999999

Deadline

May 26, 2026

Mission

What you will drive

  • Receive funding to develop interpretability methods that detect deceptive behaviours in LLMs and steer their reasoning to eliminate these behaviours.
  • Develop tools for detecting deceptive behaviours where model outputs contradict internal representations.
  • Create steering methods for intervening on model truthfulness using mechanistic understanding.
  • Apply detection and steering techniques to real-world use cases and human-AI teams.
  • Evaluate methods on realistic scenarios beyond academic benchmarks to prove generalisation.
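The detect-then-steer pipeline described above can be illustrated with a toy linear probe on activation vectors. Everything below is a hypothetical sketch: the synthetic data, the difference-of-means probe, and the `alpha` steering scale are illustrative assumptions, not methods specified by the RFP.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for hidden states: synthetic "truthful" and "deceptive"
# activations that differ along one (unknown) direction.
d = 16
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)

truthful = rng.normal(size=(200, d)) + 2.0 * true_dir
deceptive = rng.normal(size=(200, d)) - 2.0 * true_dir

# 1. Detection: a difference-of-means probe recovers the direction
#    separating the two classes of activations.
probe = truthful.mean(axis=0) - deceptive.mean(axis=0)
probe /= np.linalg.norm(probe)

def detect(h):
    """Score > 0 suggests a 'truthful' activation under this toy probe."""
    return float(h @ probe)

# 2. Steering: shift an activation along the probe direction.
def steer(h, alpha=4.0):
    return h + alpha * probe

h = deceptive[0]
assert detect(h) < detect(steer(h))  # steering raises the truthfulness score
```

In a real system the activations would come from a transformer's residual stream rather than a Gaussian toy model, and the probe would typically be a trained classifier; the sketch only shows the shape of the detect/steer loop.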

Profile

What makes you a great fit

  • Track record in LLM interpretability research, ideally touching on deception or truthfulness.
  • Experience building tools that compare model outputs against internal representations.
  • Familiarity with mechanistic steering or activation-intervention methods.
  • Interest in applying detection and steering techniques to real-world use cases and human-AI teams.
  • Commitment to evaluating methods on realistic scenarios beyond academic benchmarks.

About

Inside Schmidt Sciences


Schmidt Sciences is a philanthropic organisation dedicated to advancing science and technology. Its programs span AI and Advanced Computing, Astrophysics and Space, Biosciences, Climate, and Science Systems.