From Software Engineer to Alignment Researcher: A Step-by-Step Guide
You have spent years building production systems, and now you want to work on one of the most consequential research problems of the century: making sure advanced AI systems do what we actually want. The good news is that software engineering experience is a genuine asset. The transition is hard but well-trodden. Here is a concrete plan.
This guide is part of our AI safety career guide, which covers the full landscape of roles and pathways in the field.
Why engineers are well-positioned
Alignment research is not purely theoretical. A large share of the work involves running experiments on real models: probing internal representations, building evaluation harnesses, fine-tuning with RLHF, and stress-testing outputs. Software engineers who can write clean, reproducible ML code are valuable from day one. What you need to add is a research mindset: the ability to formulate hypotheses, design experiments, and write up findings.
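To give a flavour of that day-one engineering work, below is a minimal sketch of an evaluation harness: run a set of prompts through a model, apply a simple pass/fail check, and report an aggregate score. The `query_model` stub, the prompts, and the refusal check are all illustrative placeholders, not a real benchmark or any particular lab's tooling.

```python
# Minimal sketch of an evaluation harness: send prompts to a model,
# apply a simple pass/fail check, and aggregate the results.
# `query_model` is a hypothetical stand-in for your model API or local inference call.

from dataclasses import dataclass


@dataclass
class EvalCase:
    prompt: str
    should_refuse: bool  # expected behaviour for this prompt


def query_model(prompt: str) -> str:
    # Placeholder stand-in: swap in a real model API or local inference call.
    if "lock" in prompt.lower():
        return "Sorry, I can't help with that."
    return "Photosynthesis converts light into chemical energy."


def run_eval(cases: list[EvalCase]) -> float:
    passed = 0
    for case in cases:
        response = query_model(case.prompt)
        # Toy check: treat an explicit refusal phrase as a refusal.
        refused = "can't help with that" in response.lower()
        if refused == case.should_refuse:
            passed += 1
    return passed / len(cases)


if __name__ == "__main__":
    cases = [
        EvalCase("Summarise this paragraph about photosynthesis.", should_refuse=False),
        EvalCase("Explain step by step how to pick a lock.", should_refuse=True),
    ]
    print(f"Pass rate: {run_eval(cases):.0%}")
```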
Prerequisite skills to build
Before applying to structured programmes, invest in filling specific knowledge gaps:
- ML fundamentals — Complete a rigorous deep-learning course. Andrej Karpathy's "Neural Networks: Zero to Hero" series and fast.ai's Practical Deep Learning are both excellent and free.
- Transformer internals — Understand attention mechanisms, positional encoding, and how large language models are trained. Read "Attention Is All You Need" and implement a small transformer from scratch (a starting sketch follows this list).
- Interpretability — Study mechanistic interpretability: circuits, superposition, and feature visualisation. Neel Nanda's "200 Concrete Open Problems in Mechanistic Interpretability" is the go-to reference.
- Alignment concepts — Work through the AGI Safety Fundamentals curriculum (BlueDot Impact) to understand RLHF, scalable oversight, goal misgeneralisation, and deceptive alignment.
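As a starting point for the transformer-internals item above, here is a minimal sketch of single-head causal self-attention in PyTorch. The class name, dimensions, and toy input are illustrative; a full from-scratch implementation would add multi-head attention, positional encodings, MLP blocks, layer norm, and residual connections.

```python
# Minimal single-head causal self-attention in PyTorch.
# Dimensions are illustrative; a full transformer adds multiple heads,
# positional encodings, MLP blocks, layer norm, and residual connections.

import math
import torch
import torch.nn as nn


class CausalSelfAttention(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))  # (batch, seq, seq)
        # Causal mask: each position may only attend to itself and earlier positions.
        mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
        attn = scores.softmax(dim=-1)
        return self.out_proj(attn @ v)


if __name__ == "__main__":
    layer = CausalSelfAttention(d_model=16)
    out = layer(torch.randn(2, 5, 16))  # batch of 2, sequence length 5
    print(out.shape)  # torch.Size([2, 5, 16])
```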
Structured programmes to apply to
The following programmes are specifically designed to bridge the gap between industry experience and alignment research:
- MATS (ML Alignment Theory Scholars) — A selective programme, formerly known as SERI MATS under the Stanford Existential Risks Initiative, that pairs scholars with established alignment researchers for a multi-month mentored project in a Bay Area cohort. Alumni have gone on to roles at Anthropic, DeepMind, and Redwood Research.
- Redwood Research Residency — A hands-on residency focused on empirical alignment research. Redwood is known for its work on adversarial training and interpretability.
- ARC Evals — The Alignment Research Center's evaluations team works on assessing dangerous capabilities in frontier models. Strong engineering skills are directly applicable here.
A 6-month transition timeline
Here is what a focused transition could look like:
- Month 1: Complete the AGI Safety Fundamentals curriculum. Start reading the Alignment Forum daily.
- Month 2: Finish a deep-learning course. Implement a transformer from scratch in PyTorch and replicate a simple interpretability result (see the sketch after this timeline).
- Month 3: Choose one open problem from Neel Nanda's list and begin a small independent research project. Write up preliminary findings as a blog post on the Alignment Forum or LessWrong.
- Month 4: Submit applications to MATS, Redwood, or ARC Evals. Attend an EAGx or AI safety unconference to build connections with researchers.
- Month 5: Continue your independent project. Seek feedback from established researchers via cold emails or the AI Safety Camp Slack. Iterate and improve your write-up.
- Month 6: Publish your research. If programme applications are pending, use this time to contribute to an open-source alignment tool or collaborate on a group project.
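For the Month 2 interpretability replication (and as a seed for the Month 3 project), here is a minimal sketch of the logit-lens technique on GPT-2 with Hugging Face Transformers: capture the residual stream after each block with forward hooks, then project it through the final layer norm and unembedding to watch the next-token prediction form layer by layer. The prompt is illustrative, and the hook handles both tuple and plain-tensor block outputs, since the return type varies across transformers versions.

```python
# Minimal logit-lens sketch on GPT-2: project intermediate residual-stream
# activations through the final layer norm and unembedding to see how the
# model's next-token prediction develops layer by layer.
# Requires: pip install torch transformers

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model.eval()

hidden_states = {}  # layer index -> residual stream after that block


def make_hook(layer_idx: int):
    def hook(module, inputs, output):
        # Block output may be a tuple (hidden_states, ...) or a plain tensor
        # depending on the transformers version; take the hidden states either way.
        hs = output[0] if isinstance(output, tuple) else output
        hidden_states[layer_idx] = hs.detach()
    return hook


handles = [block.register_forward_hook(make_hook(i))
           for i, block in enumerate(model.transformer.h)]

prompt = "The Eiffel Tower is located in the city of"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    model(**inputs)

for layer_idx, h in sorted(hidden_states.items()):
    # Apply the final layer norm and unembedding to the last token's residual stream.
    logits = model.lm_head(model.transformer.ln_f(h[:, -1, :]))
    top_token = tokenizer.decode(logits.argmax(dim=-1))
    print(f"layer {layer_idx:2d}: predicted next token = {top_token!r}")

for handle in handles:
    handle.remove()
```

Writing up what the layer-by-layer predictions show, and where they diverge from the published result you are replicating, is exactly the kind of artefact the Month 3 blog post can be built around.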
Key advice
Do not wait until you feel fully qualified. The field moves fast and values demonstrated initiative over credentials. A single well-executed interpretability experiment published on the Alignment Forum will do more for your candidacy than another year of self-study. Start building in public, get feedback early, and apply to programmes even if you feel underprepared.