AI Safety & Governance
Machine Learning Infrastructure Engineer
Gray Swan
Location
Remote, Global
Type
Full-time
Posted
Jan 05, 2022
Mission
What you will drive
- Build robust, scalable ML infrastructure for distributed inference and training that powers an AI security platform.
- Build and scale GPU inference systems with vLLM for high-throughput, low-latency LLM serving.
- Optimize performance through batching, caching, quantization, and hardware-specific strategies to maximize efficiency.
- Create deployment pipelines with automated testing, progressive rollouts, and instant rollbacks.
Impact
The difference you'll make
In this role you will build the infrastructure that powers an AI security platform, contributing to safer and more reliable AI systems.
Profile
What makes you a great fit
- Experience building and scaling ML infrastructure for distributed inference and training.
- Proficiency with GPU inference systems and vLLM for LLM serving.
- Skill with performance optimization techniques such as batching, caching, and quantization.
- Ability to create deployment pipelines with automated testing and observability systems.
Benefits
What's in it for you
No benefits information is provided in the job description.
About
Inside Gray Swan
Gray Swan develops an AI security platform, with a focus on robust, scalable ML infrastructure for AI safety applications.