AI Safety & Governance

Machine Learning Infrastructure Engineer

Gray Swan

Location

Remote, Global

Type

Full-time

Posted

Jan 05, 2022

Mission

What you will drive

  • Build robust, scalable ML infrastructure for distributed inference and training that powers an AI security platform.
  • Build and scale GPU inference systems with vLLM for high-throughput, low-latency LLM serving.
  • Optimize performance through batching, caching, quantization, and hardware-specific strategies to maximize efficiency.
  • Create deployment pipelines with automated testing, progressive rollouts, and instant rollbacks.

Impact

The difference you'll make

In this role you will build the infrastructure that powers an AI security platform, contributing to safer and more reliable AI systems.

Profile

What makes you a great fit

  • Experience building and scaling ML infrastructure for distributed inference and training.
  • Proficiency with GPU inference systems and vLLM for LLM serving.
  • Skill in performance optimization techniques such as batching, caching, and quantization.
  • Ability to create deployment pipelines with automated testing and observability systems.

Benefits

What's in it for you

No benefits information provided in the job description.

About

Inside Gray Swan

Gray Swan develops an AI security platform, focusing on building robust and scalable ML infrastructure for AI safety applications.