Application Guide: How to Apply for Software Engineer, ML Platform (ML Serving) at Zoox

🏢 About Zoox

Zoox is pioneering fully autonomous electric vehicles designed specifically for urban mobility rather than retrofitting existing cars. Its approach combines vehicle design, AI, and robotics to create purpose-built robotaxis for sustainable transportation. Working here means contributing to cutting-edge technology that could transform cities while reducing carbon emissions and congestion.

About This Role

This role focuses on building and scaling the off-vehicle inference service that powers Zoox's foundational models (LLMs/VLMs) and rider experience models. You'll lead the design and operation of robust ML serving infrastructure, directly impacting how autonomous vehicles process real-time data and make decisions. Your work enables both research innovation and reliable rider experiences through efficient model deployment.

💡 A Day in the Life

You might start the day by reviewing inference service metrics and addressing any overnight issues, then collaborate with ML researchers to understand the requirements for new model deployments. Your afternoon could involve designing improvements to the serving infrastructure or mentoring junior engineers on best practices. Throughout the day, you'd balance immediate operational needs with strategic infrastructure development.

🎯 Who Zoox Is Looking For

  • 4+ years of experience specifically in ML model serving infrastructure, not just general ML engineering
  • Hands-on production experience with GPU-accelerated inference tools such as Ray Serve, vLLM, TensorRT, or NVIDIA Triton
  • A track record of building large-scale serving systems that handle high QPS under low-latency requirements
  • Proficiency with AWS cloud infrastructure and Kubernetes for orchestrating ML workloads

📝 Tips for Applying to Zoox

1. Quantify your ML serving experience: mention specific QPS numbers, latency improvements, or scale metrics from previous roles.

2. Highlight any experience with real-time or safety-critical systems, which aligns with Zoox's autonomous vehicle focus.

3. Demonstrate understanding of both research collaboration (working with ML researchers) and production operations (monitoring, reliability).

4. Show mentorship experience, since the role mentions enabling junior engineers' growth.

5. Reference specific Zoox technologies or challenges in your application materials to show genuine interest.
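Tip 1 asks you to cite concrete numbers. If you need to pull those from old request logs, here is a minimal pure-Python sketch of nearest-rank percentiles and a throughput summary; the function names and field names are illustrative, not part of any serving framework.

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile of a list of request latencies (ms)."""
    if not latencies_ms:
        raise ValueError("no samples")
    ordered = sorted(latencies_ms)
    # Nearest-rank definition: the smallest value with at least p% of
    # samples at or below it.
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

def summarize(latencies_ms, window_s):
    """QPS plus p50/p99 latency over a window: the numbers worth citing."""
    return {
        "qps": len(latencies_ms) / window_s,
        "p50_ms": percentile(latencies_ms, 50),
        "p99_ms": percentile(latencies_ms, 99),
    }
```

For example, `summarize(request_latencies, 60.0)` over a one-minute window gives the kind of "X QPS at Y ms p99" figure that tip 1 recommends putting on your resume.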

✉️ What to Emphasize in Your Cover Letter

  • Your experience with the GPU-accelerated inference tools named in the requirements (Ray Serve, vLLM, TensorRT, etc.)
  • Examples of building reliable, low-latency serving systems for real-time applications
  • How you've collaborated with cross-functional teams (ML researchers, data engineers) in previous roles
  • Your approach to mentoring junior engineers and contributing to team growth


🔍 Research Before Applying

To stand out, make sure you've researched:

  • Zoox's specific autonomous vehicle technology stack and how ML models are used in their system
  • The company's approach to safety and reliability in autonomous systems
  • Recent technical blog posts or publications from Zoox engineers about their ML infrastructure
  • How Zoox's robotaxi service operates in current deployment cities

💬 Prepare for These Interview Topics

Based on this role, you may be asked about:

1. Technical deep dive into your experience with specific ML serving tools (Ray Serve, vLLM, TensorRT, or Triton)
2. How you would design a fault-tolerant inference service for autonomous vehicle applications
3. Scenario: optimizing model serving for both LLMs/VLMs and traditional ML models simultaneously
4. Experience with monitoring and observability for ML serving systems in production
5. Approach to mentoring junior engineers while maintaining high-velocity development

⚠️ Common Mistakes to Avoid

  • Focusing only on model training experience without emphasizing serving/inference expertise
  • Being vague about the specific tools named in the requirements (Ray Serve, vLLM, TensorRT, Triton)
  • Not demonstrating understanding of production concerns like latency, reliability, and monitoring for ML serving

📅 Application Timeline

This position is open until filled, but we recommend applying as soon as possible: roles at mission-driven organizations tend to fill quickly.

Typical hiring timeline:

1. Application Review: 1-2 weeks
2. Initial Screening: phone call or written assessment
3. Interviews: 1-2 rounds, usually virtual
4. Offer: congratulations!

Ready to Apply?

Good luck with your application to Zoox!