Energy Full-time

Senior Software Engineer, Observability

Crusoe

Posted

Feb 27, 2026

Location

Remote

Type

Full-time

Mission

What you will drive

Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability.

Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure.

About This Role:

We’re seeking a Senior Software Engineer to play a key role on our Observability team within the Cloud Infrastructure organization. This team owns the real-time observability platforms that underpin visibility, reliability, and operational insight across our cloud and data center infrastructure.

What You’ll Be Working On:

  • Maintain and manage core observability tools, including platforms for metrics, events, logs, and tracing.

  • Develop and operate data pipelines to move telemetry data from various sources to backend storage.

  • Manage large-scale data ingestion and storage requirements for high-volume environments.

  • Perform regular updates and software enhancements to ensure system stability and security.

  • Participate in a standard on-call rotation to address production issues and perform root cause analysis.

  • Work with other engineering teams to implement monitoring best practices and standardized tooling.

  • Contribute to the long-term technical roadmap for the company's internal infrastructure.

What You’ll Bring to the Team:

  • 5+ years of experience in software or systems engineering.

  • Proficiency in Java, Go, or Python for writing production-level code.

  • Practical experience managing Kubernetes clusters in a production environment.

  • Experience deploying and managing services using Helm and YAML-based configurations.

  • Ability to troubleshoot and resolve issues within distributed system architectures.

  • Experience participating in an on-call rotation for business-critical systems.

Bonus Points:

  • Experience with common observability tools such as Prometheus, Grafana, Loki, ClickHouse, or Elasticsearch.

  • Familiarity with Kafka or similar message queuing systems.

  • Experience using Terraform for infrastructure provisioning.

  • Knowledge of OpenTelemetry standards.

  • Familiarity with GPU-based infrastructure or machine learning workloads.

Benefits:

  • Industry-competitive pay

  • Restricted Stock Units in a fast-growing, well-funded technology company

  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents

  • Employer contributions to HSA accounts 

  • Paid Parental Leave 

  • Paid life insurance, short-term and long-term disability 

  • Teladoc 

  • 401(k) with a 100% match up to 4% of salary

  • Generous paid time off and holiday schedule

  • Cell phone reimbursement

  • Tuition reimbursement

  • Subscription to the Calm app

  • MetLife Legal

  • Company-paid commuter benefit: $300/month

Compensation Range

Compensation will be paid in the range of $172,000 to $209,000, plus bonus. Restricted Stock Units are included in all offers. Compensation will be determined by the applicant's knowledge, education, and abilities, as well as internal equity and alignment with market data.

Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.

About

Inside Crusoe

Crusoe transforms stranded energy into eco-friendly power for data centers, significantly reducing their environmental impact.