Senior Data Engineer – Princeton Accelerator

Princeton University, Bridging Divides Initiative

Location

Remote

Type

Contract

Posted

Jan 21, 2026

Mission

What you will drive

Core responsibilities:

  • Optimize existing Databricks pipelines for cost and performance
  • Expand ingestion scope to include YouTube transcripts, multi-media sampling, and new metadata sources
  • Design gold-layer datasets for researchers with ML readiness in mind
  • Establish engineering standards, CI/CD pipelines, and documentation practices

Impact

The difference you'll make

The datasets you help build will accelerate social science research on social media, helping researchers understand how platforms shape our collective information environment.

Profile

What makes you a great fit

Required qualifications:

  • Deep Databricks experience with PySpark, Delta Lake, cost optimization, and cluster tuning
  • Experience designing and building pipelines at scale (10+ TB)
  • Deep experience with CI/CD, testing, and maintainable systems
  • Clear communicator who can work with researchers and junior engineers

Benefits

What's in it for you

Competitive rate commensurate with experience. Remote work at ~30 hours/week from January through June 2026.

About

Inside Princeton University, Bridging Divides Initiative

The Accelerator at Princeton's School of Public and International Affairs (SPIA) is building a first-of-its-kind research platform—a living dataset of social media activity that helps researchers understand how platforms shape the information environment.