Research Engineer - Pre-training
San Francisco, CA
We are a small team united by a shared belief in the future of Intelligence. At Verterra, you won't just work on Intelligence; you'll help define its future.
About the role
We are looking for a Pre-training Research Engineer to join our team. You will own the data and compute pipelines that transform raw pixels, depth, language and control signals into high-entropy pre-training fuel, at 10x the speed and 40x the cost-efficiency of incumbent stacks.
What you'll work on
- Scale the corpus. Build ingestion, deduplication and storage services that absorb petabytes of synthetic and real multimodal data each month without breaking a sweat.
- Design tokenizers & objectives. Create video-, depth- and action-aware token schemes plus masked, contrastive and predictive losses that help the model reason about causality.
- Optimize distributed training. Architect data-parallel + pipeline-parallel jobs across thousands of GPUs; tune I/O, NVLink/InfiniBand throughput and mixed-precision kernels for maximal steps/sec.
- Automate curriculum flow. Orchestrate the Synthetic → AR → Robotics progression so the model graduates from safe simulation to real-world footage with minimal human intervention.
- Close the loop with hardware. Co-design on-device compression codecs and edge pre-processors so every Insight Glasses user streams clean, loss-aware tensors straight into the training lake.
- Ship, measure, repeat. Stand up dashboards, alerts and eval suites that surface learning curves in near real time, letting you patch data gaps before the next nightly run.
- Work with Emma, our proprietary GPU programming language, to optimize the training pipeline for maximal throughput and efficiency.
- Work with the Kernel, Data and Research teams to benchmark new silicon, push upstream patches and land model-compatible ops within hours of a new architecture drop.
What we're looking for
- BS/MS (or equivalent experience) in CS, EE or a related field with a heavy focus on distributed systems or machine-learning infrastructure.
- 5+ years engineering large-scale data or training systems; fluent in Python & modern C++ with deep knowledge of PyTorch (DataLoader, TorchDynamo) or JAX/Flax.
- Production experience with Kubernetes, Ray, DeepSpeed, Megatron-LM or similar multi-node stacks; comfortable debugging NCCL, CUDA graphs and RDMA.
- Strong grasp of data-centric AI—dataset quality, active sampling, synthetic augmentation, bias & privacy safeguards.
- Bonus points for DSL / compiler work (MLIR, TVM), lossless video codecs, or prior contributions to foundation-model pre-training at the terabyte-plus scale.
- Startup DNA: bias-to-action, comfort with ambiguity, instinct to automate yourself out of boring tasks.
Compensation, benefits, and perks
- Annual salary: $225K–$445K
- 401(k) plan with 6% salary matching
- Generous health, dental and vision insurance for you and your dependents
- Unlimited paid time off
- Visa sponsorship and a relocation stipend to bring you to SF, where possible
- A small, fast-paced, highly focused team