Research Engineer - Pre-training
San Francisco, CA
We are a small team united by a shared belief in the future of Intelligence. At Verterra, you won't just work on Intelligence; you'll help define its future.
About the role
We are looking for a Pre-training Research Engineer to join our team. You will own the data and compute pipelines that transform raw pixels, depth, language and control signals into high-entropy pre-training fuel, at 10x the speed and 40x the cost-efficiency of incumbent stacks.
What you'll work on
- Scale the corpus. Build ingestion, deduplication and storage services that absorb petabytes of synthetic and real multimodal data each month without breaking a sweat.
- Design tokenizers & objectives. Create video-, depth- and action-aware token schemes plus masked, contrastive and predictive losses that help the model reason about causality.
- Optimize distributed training. Architect data-parallel + pipeline-parallel jobs across thousands of GPUs; tune I/O, NVLink/InfiniBand throughput and mixed-precision kernels for maximal steps/sec.
- Automate curriculum flow. Orchestrate the Synthetic → AR → Robotics progression so the model graduates from safe simulation to real-world footage with minimal human intervention.
- Close the loop with hardware. Co-design on-device compression codecs and edge pre-processors so every Insight Glasses user streams clean, loss-aware tensors straight into the training lake.
- Ship, measure, repeat. Stand up dashboards, alerts and eval suites that surface learning curves in near real time, letting you patch data gaps before the next nightly run.
- Work with Emma, our proprietary GPU programming language, to optimize the training pipeline for maximal throughput and efficiency.
- Work with the Kernel, Data and Research teams to benchmark new silicon, push upstream patches and land model-compatible ops within hours of a new architecture drop.
What we're looking for
- BS/MS (or equivalent experience) in CS, EE or a related field with a heavy focus on distributed systems or machine-learning infrastructure.
- 5+ years engineering large-scale data or training systems; fluent in Python & modern C++ with deep knowledge of PyTorch (DataLoader, TorchDynamo) or JAX/Flax.
- Production experience with Kubernetes, Ray, DeepSpeed, Megatron-LM or similar multi-node stacks; comfortable debugging NCCL, CUDA graphs and RDMA.
- Strong grasp of data-centric AI—dataset quality, active sampling, synthetic augmentation, bias & privacy safeguards.
- Bonus points for DSL / compiler work (MLIR, TVM), lossless video codecs, or prior contributions to foundation-model pre-training at the terabyte-plus scale.
- Startup DNA: bias-to-action, comfort with ambiguity, instinct to automate yourself out of boring tasks.
Compensation, benefits, and perks
- Annual salary: $225K–$445K
- 401(k) plan with 6% salary matching
- Generous health, dental and vision insurance for you and your dependents
- Unlimited paid time off
- Visa sponsorship and a relocation stipend to bring you to SF, where possible
- A small, fast-paced, highly focused team