Software Engineer - AI Inference
San Francisco, CA
We are a small team with a shared belief in the future of Intelligence. At Verterra, you won't just be working on Intelligence — you'll be helping define its future.
About the role
We are looking for a Software Engineer - AI Inference to join our team. You will design and build the inference pipeline for our AI models.
What you'll work on
- Build a single inference stack that runs our Foundational World Model on datacenter-scale NVIDIA/AMD GPUs and on-device Qualcomm Snapdragon XR silicon, keeping the “10x faster / 40x cheaper” promise the research team already hit in training
- Extend the Emma compiler toolchain to emit TensorRT, ROCm/MIGraphX and QNN binaries from one IR; own autotuning, quantization and mixed-precision kernels for sub-10 ms latency
- Design multi-tenant serving infrastructure (Triton + Kubernetes + gRPC) that can burst from our on-prem clusters to the public cloud without code changes
- Embed telemetry hooks that turn every user request into new synthetic edge cases, feeding the data flywheel and improving the model in real time
- Partner with Kernel, Data and Research teams to benchmark new silicon, push upstream patches and land model-compatible ops within hours of a new architecture drop
What we're looking for
- 5+ years engineering high-performance ML inference or real-time graphics pipelines; expert in modern C++ and Python
- Deep knowledge of CUDA, ROCm/HIP and GPU memory hierarchies; experience writing or extending compiler passes (TVM, XLA, MLIR, or similar)
- Hands-on experience shipping models to edge devices (Qualcomm DSP/Hexagon, ARM NEON, or Metal/NNAPI), balancing power, thermals and user latency
- Familiarity with distributed serving (Triton, Ray, KServe), observability and autoscaling in containerized environments
- Thrives in an ambiguous, fast-moving startup; credentials optional—ability, curiosity and impact mandatory
Compensation, benefits, and perks
- Annual salary: $275K
- 401(k) plan with 6% salary matching
- Generous health, dental and vision insurance for you and your dependents
- Unlimited paid time off
- Visa sponsorship and a relocation stipend to bring you to SF, where possible
- A small, fast-paced, highly focused team