Software Engineer, Accelerators

OpenAI · San Francisco

$295k–380k/yr · On-site

PyTorch · GPU · TensorRT · Quantization · Latency · Throughput · Deep learning · Computer vision

ABOUT THE TEAM

OpenAI’s Hardware organization develops AI-native silicon and system-level solutions for the unique demands of advanced AI workloads. Building on efforts like Jalapeño, the team is developing future generations of AI-native silicon and tightly integrated systems to power the next generation of frontier models. By co-designing chips, systems, tools, and methodologies, the team helps deliver faster, more efficient, and production-ready hardware for OpenAI’s supercomputing platform.

ABOUT THE ROLE

On the Accelerators team, you will help OpenAI evaluate and bring up new compute platforms that can support large-scale AI training and inference. Your work will range from prototyping system software on new accelerators to enabling performance optimizations across our AI workloads. You’ll work across the stack, spanning both hardware and software: kernels, sharding strategies, scaling across distributed systems, and performance modeling. You’ll help adapt OpenAI’s software stack to non-traditional hardware and drive efficiency improvements in core AI workloads. This is not a compiler-focused role; rather, it bridges ML algorithms with system performance, especially at scale.

IN THIS ROLE, YOU WILL:

- Prototype and enable OpenAI’s AI software stack on new, exploratory accelerator platforms.
- Optimize large-scale model performance (LLMs, recommender systems, distributed AI workloads) for diverse hardware environments.
- Develop kernels, sharding mechanisms, and system scaling strategies tailored to emerging accelerators.
- Collaborate on optimizations at the model code level (e.g. PyTorch) and below to enhance performance on non-traditional hardware.
- Perform system-level performance modeling, debug bottlenecks, and drive end-to-end optimization.
- Work with hardware teams and vendors to evaluate alternatives to existing platforms and adapt the software stack to their architectures.
- Contribute to runtime impr
