Applied AI Inference Engineer
Baseten · San Francisco
ABOUT BASETEN

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $1.5B Series F https://www.baseten.co/blog/announcing-our-series-f/, led by Altimeter Capital, Conviction Partners, and Spark Capital. Join us and help build the platform engineers turn to to ship AI products.

THE ROLE

As an Applied AI Inference Engineer at Baseten, you will partner directly with customers to architect, build, and deploy high-scale production AI applications on Baseten’s platform. You’ll own the journey with customers from initial exploration to production deployment, translating ambiguous business goals into reliable, observable services with clear quality, latency, and cost outcomes.

This role is a great fit for entrepreneurial engineers who want a front-row view into how modern companies adopt AI at scale and who enjoy working across product, software development, performance engineering, and customer-facing implementations. To be clear, this is an engineering role with hands-on coding and software development that also includes aspects of product management, technical customer success, and pre-sales solution engineering mixed in.

EXAMPLE INITIATIVES

Take a look at these blog posts written by members of our Forward Deployed Engineering team:

- Forward Deployed Engineering on the frontier of AI https://www.baseten.co/blog/forward-deployed-engineering/
- The fastest, most accurate Whisper transcription https://www.baseten.co/blog/the-fastest-most-accurate-and-cost-efficient-whisper-transcription/
- Deploy production-ready model servers from Docker images https://www.baseten.co/blog/deploy-production-model-servers-from-docker-images/
- De