
Staff Software Engineer, Inference Infrastructure

Cohere · San Francisco

Hybrid · Staff
GPU, Latency, Throughput, NLP, Kubernetes, AWS, Azure, GCP, C++, Docker, Deep learning, Python, Go, Rust

Who are we?

Cohere is the leading security-first enterprise AI company. We build cutting-edge foundation AI models and end-to-end products that are designed to solve real-world business problems. We’re training and deploying frontier models for enterprises that are building AI systems. We believe that our work is instrumental to the widespread adoption of AI, and we are looking for folks who want to be part of that.

We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are all passionate about their craft. We are a global technology company co-headquartered in Toronto and San Francisco, with key offices in London, New York City, Montreal, Seoul, Germany and Paris. Join us!

WHY THIS ROLE?

Are you energized by building high-performance, scalable and reliable machine learning systems? Do you want to help define and build the next generation of AI platforms powering advanced NLP applications? We are looking for Members of Technical Staff to join the Model Serving team at Cohere. The team is responsible for developing, deploying, and operating the AI platform delivering Cohere's large language models through easy-to-use API endpoints. In this role, you will work closely with many teams to deploy optimized NLP models to production in low-latency, high-throughput, and high-availability environments. You will also get the opportunity to interface with customers and create customized deployments to meet their specific needs.

You may be a good fit if you have:

- 5+ years of engineering experience running production infrastructure at a large scale
- Experience designing large, highly available distributed systems with Kubernetes, and GPU workloads on those clusters
- Experience with Kubernetes dev and production coding and support
- Experience with GCP, Azure, AWS, OCI, multi-cloud on
