← All jobs

Senior Machine Learning Engineer

Cresta · Canada (Remote)

Remote Senior
Hugging FaceRAGEmbeddingsMulti-agentOrchestrationLatencyPyTorchTensorFlowNLPVertex AIRerankingTool useLLM-as-judgeEval harnessesThroughput

Cresta unlocks the true potential of the customer experience, turning every conversation into a competitive advantage. Cresta’s unified AI platform combines conversational AI agents, real-time human agent augmentation, and comprehensive conversation intelligence to drive revenue and efficiency gains across every channel. The world’s leading companies, including United Airlines, Cox Communications, and Marriott, use Cresta to power world-class customer experiences every day.  Born from the Stanford AI Lab, Cresta has raised more than $270 million from the world’s leading investors, including a16z, Greylock, and Sequoia. Cresta’s leadership includes some of the leading minds in AI today. Our CEO, Ping Wu , founded and led Google's Contact Center AI and Vertex AI platforms before joining Cresta to build the future of AI-driven customer experiences. Over the next few years, AI is going to redefine how people all over the world interact with businesses every day. Come build that future at Cresta. About the role: Machine Learning Engineers at Cresta work across several high-impact AI initiatives. Final team placement is determined based on experience, strengths, and business needs. Current focus areas include: Agentic Assist:  Lead and build next-generation agentic AI systems that augment contact center agents in real time. This track requires strong pre-LLM ML foundations, deep expertise in LLMs and modern prompting techniques, a rapid prototyping mindset, and a proven ability to translate cutting-edge research into scalable, production-grade systems. Agent & System Quality:  Design evaluation frameworks and improve the reliability, robustness, and performance of LLM-powered agents. This includes diagnosing and mitigating failure modes such as hallucinations, retrieval errors, tool misuse, context drift, prompt brittleness, and multi-step reasoning breakdowns, while defining measurable quality metrics (e.g., accuracy, faithfulness, task completion, lat

Apply on company site →