Full Stack LLM Engineer

Cerebras · Toronto Office

On-site
GPU, PyTorch, TensorFlow, Deep learning, Python, C++, Computer vision, Quantization, Latency, Throughput

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. This architecture allows Cerebras to deliver industry-leading training and inference speeds, over 10 times faster than GPU-based hyperscale cloud inference services. This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

Cerebras works with the leading model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras (https://openai.com/index/cerebras-partnership/) to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.

About the Role

We are seeking a versatile and experienced engineer to join our Inference Core Model Bringup team. This team is responsible for rapidly bringing up state-of-the-art open-source models (such as LLaMA and Qwen) and customer-provided proprietary models on our Cerebras CSX systems. Success in this role requires a system-minded generalist who thrives in fast-paced bringup environments and is comfortable working across the entire Cerebras software stack. Your work will play a critical role in achieving unprecedented levels of performance, efficiency, and scalability for AI applications.

Responsibilities

- Contribute to the end-to-end bringup of ML models on Cerebras CSX systems.
- Work across the stack: model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning.
- Debug performance and correctness issues spanning model code, compiler IRs, runtime behavior, and hardware utilization.
- Propose and prototype improvements across tools, APIs, or automation flows to accelerate future bringups.

Skills & Qualifications

- Bachelor's, Master's, or PhD in Computer Science, Engineering, or a related field.
- Comfort navigating the full AI toolchain: Python modeling cod
