Production Engineer, Support tooling (Tooling and Frameworks)

CoreWeave · Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA

On-site
Observability, Kubernetes, Python, Go, Docker, CI/CD, AWS, GCP, TypeScript, JavaScript, Deep learning, NLP

CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at www.coreweave.com.

About the role

The Senior Production Engineering team sits at the heart of CoreWeave’s reliability efforts. In this role, you’ll partner closely with our Support/CX teams to build, operate, and evolve internal tooling that enables a “Direct‑to‑Expert” support model at scale. You’ll define and ship AI‑assisted workflows, self‑service diagnostics, and platform integrations that reduce time‑to‑resolution and improve customer experience across our cloud.

What you’ll do

- Design, build, and own support-facing tools for case triage, intelligent routing, and expert engagement, integrating with incident and change management workflows.
- Develop AI‑powered assistants and automations that accelerate root‑cause discovery, knowledge retrieval, and resolution quality.
- Create and maintain dashboards, alerts, and signals that surface tooling issues early; integrate observability into new tooling to reduce MTTR.
- Build self-service and guided diagnostics that empower Support/CX to resolve common issues and collect high‑quality context for escalations.
- Codify reliability and support practices into services, APIs, and Kubernetes-native controllers/operators where appropriate.
- Partner with engineering leadership and internal stakeholders to prioritize roadmap initiatives, land adoption, and measure business impact.
- Participate in an on‑call rotation for the tooling you own.

What you’ve worked on (Minimum qualifications)

- 4+ years of software or infrastructure en
