Senior Research Engineer - Voice

Remote · Senior

Quantization, Latency, DPO, Distillation, PyTorch, Deep learning, NLP, GPU, Throughput, TensorRT, LLM-as-judge, RLHF, Triton, TensorFlow, Multi-agent, Orchestration

Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London, with offices and teams across Europe and the US. As AI continues to shape the way we live and work, Synthesia develops products to enhance visual communication and enterprise skill development, helping people work better and stay at the center of successful organizations.

Following our recent Series E funding round, where we raised $200 million, our valuation stands at $4 billion. Our total funding exceeds $530 million from premier investors including Accel, NVentures (Nvidia's VC arm), Kleiner Perkins, GV, and Evantic Capital, alongside the founders and operators of Stripe, Datadog, Miro, and Webflow.

WHAT YOU'LL DO AT SYNTHESIA

As a Research Engineer you will join a team of 40+ Researchers and Engineers within the R&D Department working on cutting-edge challenges in the Generative AI space, with a focus on creating high-quality, expressive and real-time synthetic voices. Within the team you’ll have the opportunity to work on the applied side of our research efforts and directly impact our solutions that are used worldwide by over 60,000 businesses. If you are an expert in ML, LLMs, speech generation, conversational models, this is your chance to make a global impact.

You will join our Audio Post-Training Team, which works on generative speech and voice synthesis, ensuring our in-house voice models reach production-level quality, speed, and robustness.

Typical projects include:

- Develop and evaluate streaming and speech-to-speech systems, enabling low-latency, interactive voice synthesis.
- Adapt models for new conditioning inputs (emotion, speed, prosody, speaker control, etc.).
- Implement post-training optimization techniques (quantization, pruning, distillation) to improve efficiency and latency in real-time speech generation.
- Integrate and test novel architectures, such as neural codec
