Research Scientist - Speech
Company: Zoom
Location: Seattle
Posted on: April 5, 2026
Job Description:
What you can expect
We are looking for a Research Scientist with
a solid background in speech recognition, speech synthesis, and
speech processing. You will build advanced speech understanding
models on large-scale datasets, transforming speech into human- and
LLM-readable text to fulfill Zoom’s vision of seamless
conversation-to-task completion. This role will also have you
collaborating with cross-functional teams, including product,
science and engineering teams, to deliver high-impact projects from
the ground up.

About the Team
Zoom's AI Speech Team is developing
speech technologies to improve Zoom's conversational AI experience.
Our work contributes to Zoom AI Companion, Zoom Meetings, Zoom
Contact Center, Zoom Phone, and Zoom Revenue Accelerator. You will
develop novel solutions in automatic speech recognition (ASR),
text-to-speech (TTS), voice agents, speech-to-speech translation,
and speech-focused large language models (LLMs) to transform
conversations into actionable tasks for users worldwide.

Responsibilities
Developing state-of-the-art speech understanding
models on large-scale datasets for Zoom products, including ASR,
TTS, voice agents, speech-to-speech translation, and speech LLMs.
Devising novel techniques where off-the-shelf solutions are not
available. Demonstrating technical judgment across the entire
development cycle, including data collection, model prototyping,
training, optimization, and evaluation. Collaborating with
cross-functional teams, including product, science and engineering
teams, to deliver high-impact projects from the ground up.
Mentoring and providing technical guidance to junior team members.

What we’re looking for
Possess a Master's degree in Computer
Science, Electrical Engineering or related fields with 5 years of
experience. Display knowledge of deep learning and hands-on
programming skills in Python and shell scripting, along with
familiarity with ML frameworks such as PyTorch and TensorFlow.
Demonstrate experience
in speech recognition, speech processing, natural language
processing or related fields in academic research or industry
settings. Have domain expertise in one or more of the following
areas: modern end-to-end ASR architectures, TTS and voice cloning,
voice agents and conversational AI, speech-to-speech translation,
speech LLMs, language modeling, decoding algorithms,
personalization and adaptation, semi-/self-supervised learning,
multilingual and robust systems, LLM-integrative speech models.
Demonstrate experience with speech toolkits and libraries such as
Kaldi/k2, ESPnet, NeMo, TorchAudio, SpeechBrain or similar
frameworks. Have experience with large-scale data processing and
model training.

Salary Range or On Target Earnings:
Minimum: $177,100.00
Maximum: $387,500.00

In addition to the base salary and/or OTE listed, Zoom has a Total
Direct Compensation philosophy that takes into consideration base
salary, bonus and equity value.
Note: Starting pay will be based on a number of factors and
commensurate with qualifications & experience. We also have a
location-based compensation structure; there may be a different
range for candidates in this and other locations.

At Zoom, we offer
a window of at least 5 days for you to apply because we believe in
giving you every opportunity. Below is the potential closing date,
just in case you want to mark it on your calendar. We look forward
to receiving your application!

Anticipated Position Close Date: 04/08/26

Ways of Working
Our structured hybrid approach is centered
around our offices and remote work environments. The work style of
each role (Hybrid, Remote, or In-Person) is indicated in the job
description/posting.

Benefits
As part of our award-winning
workplace culture and commitment to delivering happiness, our
benefits program offers a variety of perks, benefits, and options
to help employees maintain their physical, mental, emotional, and
financial health; support work-life balance; and contribute to
their community in meaningful ways.

About Us
Zoomies help people stay connected so they
can get more done together. We set out to build the best
collaboration platform for the enterprise, and today help people
communicate better with products like Zoom Contact Center, Zoom
Phone, Zoom Events, Zoom Apps, Zoom Rooms, and Zoom Webinars. We’re
problem-solvers, working at a fast pace to design solutions with
our customers and users in mind. Find room to grow with
opportunities to stretch your skills and advance your career in a
collaborative, growth-focused environment.

Our Commitment
At Zoom,
we believe great work happens when people feel supported and
empowered. We’re committed to fair hiring practices that ensure
every candidate is evaluated based on skills, experience, and
potential. If you require an accommodation during the hiring
process, let us know—we’re here to support you at every step. If
you need assistance navigating the interview process due to a
medical disability, please submit an Accommodations Request Form
and someone from our team will reach out soon. This form is solely
for applicants who require an accommodation due to a qualifying
medical disability. Non-accommodation-related requests, such as
application follow-ups or technical issues, will not be
addressed.
Keywords: Zoom, Sammamish, Research Scientist - Speech, IT / Software / Systems, Seattle, Washington