Senior Machine Learning Engineer - Speech / Voice AI (remote)

Develop an in-house voice generation and audio delivery system to enhance accessibility and emotional engagement.
Build a text-to-speech capability that produces natural, empathetic voices for guided exercises and wellbeing content.
Implement multilingual functionality and customizable voice tones to support diverse user needs.
Enable dynamic personalization for user content preferences.
Integrate the audio system with the existing app and backend for real-time playback.
Create an inclusive, emotionally intelligent audio experience to support lasting behavioural wellbeing.

Key Skills:

Strong background in Machine Learning / Deep Learning with hands-on experience in speech or audio processing.
Experience fine-tuning or deploying modern TTS models (e.g. VITS, Bark, or FastSpeech2).
Proficiency in PyTorch (or similar) and comfort with optimizing GPU inference.
Experience deploying ML models to production and integrating via APIs.
Familiarity with AWS, GCP, or Azure for scalable deployment.
Understanding of speaker cloning or emotional prosody control (desirable).
Experience with multilingual TTS or phoneme alignment (desirable).
Interest in ethical AI and accessible, emotionally sensitive applications (desirable).

Salary (Rate): undetermined

City: Manchester

Country: United Kingdom

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Our client is a technology-enabled wellbeing platform that supports neurodiverse users and individuals with disabilities to thrive in education, work, and everyday life. They are looking to develop an in-house voice generation and audio delivery system to enhance accessibility and emotional engagement, and are searching for a remote ML Engineer to make it happen.

Contract length: 3 months

IR35 determination: Outside

Location: Fully remote

Our client offers businesses a personal productivity app featuring tools for task breakdown, priority-setting, and structured support to manage anxiety, procrastination, and executive dysfunction. The platform combines tailored learning resources, assistive technology guidance, and mental health content in one accessible space. It serves both students and professionals, helping them build resilience, independence, and sustainable wellbeing through behaviour-change frameworks.

Our client is looking for someone to develop an in-house voice generation and audio delivery system to enhance accessibility and emotional engagement.

Build a text-to-speech capability that produces natural, empathetic voices for guided exercises and wellbeing content.
Implement multilingual functionality and customizable voice tones to support diverse user needs.
Enable dynamic personalization so users receive content in voices and styles suited to their preferences.
Integrate the audio system seamlessly with the existing app and backend for real-time playback and consistency across devices.
Create an inclusive, emotionally intelligent audio experience that deepens user connection and supports lasting behavioural wellbeing.

Required Skills

Strong background in Machine Learning / Deep Learning with hands-on experience in speech or audio processing.
Experience fine-tuning or deploying modern TTS models (e.g. VITS, Bark, or FastSpeech2).
Proficiency in PyTorch (or similar) and comfort with optimizing GPU inference.
Experience deploying ML models to production and integrating via APIs.
Familiarity with AWS, GCP, or Azure for scalable deployment.

Desirable

Understanding of speaker cloning or emotional prosody control.
Experience with multilingual TTS or phoneme alignment.
Interest in ethical AI and accessible, emotionally sensitive applications.

This is an exciting opportunity to help shape an inclusive AI experience that brings empathy and accessibility to users around the world.

Robert Walters Operations Limited is an employment business and employment agency and welcomes applications from all candidates.
