Summary: The Multilingual AI Quality Evaluation Specialist is a remote, six-month contract role focused on raising multilingual AI quality for a leading audio streaming service. The specialist will design evaluation frameworks, calibrate AI evaluation tools, and safeguard quality across diverse locales. The role requires a strong background in linguistics and machine learning evaluation methods and offers a high-impact contribution to AI development.
Salary (Rate): £450 per day
City: London
Country: United Kingdom
Working Arrangements: Remote
IR35 Status: Undetermined
Seniority Level: Undetermined
Industry: IT
Multilingual AI Quality Evaluation Specialist (Contract and Remote)
Are you a linguistic expert with a passion for cutting-edge AI and data science?
Join a world-leading audio streaming and media service on a high-impact 6-month contract focused on defining the next generation of multilingual AI quality.
This remote role is ideal for a specialist who bridges linguistic nuance with machine learning evaluation methods.
You Will:
- Design & Implement: Create advanced multilingual evaluation frameworks and scoring rubrics (e.g., based on MQM, COMET).
- Calibrate AI: Validate and fine-tune AI evaluation tools (like QUAIL/MetricX) against human gold standards (see the calibration sketch after this list).
- Collaborate: Partner with ML Engineers to build and test linguistic data pipelines and synthetic data generation.
- Ensure Global Quality: Analyze model outputs across locales to guarantee accuracy, fluency, and cultural fit for millions of global users.
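To make the calibration work concrete, here is a minimal sketch of the kind of metric-vs-human validation the role involves. It assumes the open-source Unbabel COMET library (pip install unbabel-comet) and SciPy; the checkpoint name is a public WMT22 model, and the segments and human scores below are illustrative placeholders, not real gold-standard data.

```python
# Minimal sketch: score MT segments with COMET, then check how well the
# automatic metric tracks human judgments. Data here is illustrative.
from comet import download_model, load_from_checkpoint
from scipy.stats import pearsonr

# Segment-level evaluation data: source, machine translation, reference.
data = [
    {"src": "Hallo Welt!", "mt": "Hello world!", "ref": "Hello, world!"},
    {"src": "Wie geht es dir?", "mt": "How goes it to you?", "ref": "How are you?"},
    {"src": "Bis morgen.", "mt": "Until tomorrow.", "ref": "See you tomorrow."},
]
# Hypothetical human adequacy judgments (e.g. 0-100 direct assessment).
human_scores = [95.0, 35.0, 80.0]

model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
output = model.predict(data, batch_size=8, gpus=0)  # gpus=0 runs on CPU

# Segment-level Pearson correlation: does the metric track the humans?
r, p = pearsonr(output.scores, human_scores)
print(f"COMET system score: {output.system_score:.3f}")
print(f"Pearson r vs. human judgments: {r:.3f} (p={p:.3f})")
```

Segment-level correlation of this kind is one common way to check whether an automatic metric can be trusted before it replaces or supplements human review.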
Required Experience:
- Expertise in LLM Evaluation or Machine Translation evaluation in multilingual settings.
- Hands-on experience with quality frameworks like MQM, COMET, or designing multidimensional rubrics (a rubric-scoring sketch follows this list).
- Background in Applied Linguistics, Computational Linguistics, or Language Quality Research.
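On the rubric side, here is a minimal sketch of MQM-style scoring. The error categories, the severity weights (minor = 1, major = 5, critical = 25), and the per-100-words normalization loosely follow common MQM practice but are assumptions for illustration; a real project would define its own typology and weights.

```python
# Minimal sketch of MQM-style segment scoring. Weights and categories
# are illustrative assumptions, not a definitive MQM implementation.
from dataclasses import dataclass

SEVERITY_WEIGHTS = {"minor": 1.0, "major": 5.0, "critical": 25.0}

@dataclass
class ErrorAnnotation:
    category: str   # e.g. "accuracy/mistranslation", "fluency/grammar"
    severity: str   # "minor" | "major" | "critical"

def mqm_score(errors: list[ErrorAnnotation], word_count: int) -> float:
    """Normalized quality score: 100 minus penalty points per 100 words."""
    penalty = sum(SEVERITY_WEIGHTS[e.severity] for e in errors)
    return 100.0 - (penalty / word_count) * 100.0

# Example: two annotated errors in a 50-word segment.
errors = [
    ErrorAnnotation("accuracy/mistranslation", "major"),
    ErrorAnnotation("fluency/punctuation", "minor"),
]
print(f"Segment quality: {mqm_score(errors, word_count=50):.1f}")  # 88.0
```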
Shape the evaluation intelligence layer that underpins a world-class AI ecosystem.
If you find this role interesting, please apply here or send your CV to Sai Saranya Gummadi at saisaranya.gummadi@randstaddigital.com.
Randstad Technologies is acting as an Employment Business in relation to this vacancy.