AI Quality Evaluation Specialist

Posted 3 days ago by 1764553934

£450 per day
Undetermined
Remote
London, United Kingdom

Summary: The Multilingual AI Quality Evaluation Specialist role is a remote, contract position focused on enhancing multilingual AI quality for a leading audio streaming service. The specialist will design evaluation frameworks, calibrate AI tools, and ensure quality across diverse locales. The position requires a strong background in linguistics and machine learning evaluation methods. The contract runs for six months and offers a high-impact contribution to AI development.

Key Responsibilities:

  • Design & Implement: Create advanced multilingual evaluation frameworks and scoring rubrics (e.g., based on MQM, COMET).
  • Calibrate AI: Validate and fine-tune AI evaluation tools (like QUAIL/MetricX) against human gold standards.
  • Collaborate: Partner with ML Engineers to build and test linguistic data pipelines and synthetic data generation.
  • Ensure Global Quality: Analyze model outputs across locales to guarantee accuracy, fluency, and cultural fit for millions of global users.

Key Skills:

  • Expertise in LLM Evaluation or Machine Translation evaluation in multilingual settings.
  • Hands-on experience with quality frameworks like MQM, COMET, or designing multidimensional rubrics.
  • Background in Applied Linguistics, Computational Linguistics, or Language Quality Research.

Salary (Rate): £450 per day

City: London

Country: United Kingdom

Working Arrangements: Remote

IR35 Status: Undetermined

Seniority Level: Undetermined

Industry: IT

Detailed Description From Employer:

Multilingual AI Quality Evaluation Specialist (Contract, Remote)

Are you a linguistic expert with a passion for cutting-edge AI and data science?

Join a world-leading audio streaming and media service on a high-impact 6-month contract focused on defining the next generation of multilingual AI quality.

This remote role is ideal for a specialist who bridges linguistic nuance with machine learning evaluation methods.

You Will:

  • Design & Implement: Create advanced multilingual evaluation frameworks and scoring rubrics (e.g., based on MQM, COMET).
  • Calibrate AI: Validate and fine-tune AI evaluation tools (like QUAIL/MetricX) against human gold standards.
  • Collaborate: Partner with ML Engineers to build and test linguistic data pipelines and synthetic data generation.
  • Ensure Global Quality: Analyze model outputs across locales to guarantee accuracy, fluency, and cultural fit for millions of global users.

Required Experience:

  • Expertise in LLM Evaluation or Machine Translation evaluation in multilingual settings.
  • Hands-on experience with quality frameworks like MQM, COMET, or designing multidimensional rubrics.
  • Background in Applied Linguistics, Computational Linguistics, or Language Quality Research.

Shape the evaluation intelligence layer that underpins a world-class AI ecosystem.

If you find this role interesting, please apply here or send your CV to saisaranya.gummadi@randstaddigital.com.

Randstad Technologies is acting as an Employment Business in relation to this vacancy.