Summary: The role of ML/DQ Scientist involves leveraging machine learning techniques to enhance data quality programs by developing models for anomaly detection, drift monitoring, and pattern recognition. The position requires expertise in Python and MLflow, with a focus on integrating machine learning signals into existing data quality frameworks. The candidate will also be responsible for monitoring model performance and communicating findings to the data quality team. This is a long-term remote position aimed at improving data integrity through advanced analytics.
Key Responsibilities:
- Design and deploy anomaly detection models for numerical, categorical, and time-series data
- Implement statistical drift monitoring across pipeline runs and data partitions
- Build ML-based completeness prediction and consistency check models
- Integrate ML DQ signals into the broader DQ alerting framework
- Monitor model performance, retrain on new data patterns, and manage model lifecycle
- Document model behaviour and communicate anomaly signals to the DQ team
Key Skills:
- 4+ years in data science or ML engineering, with production model experience
- Proficient in Python, PySpark, and MLflow on Databricks
- Experience with anomaly detection, statistical process control, or data drift frameworks
- Familiarity with feature stores and MLOps practices
- Ability to explain model outputs to non-technical stakeholders
Salary (Rate): undetermined
City: undetermined
Country: undetermined
Working Arrangements: remote
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
Please review the requirement below and let us know if you are interested.
Position: ML/DQ Scientist
Location: Remote
Duration: Long-term
ML / DQ Scientist | ML / AI
Brings machine learning capabilities to the DQ programme. Builds anomaly detection, drift monitoring, and pattern-based models to catch data quality issues that rule-based checks miss.
Python / MLflow | Anomaly Detection | Databricks ML | Statistical Drift