Summary: We are seeking an experienced MLOps Engineer to focus on the deployment, monitoring, and maintenance of machine learning models in production environments. This role is essential for ensuring the reliability and performance of ML platforms, managing API endpoints, and overseeing model deployment workflows. The successful candidate will not be involved in model development or end-user support but will play a critical role in seamless integration and scalability.
Salary (Rate): undetermined
City: LONDON
Country: undetermined
Working Arrangements: undetermined
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
Role Summary
We are seeking an experienced MLOps Engineer to join our team, focusing on the deployment, monitoring, and maintenance of machine learning models in production environments. This role does not involve model development or end-user support but is critical to ensuring the reliability and performance of our ML platforms. The successful candidate will also be responsible for managing API endpoints and overseeing model deployment workflows to ensure seamless integration and scalability.
Key Responsibilities
Platform Operations & Monitoring
- Monitor ML model endpoints and overall platform health using tools like Grafana and Domino Data Lab.
- Respond to incidents and alerts, apply code fixes, manage incidents internally, and manage changes through ServiceNow.
- Interface directly with Domino Data Lab support to resolve model platform-related issues.
Model Deployment into Production
- Deploy and maintain ML models in production environments.
- Ensure models are properly integrated into automated pipelines and meet standards.
Pipeline Maintenance
- Collaborate with data scientists and engineers to ensure smooth handoff from model development to production.
- Maintain and support ML pipelines, ensuring stability and scalability.
- Continuously optimize pipeline performance, resource usage, and automation.
Automation & Tooling
- Implement automation for deployment and monitoring tasks.
- Contribute to platform improvements.
Required Skills & Experience
- Extensive experience in Python programming.
- Strong experience with ML model deployment and production monitoring.
- Working knowledge of core data science concepts, such as model evaluation metrics, overfitting, data drift, and feature importance.
- Proficiency in AWS services such as S3 and Redshift.
- Experience with Grafana for monitoring and alerting.
- Hands-on experience with the Domino Data Lab platform is desirable.
- Solid understanding of CI/CD pipelines, version control, containerization, and orchestration.
- Ability to communicate effectively with internal and external stakeholders.
- Excellent troubleshooting and incident management skills.