Cloud ML Engineer - Vertex AI & DevOps Automation

Posted 1 week ago

Rate: Negotiable
IR35: Outside
Working Arrangement: Hybrid
Location: USA

Summary: This MLOps Engineer role centers on leveraging Google Cloud Platform and Vertex AI to build and maintain scalable machine learning infrastructure. The engineer will automate workflows and enable robust AI/ML deployments in production environments, working closely with ML engineers and data scientists to productionize models and manage their lifecycle. The position requires extensive experience in DevOps/MLOps and cloud ML engineering.

Salary (Rate): Negotiable

City: Atlanta

Country: USA

Working Arrangements: hybrid

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

We are hiring an experienced MLOps Engineer with hands-on expertise in Google Cloud Platform (GCP) and Vertex AI. You'll be responsible for building and maintaining scalable machine learning infrastructure, automating workflows, and enabling robust AI/ML deployments in production environments.

Key Responsibilities:
  • Develop, automate, and manage ML pipelines using Vertex AI Pipelines, Kubeflow, and Cloud Composer (a minimal sketch follows this list)

  • Deploy and monitor models in production using Vertex AI and CI/CD workflows (Cloud Build, GitHub Actions, etc.)

  • Work closely with ML engineers and data scientists to productionize models and manage model versioning, retraining, and rollback strategies

  • Manage infrastructure-as-code using Terraform, Deployment Manager, or similar tools

  • Implement observability and monitoring (logging, metrics, alerts) using Cloud Monitoring, Prometheus, or Grafana

  • Ensure security, governance, and compliance of ML workflows within the Google Cloud Platform ecosystem

  • Optimize cost, performance, and scalability of ML systems in production
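
As a rough illustration of the pipeline work in the first bullet, the sketch below uses the KFP SDK to define a toy two-step pipeline, compile it, and submit it to Vertex AI Pipelines via the google-cloud-aiplatform client. The project ID, region, and GCS bucket path are placeholder assumptions, not details from this posting.

from kfp import dsl, compiler
from google.cloud import aiplatform

PROJECT_ID = "my-project"                       # placeholder
REGION = "us-central1"                          # placeholder
PIPELINE_ROOT = "gs://my-bucket/pipeline-root"  # placeholder GCS path

@dsl.component(base_image="python:3.10")
def validate_data(rows: int) -> int:
    """Toy validation step: fail fast if the dataset looks empty."""
    if rows <= 0:
        raise ValueError("no rows to train on")
    return rows

@dsl.component(base_image="python:3.10")
def train_model(rows: int) -> str:
    """Toy training step: a real component would write artifacts to GCS
    and register the model in the Vertex AI Model Registry."""
    return f"trained-on-{rows}-rows"

@dsl.pipeline(name="demo-training-pipeline", pipeline_root=PIPELINE_ROOT)
def training_pipeline(rows: int = 1000):
    validated = validate_data(rows=rows)
    train_model(rows=validated.output)

if __name__ == "__main__":
    # Compile the pipeline definition to a JSON spec ...
    compiler.Compiler().compile(
        pipeline_func=training_pipeline,
        package_path="training_pipeline.json",
    )
    # ... then submit it as a Vertex AI PipelineJob.
    aiplatform.init(project=PROJECT_ID, location=REGION)
    aiplatform.PipelineJob(
        display_name="demo-training-pipeline",
        template_path="training_pipeline.json",
        pipeline_root=PIPELINE_ROOT,
        parameter_values={"rows": 1000},
    ).submit()

In a CI/CD setup of the kind described above, the compile step would typically run in Cloud Build or GitHub Actions, with the submit step gated behind review or triggered on merge to the main branch.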

Required Skills:
  • 5+ years in DevOps/MLOps or Cloud ML Engineering, with recent Google Cloud Platform production experience

  • Strong hands-on experience with Vertex AI, Cloud Functions, BigQuery, and GCS

  • Proficiency with tools like TFX, Kubeflow, Docker, and Kubernetes (GKE preferred)

  • Expertise in CI/CD, GitOps, and workflow orchestration

  • Programming skills in Python (ML workflows) and Bash/Terraform (infra scripting)

  • Solid understanding of model lifecycle, pipeline automation, and ML monitoring

  • Bachelor's or Master's in Computer Science, Data Engineering, or related field

Nice to Have:
  • Google Cloud Professional Machine Learning Engineer or Professional Cloud DevOps Engineer certification

  • Familiarity with LLMs, RAG, or Vertex AI Search & Conversation

  • Experience with multi-region deployments or hybrid cloud setups

  • Exposure to Data Governance and Responsible AI practices