Summary: The Google Cloud Platform Vertex AI Sr. Data Engineer leads the integration of machine learning models into business-critical applications using Google Cloud Platform's Vertex AI. The position requires close collaboration with data engineers, data scientists, software engineers, and product owners to ensure effective model deployment and performance monitoring. Key responsibilities include designing scalable model inference pipelines and automating model lifecycle management. The ideal candidate has strong experience with MLOps and model operationalization techniques.
Key Responsibilities:
- Lead the integration of machine learning models into business-critical applications using Google Cloud Platform Vertex AI.
- Collaborate with data engineers, data scientists, software engineers, and product owners to ensure seamless model deployment and performance in production environments.
- Design and implement scalable, resilient, and secure model inference pipelines using Vertex AI, Vertex Pipelines, and related services.
- Enable continuous delivery and monitoring of models via Vertex AI Model Registry, Prediction Endpoints, and Model Monitoring features.
- Optimize model serving performance, cost, and throughput under high-load, real-time, and batch scenarios.
- Automate model lifecycle management including CI/CD pipelines, retraining, versioning, rollback, and shadow testing.
- Participate in architecture reviews and advocate best practices in ML model orchestration, resource tuning, and observability.
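The lifecycle-automation bullet above (versioning, rollback, shadow testing) can be illustrated with a minimal sketch. The class and method names below (`Rollout`, `promote`, `route`) are purely illustrative and not part of the Vertex AI SDK; Vertex AI endpoints express the same idea through traffic splits across deployed model versions.

```python
import random
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Rollout:
    """Illustrative sketch: route traffic between a live model version
    and a canary version, with history kept for rollback."""
    live_version: str
    canary_version: Optional[str] = None
    canary_fraction: float = 0.0                  # share of traffic sent to the canary
    history: list = field(default_factory=list)   # previous live versions, for rollback

    def promote(self, version: str, canary_fraction: float = 0.1) -> None:
        """Introduce a new version as a canary receiving a small traffic share."""
        self.canary_version = version
        self.canary_fraction = canary_fraction

    def finalize(self) -> None:
        """Canary passed evaluation: make it live, remember the old version."""
        self.history.append(self.live_version)
        self.live_version = self.canary_version
        self.canary_version, self.canary_fraction = None, 0.0

    def rollback(self) -> None:
        """Revert to the previous live version, e.g. after a failed canary."""
        self.live_version = self.history.pop()
        self.canary_version, self.canary_fraction = None, 0.0

    def route(self, rng: random.Random) -> str:
        """Pick which version serves a given request."""
        if self.canary_version is not None and rng.random() < self.canary_fraction:
            return self.canary_version
        return self.live_version
```

A shadow test follows the same shape, except the canary's predictions are logged and compared rather than returned to callers.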
Key Skills:
- 2 years of strong experience in model integration and deployment using Google Cloud Platform Vertex AI, particularly Vertex Pipelines, Endpoints, Model Monitoring, and Feature Store.
- Expertise in scaling ML models in production, including load balancing, latency optimization, A/B testing, and automated retraining pipelines.
- Proficiency in MLOps and model operationalization techniques, with knowledge of infrastructure-as-code and containerized environments.
- Required: Google Cloud Platform Vertex AI Suite (including Pipelines, Feature Store, Model Monitoring); Python (with emphasis on integration frameworks and automation); Git, Docker, Poetry, and Terraform or Deployment Manager; BigQuery, Dataflow, and Cloud Functions; monitoring tools (Stackdriver, Prometheus, etc.).
- Preferred: Experience with MLOps tools such as Kubeflow, MLflow, or TFX; familiarity with enterprise monitoring tools like Prometheus, Grafana, or Stackdriver for ML observability; exposure to hybrid or federated model deployment architectures.
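The monitoring skills listed above (Model Monitoring, Prometheus/Stackdriver observability) ultimately reduce to comparing serving-time data against a training baseline. The sketch below shows one simple, generic drift signal, a mean shift measured in standard-error units; the function names and the threshold are illustrative assumptions, not tied to any specific monitoring product.

```python
import math
from statistics import mean, stdev

def mean_shift_score(baseline: list, live: list) -> float:
    """Illustrative drift signal: how many standard errors the live feature
    mean has moved away from the training baseline mean."""
    se = stdev(baseline) / math.sqrt(len(live))   # standard error of the live mean
    return abs(mean(live) - mean(baseline)) / se

def is_drifting(baseline: list, live: list, threshold: float = 3.0) -> bool:
    """Flag drift when the live mean deviates beyond the chosen threshold."""
    return mean_shift_score(baseline, live) > threshold
```

Production monitoring tools typically use richer statistics (e.g. distribution-distance measures per feature), but the alert-on-threshold pattern is the same.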
Salary (Rate): negotiable
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Disqualifiers:
- Resumes more than 3-4 pages in length.
- Generic resumes without clearly defined accomplishments or project impact.
- Missing valid LinkedIn profile.