Rate: Negotiable | IR35: Outside | Remote | USA
Summary: The role of Sr AI/ML Cloud Engineer involves building and managing cloud infrastructure for AI/ML workloads, automating model deployment and monitoring, and optimizing performance and costs. The position requires extensive experience in cloud platforms and data engineering, with a focus on security and compliance. Candidates should have a strong background in AI/ML technologies and tools. This is a remote position based in the USA.
Key Responsibilities:
- Build and manage compute, storage, and networking for AI/ML workloads using platforms like AWS, Azure, or Google Cloud.
- Automate model deployment, monitoring, and retraining using tools like Kubeflow, MLflow, or SageMaker Pipelines.
- Set up data lakes, ETL pipelines, and real-time data processing systems for AI models.
- Optimize GPU/TPU usage, autoscaling, and cloud resource management for high-efficiency AI systems.
- Implement IAM, data encryption, and compliance policies for AI data and models.
Key Skills:
- 10-15 years of experience in cloud infrastructure and AI/ML technologies.
- Proficiency in AWS, Azure, or Google Cloud platforms.
- Experience with MLOps tools like Kubeflow, MLflow, or SageMaker Pipelines.
- Strong programming skills in Python and familiarity with frameworks like TensorFlow, PyTorch, and scikit-learn.
- Knowledge of data engineering practices, including ETL and real-time data processing.
- Experience in security and compliance for AI systems.
- Familiarity with Google Cloud-specific skill sets, including Enterprise Gemini AI experience, is a plus.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Role 1: AI/ML Cloud Engineer (3 Positions)
Experience range: 10-15 years
Location: US (Remote)
Job Description:
Cloud Infrastructure: Build and manage compute, storage, and networking for AI/ML workloads using platforms like AWS, Azure, or Google Cloud.
MLOps/AIOps: Automate model deployment, monitoring, and retraining using tools like Kubeflow, MLflow, or SageMaker Pipelines.
Platforms and Tools: AWS (SageMaker, ECS, S3); Google Cloud (Vertex AI, BigQuery); Azure (AI Studio, Synapse); Python, TensorFlow, PyTorch, scikit-learn, Cursor Coding.
Data Engineering: Set up data lakes, ETL pipelines, and real-time data processing systems for AI models.
Cost & Performance Optimization: Optimize GPU/TPU usage, autoscaling, and cloud resource management for high-efficiency AI systems.
Security & Compliance: Implement IAM, data encryption, and compliance policies for AI data and models.
Good to Have: Primarily Google Cloud-specific skill sets (including Enterprise Gemini AI experience).
Taras Technology, LLC is an EEO/AA Employer: women, minorities, the disabled, and veterans are encouraged to apply.