Sr AI/ML Cloud Engineer - 100% Remote

Posted 1 week ago by 1762323701

Negotiable
Outside
Remote
USA

Summary: The role of Sr AI/ML Cloud Engineer involves building and managing cloud infrastructure for AI/ML workloads, automating model deployment and monitoring, and optimizing performance and costs. The position requires extensive experience in cloud platforms and data engineering, with a focus on security and compliance. Candidates should have a strong background in AI/ML technologies and tools. This is a remote position based in the USA.

Key Responsibilities:

  • Build and manage compute, storage, and networking for AI/ML workloads using platforms like AWS, Azure, or Google Cloud.
  • Automate model deployment, monitoring, and retraining using tools like Kubeflow, MLflow, or SageMaker Pipelines.
  • Set up data lakes, ETL pipelines, and real-time data processing systems for AI models.
  • Optimize GPU/TPU usage, autoscaling, and cloud resource management for high-efficiency AI systems.
  • Implement IAM, data encryption, and compliance policies for AI data and models.

Key Skills:

  • 10-15 years of experience in cloud infrastructure and AI/ML technologies.
  • Proficiency in AWS, Azure, or Google Cloud platforms.
  • Experience with MLOps tools like Kubeflow, MLflow, or SageMaker Pipelines.
  • Strong programming skills in Python and familiarity with frameworks like TensorFlow, PyTorch, and scikit-learn.
  • Knowledge of data engineering practices, including ETL and real-time data processing.
  • Experience in security and compliance for AI systems.
  • Familiarity with Google Cloud-specific skill sets, including Enterprise Gemini AI experience, is a plus.

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Role 1: AI/ML Cloud Engineer (3 Positions)

Experience range: 10-15 years

Location: US (Remote)

Job Description:

Cloud Infrastructure: Build and manage compute, storage, and networking for AI/ML workloads using platforms like AWS, Azure, or Google Cloud.

MLOps/AIOps: Automate model deployment, monitoring, and retraining using tools like Kubeflow, MLflow, or SageMaker Pipelines. Relevant technologies: AWS (SageMaker, ECS, S3), Google Cloud (Vertex AI, BigQuery), and Azure (AI Studio, Synapse); Python, TensorFlow, PyTorch, and scikit-learn; Cursor coding.

Data Engineering: Set up data lakes, ETL pipelines, and real-time data processing systems for AI models.

Cost & Performance Optimization: Optimize GPU/TPU usage, autoscaling, and cloud resource management for high-efficiency AI systems.

Security & Compliance: Implement IAM, data encryption, and compliance policies for AI data and models.

Good to have: primary Google Cloud-specific skill sets (including Enterprise Gemini AI experience).

Taras Technology, LLC is an EEO/AA Employer: women, minorities, the disabled, and veterans are encouraged to apply.