ML Engineer with LLM + Agentic AI

Posted 2 days ago

Rate: Negotiable
IR35 Status: Outside
Working Arrangement: Remote
Location: USA

Summary: The ML Engineer with LLM + Agentic AI role involves designing, training, fine-tuning, and deploying machine learning models, with a particular focus on large language models (LLMs) and agentic AI systems. The position requires collaborating with data engineering teams to build and maintain data pipelines and to implement advanced AI techniques. Candidates should have strong programming skills and experience with a range of ML frameworks and LLM ecosystems. This is a remote position based in the USA, with a contract duration of 6-12+ months.

Key Responsibilities:

  • Design, train, fine-tune, and deploy ML/LLM models for production.
  • Implement Retrieval-Augmented Generation (RAG) pipelines using vector databases.
  • Prototype and optimize multi-agent workflows using LangChain, LangGraph, and MCP.
  • Develop prompt engineering, optimization, and safety techniques for agentic LLM interactions.
  • Integrate memory, evidence packs, and explainability modules into agentic pipelines.
  • Collaborate with Data Engineering to build and maintain real-time and batch data pipelines supporting ML/LLM workloads.
  • Conduct feature engineering, preprocessing, and embedding generation for structured and unstructured data.
  • Implement model monitoring, drift detection, and retraining pipelines.
  • Utilize cloud ML platforms such as AWS SageMaker and Databricks ML for experimentation and scaling.

Key Skills:

  • Experience designing, training, fine-tuning, and deploying LLM/ML models for production.
  • Hands-on experience with RAG (Retrieval-Augmented Generation) pipelines using vector databases.
  • Proficiency in Python and modern ML frameworks such as PyTorch, TensorFlow, Scikit-learn, and Hugging Face Transformers.
  • Experience with multi-agent frameworks such as LangChain, LangGraph, or MCP.
  • Experience working with LLM ecosystems (OpenAI GPT, Anthropic Claude, Google Gemini, Meta LLaMA).
  • Strong understanding of the ML lifecycle including data preparation, training, evaluation, deployment, and monitoring.
  • Experience building and maintaining real-time/batch data pipelines and ML infrastructure (AWS SageMaker, Databricks ML).
  • Experience implementing AI safety, explainability, and guardrail mechanisms for responsible AI development.
  • Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, or a related field.
  • 3+ years of experience building and deploying ML systems.
  • Hands-on experience with LLMs/SLMs (fine-tuning, prompt design, inference optimization).
  • Familiarity with vector databases, embeddings, and RAG pipelines.
  • Proficiency in handling structured and unstructured data at scale.
  • Working knowledge of SQL and distributed frameworks such as Spark or Ray.

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Role: ML Engineer with LLM + Agentic AI

Duration: 6-12+ Months Contract

Location: CA - Remote

Must Have Skills

  • Experience designing, training, fine-tuning, and deploying LLM/ML models for production.
  • Hands-on experience with RAG (Retrieval-Augmented Generation) pipelines using vector databases.
  • Proficiency in Python and modern ML frameworks such as PyTorch, TensorFlow, Scikit-learn, and Hugging Face Transformers.
  • Experience with multi-agent frameworks such as LangChain, LangGraph, or MCP.
  • Experience working with LLM ecosystems (OpenAI GPT, Anthropic Claude, Google Gemini, Meta LLaMA).
  • Strong understanding of the ML lifecycle including data preparation, training, evaluation, deployment, and monitoring.
  • Experience building and maintaining real-time/batch data pipelines and ML infrastructure (AWS SageMaker, Databricks ML).
  • Experience implementing AI safety, explainability, and guardrail mechanisms for responsible AI development.

Domain Experience

Experience in enterprise spend management, risk controls, or financial data analytics preferred.

Key Responsibilities

Core ML/LLM Engineering

  • Design, train, fine-tune, and deploy ML/LLM models for production.
  • Implement Retrieval-Augmented Generation (RAG) pipelines using vector databases.
  • Prototype and optimize multi-agent workflows using LangChain, LangGraph, and MCP.
  • Develop prompt engineering, optimization, and safety techniques for agentic LLM interactions.
  • Integrate memory, evidence packs, and explainability modules into agentic pipelines.
  • Work with multiple LLM ecosystems, including:
      ◦ OpenAI GPT (GPT-4, GPT-4o, fine-tuned GPTs)
      ◦ Anthropic Claude (Claude 2/3 for reasoning and safety-aligned workflows)
      ◦ Google Gemini (multimodal reasoning, advanced RAG integration)
      ◦ Meta LLaMA (fine-tuned/custom models for domain-specific tasks)

Data & Infrastructure

  • Collaborate with Data Engineering to build and maintain real-time and batch data pipelines supporting ML/LLM workloads.
  • Conduct feature engineering, preprocessing, and embedding generation for structured and unstructured data.
  • Implement model monitoring, drift detection, and retraining pipelines.
  • Utilize cloud ML platforms such as AWS SageMaker and Databricks ML for experimentation and scaling.

Education, Experience, and Skills Required

  • Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, or a related field.
  • 3+ years of experience building and deploying ML systems.
  • Strong programming skills in Python, with experience in PyTorch, TensorFlow, Scikit-learn, and Hugging Face Transformers.
  • Hands-on experience with LLMs/SLMs (fine-tuning, prompt design, inference optimization).
  • Demonstrated expertise in at least two of the following:
      ◦ OpenAI GPT (chat, assistants, fine-tuning)
      ◦ Anthropic Claude (safety-first reasoning, summarization)
      ◦ Google Gemini (multimodal reasoning, enterprise APIs)
      ◦ Meta LLaMA (open-source fine-tuned models)

  • Familiarity with vector databases, embeddings, and RAG pipelines.
  • Proficiency in handling structured and unstructured data at scale.
  • Working knowledge of SQL and distributed frameworks such as Spark or Ray.
  • Strong understanding of the ML lifecycle from data prep and training to deployment and monitoring.