Llama Talent with AI Engineering - Remote

Posted 4 days ago by 1762932480

Negotiable
Outside
Remote
USA

Summary: The role involves designing, developing, and optimizing LLaMA-based and other open-source LLM solutions, with a focus on fine-tuning models for specific applications. Responsibilities include building scalable LLM pipelines and ensuring compliance with responsible AI principles. Collaboration with data scientists and ML engineers is essential for preparing large datasets and evaluating model performance. The position requires strong programming skills and experience with various AI and machine learning frameworks.

Key Responsibilities:

  • Design, develop, and optimize LLaMA-based and other open-source LLM solutions.
  • Fine-tune models for domain-specific applications using PyTorch, Transformers, or Hugging Face ecosystems.
  • Build and deploy scalable LLM pipelines integrating retrieval-augmented generation (RAG), vector databases, and prompt orchestration tools.
  • Evaluate and benchmark model performance, accuracy, latency, and cost efficiency.
  • Collaborate with data scientists and ML engineers to prepare and clean large text datasets.
  • Implement responsible AI principles ensuring fairness, transparency, and compliance.

Key Skills:

  • Proven experience working with Meta's LLaMA models (LLaMA 2 / 3 / Llama Talent platform).
  • Strong programming in Python, with frameworks like PyTorch, LangChain, and Hugging Face Transformers.
  • Deep understanding of LLM architecture, tokenization, embeddings, and fine-tuning techniques.
  • Experience deploying models via REST APIs, Docker, or Kubernetes in production environments.
  • Familiarity with vector databases (e.g., FAISS, Pinecone, Milvus) and prompt pipelines.
  • Solid knowledge of MLOps tools (Weights & Biases, MLflow, or Vertex AI).

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Preferred Qualifications:

  • Experience with RAG-based or agentic AI systems (LangGraph, CrewAI, etc.).
  • Background in NLP research or AI model evaluation.
  • Understanding of Meta AI ecosystem tools and deployment practices.