Summary: The Senior Applied ML Engineer will design, build, and deploy advanced machine learning solutions, focusing on deep learning models for Natural Language Processing and Computer Vision. This role emphasizes developing scalable ML systems that yield measurable business outcomes and enhance organizational value. The engineer will also innovate in ML operations and evaluation, keeping models performant in production environments. A strong background in machine learning engineering and LLM systems is essential for success in this position.
Key Responsibilities:
- Design, build, fine-tune, and deploy state-of-the-art machine learning and large language models at scale.
- Develop end-to-end ML and LLM pipelines, covering data ingestion, scripting, automated workflows, model training, evaluation, and post-processing.
- Build and operationalize LLM fine-tuning pipelines using various model adaptation techniques.
- Design and experiment with novel LLM architectures, balancing model size and computational efficiency.
- Optimize LLMs for production deployment through model quantization, compression, and teacher-student architectures.
- Architect and deploy Retrieval-Augmented Generation (RAG) systems using vector databases and semantic search.
- Innovate in ML operations and evaluation, including automated ground-truth generation and continuous post-evaluation pipelines.
- Design and implement CI/CD pipelines for machine learning systems to ensure high availability and rapid iteration.
Key Skills:
- 5+ years of experience in machine learning engineering with a track record of deploying ML and NLP/LLM systems.
- Strong hands-on experience building full-stack ML systems from data ingestion to monitoring.
- Deep expertise in LLM fine-tuning and adaptation techniques.
- Practical experience designing and optimizing LLM architectures.
- Proficiency in model inference optimization techniques.
- Solid understanding of RAG architectures and retrieval workflows.
- Experience with modern LLM orchestration and RAG frameworks.
- Strong background in ML evaluation and MLOps.
- Proficiency in Python and ML/AI development frameworks.
Salary (Rate): £100/hr
City: undetermined
Country: undetermined
Working Arrangements: remote
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
What We Do
You'll be responsible for designing, building, and deploying applied machine learning solutions including deep learning transformer-based models for Natural Language Processing and Computer Vision, as well as traditional shallow learning models. The role focuses on developing scalable ML systems that deliver measurable business outcomes and drive value across the organization.
WHAT YOU'LL DO:
- Design, build, fine-tune, and deploy state-of-the-art machine learning and large language models at scale, supporting millions of daily predictions with a strong focus on accuracy, latency, compute efficiency, and cost optimization.
- Develop end-to-end ML and LLM pipelines, covering data ingestion, scripting, automated workflows for OCR, model training, evaluation, and post-processing in production environments.
- Build and operationalize LLM fine-tuning pipelines, applying a range of model adaptation techniques including full fine-tuning, LoRA (Low-Rank Adaptation), prompt-based methods, and Direct Preference Optimization (DPO).
- Design and experiment with novel LLM architectures, balancing model size, computational efficiency, memory constraints, and deployment requirements.
- Optimize LLMs for production deployment through model quantization, compression, and teacher-student architectures, enabling efficient inference in resource-constrained environments.
- Architect and deploy Retrieval-Augmented Generation (RAG) systems, leveraging vector databases, embedding services, semantic search, document chunking, indexing, and retrieval mechanisms using frameworks such as LangChain, LlamaIndex, and commercial RAG platforms within Google Cloud Platform and Databricks.
- Innovate in ML operations and evaluation, including automated ground-truth generation, continuous post-evaluation pipelines, and iterative feedback loops to systematically improve model performance over time.
- Design and implement CI/CD pipelines for machine learning systems, ensuring high availability, reliability, low latency, and rapid iteration from experimentation to production.
WHAT YOU'LL BRING:
- 5+ years of experience in machine learning engineering, with a proven track record of deploying and operating ML and NLP/LLM systems in production at scale.
- Strong hands-on experience building full-stack ML systems, from data ingestion and automation to training, evaluation, deployment, and monitoring.
- Deep expertise in LLM fine-tuning and adaptation techniques, including full fine-tuning, LoRA, prompt-based optimization, and preference-based methods such as DPO.
- Practical experience designing and optimizing LLM architectures, with an emphasis on compute efficiency, memory usage, and real-world deployment constraints.
- Demonstrated proficiency in model inference optimization, including quantization, compression, and distillation techniques for high-throughput, cost-efficient production systems.
- Solid understanding and hands-on experience with RAG architectures, vector stores, embeddings, semantic search, chunking strategies, and retrieval workflows integrated with large language models.
- Experience using modern LLM orchestration and RAG frameworks such as LangChain, LlamaIndex, and managed AI platforms within cloud ecosystems like Google Cloud Platform and Databricks.
- Strong background in ML evaluation and MLOps, including automated evaluation pipelines, CI/CD for ML, and continuous improvement of deployed models.
- Proficiency in Python and ML/AI development frameworks, with the ability to work in fast-paced, experimental environments and production systems simultaneously.