London Area, United Kingdom
Summary: The Data & Analytics Lead Site Reliability Engineer (SRE) is a senior leadership role focused on enhancing reliability and application stability within Data & Analytics platforms in a CDAO environment. The position requires a seasoned engineer to drive reliability engineering strategies and collaborate with various teams to ensure scalable and observable systems. The role emphasizes the importance of automation and operational excellence in large-scale enterprise ecosystems. This position is hybrid, based in London or Glasgow, for an initial duration of 6-9 months.
Salary (Rate): undetermined
City: London
Country: United Kingdom
Working Arrangements: hybrid
IR35 Status: inside IR35
Seniority Level: Senior
Industry: IT
Data & Analytics Lead Site Reliability Engineer (SRE)
Location: London or Glasgow, Hybrid
Duration: 6-9 months initially
Competitive Day Rate (Inside IR35)
We are seeking a seasoned Lead Site Reliability Engineer (SRE) to champion reliability, improve application stability, and eliminate technical bottlenecks across our Data & Analytics platforms within a CDAO environment. This is a senior leadership role for an engineer who thrives in complex, large-scale enterprise ecosystems and is passionate about building resilient, highly available systems.
Role Overview
As a Lead SRE, you will drive reliability engineering strategy and execution across mission-critical data platforms. You will partner with engineering, architecture, and operations teams to ensure systems are scalable, observable, and continuously improving.
Key Responsibilities
- Champion site reliability best practices across Data & Analytics platforms
- Improve application stability, availability, and performance
- Lead initiatives in observability, monitoring, and incident management
- Identify and resolve complex technical bottlenecks
- Drive automation through CI/CD pipelines and containerized environments
- Influence engineering standards and operational excellence across teams
Required Experience & Skills
- 10+ years of Site Reliability Engineering experience
- Deep expertise in reliability engineering principles
- Strong experience in observability and monitoring frameworks
- Hands-on experience with CI/CD pipelines and container technologies
- Proficiency in one or more programming languages
- Proven track record operating within large-scale enterprise environments