Site Reliability Engineering Manager

Posted 1 week ago by TECEZE

£330 Per day

Outside

Hybrid

London, England, United Kingdom

Summary: The Site Reliability Engineer - Manager role involves leading the design, development, and delivery of scalable and reliable infrastructure and services for a top mobile industry company. The position requires collaboration with cross-functional teams to enhance observability, automate operations, and build robust systems for cloud-native applications. The role is hybrid remote, based in London, and focuses on maintaining critical services while optimizing performance and deployment pipelines. Candidates should have extensive experience in site reliability engineering and strong technical skills in relevant technologies.

Key Responsibilities:

Maintain and scale critical services and infrastructure.
Identify performance bottlenecks and work closely with product engineers to optimise applications.
Administer, scale, and troubleshoot clusters in GKE, EKS, or other Kubernetes environments.
Design and maintain scalable infrastructure using Terraform and automate deployments across public, private, or hybrid clouds (mainly AWS).
Build and improve robust CI/CD pipelines to support fast and safe deployment cycles.
Implement code-based instrumentation and telemetry.
Ensure systems are observable with tools for logging, metrics, and alerting.
Write tooling and automation scripts in Python, Go, or Rust to reduce toil and manual intervention.
Manage and optimise storage services like Amazon S3 or Google Cloud Storage (GCS).
Resolve complex networking issues in multi-cloud environments.

Key Skills:

5+ years of hands-on experience as a Site Reliability Engineer.
Proven expertise in Kubernetes (GKE/EKS).
Strong proficiency in Python, Go, or Rust.
Solid experience with AWS and Infrastructure as Code using Terraform.
Deep understanding of Linux internals, standard networking protocols, and distributed systems architecture.
Hands-on experience with automation and performance optimisation.
Strong knowledge of SRE principles and methodologies.
Experience with observability tools and telemetry systems.
Exposure to Google Cloud Platform (GCP).
Familiarity with hybrid or multi-cloud architecture.
Experience with service meshes or edge proxies (e.g., Envoy, Istio).
Working knowledge of container security best practices.

Salary (Rate): £330 daily

City: London

Country: United Kingdom

Working Arrangements: hybrid

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Job Title: Site Reliability Engineer - Manager

Location: Hybrid Remote – London EC2M

Contract (12 months)

Rate: Outside IR35 - £300 to £330 Per Day

About the Role: We are partnering with one of the top companies in the mobile industry to hire a Site Reliability Engineer (SRE) Manager. In this role, you will collaborate with cross-functional teams to drive the design, development, and delivery of high-performing, scalable, and reliable infrastructure and services. You’ll be responsible for building robust systems, automating operations, and enhancing observability and deployment pipelines for modern cloud-native applications.

Key Responsibilities:

System Reliability & Performance: Maintain and scale critical services and infrastructure. Identify performance bottlenecks and work closely with product engineers to optimise applications.
Kubernetes Operations: Administer, scale, and troubleshoot clusters in GKE, EKS, or other Kubernetes environments.
Infrastructure as Code (IaC): Design and maintain scalable infrastructure using Terraform and automate deployments across public, private, or hybrid clouds (mainly AWS).
CI/CD Pipeline Enhancement: Build and improve robust CI/CD pipelines to support fast and safe deployment cycles.
Observability & Monitoring: Implement code-based instrumentation and telemetry. Ensure systems are observable with tools for logging, metrics, and alerting.
Automation & Scripting: Write tooling and automation scripts in Python, Go, or Rust to reduce toil and manual intervention.
Storage & Networking: Manage and optimise storage services like Amazon S3 or Google Cloud Storage (GCS). Resolve complex networking issues in multi-cloud environments.

Essential Requirements:

5+ years of hands-on experience as a Site Reliability Engineer.
Proven expertise in Kubernetes (GKE/EKS).
Strong proficiency in Python, Go, or Rust.
Solid experience with AWS and Infrastructure as Code using Terraform.
Deep understanding of Linux internals, standard networking protocols, and distributed systems architecture.
Hands-on experience with automation and performance optimisation.
Strong knowledge of SRE principles and methodologies.
Experience with observability tools and telemetry systems.
Exposure to Google Cloud Platform (GCP).
Familiarity with hybrid or multi-cloud architecture.
Experience with service meshes or edge proxies (e.g., Envoy, Istio).
Working knowledge of container security best practices.
