Site Reliability Engineer - On Prem to Azure Cloud Platform

Posted Today by Square One Resources

£605 per day
Inside IR35
Remote
England, UK

Summary: The Site Reliability Engineer (SRE) role focuses on leading the migration of on-premises HPC solutions to the Azure cloud platform, enhancing infrastructure reliability and performance through automation and software engineering practices. The position requires a strong understanding of cloud-native architectures and collaboration with development and operations teams. The ideal candidate will possess extensive experience in site reliability engineering and a solid background in Linux environments. This is a fully remote position based in the UK for a 12-month contract starting in January 2026.

Key Responsibilities:

  • Work with existing solutions already in place in the US to redefine, implement, and maintain scalable, reliable cloud infrastructure across Azure and AWS for the UK business as a similar but separate entity.
  • Develop automation scripts and tools to streamline operational tasks such as log analysis, environment testing, and incident response.
  • Collaborate with development and operations teams to ensure seamless deployment and performance of applications and services.
  • Monitor system performance and availability, proactively identifying and resolving issues.
  • Apply software engineering principles to infrastructure management, improving efficiency and reducing manual effort.
  • Deliver value by monitoring spending, optimizing resource usage through right-sizing and automation, and implementing governance through tagging strategies and budget alerts.
  • Document the solution and deliver knowledge transfer and training to existing team members.

Key Skills:

  • Strong understanding of cloud-native architectures and services in Azure, including AKS/EKS and their automation.
  • Experience in a Linux environment.
  • Experience with infrastructure-as-code tools (e.g., Terraform).
  • Familiarity with CI/CD pipelines, containerization (Docker, Kubernetes), and monitoring tools.
  • Knowledge of data processing and configuration design.
  • Experience with IT infrastructure and monitoring systems.

Salary (Rate): £605 per day

City: undetermined

Country: UK

Working Arrangements: remote

IR35 Status: inside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Job Title: Site Reliability Engineer - On Prem to Azure Cloud Platform
Location: Fully remote from anywhere in the UK
Salary/Rate: Up to £605 per day (Inside IR35)
Start Date: Jan 2026
Job Type: 12-month contract

Role overview:
We are seeking a highly skilled Site Reliability Engineer (SRE) with expertise in performing on-prem migrations to the Azure cloud platform at enterprise scale.

This position takes a lead role in migrating an existing on-prem HPC solution to the Azure cloud and in enhancing the reliability, scalability, and performance of that cloud infrastructure through automation, software engineering practices, and proactive system management. The ideal candidate will bridge the gap between development and operations, applying a software engineering mindset to IT operations and infrastructure.

Required Skills/Experience
The ideal candidate will have the following:

  • Strong understanding of cloud-native architectures and services in Azure, including AKS/EKS and their automation.
  • Our client's application stack is Linux-based, so experience in a Linux environment is required.
  • Experience with infrastructure-as-code tools (e.g., Terraform).
  • Familiarity with CI/CD pipelines, containerization (Docker, Kubernetes), and monitoring tools.
  • Knowledge of data processing and configuration design.
  • Experience with IT infrastructure and monitoring systems.

Job Responsibilities/Objectives:

  • Work with existing solutions already in place in the US to redefine, implement, and maintain scalable, reliable cloud infrastructure across Azure and AWS for the UK business as a similar but separate entity.
  • Develop automation scripts and tools to streamline operational tasks such as log analysis, environment testing, and incident response.
  • Collaborate with development and operations teams to ensure seamless deployment and performance of applications and services.
  • Monitor system performance and availability, proactively identifying and resolving issues.
  • Apply software engineering principles to infrastructure management, improving efficiency and reducing manual effort.
  • Deliver value by monitoring spending, optimizing resource usage through right-sizing and automation, and implementing governance through tagging strategies and budget alerts.
  • Document the solution and deliver knowledge transfer and training to existing team members.

Education & Experience:

  • Bachelor's degree in Computer Science, Computer Engineering, Information Technology, or a related field.
  • Extensive experience in site reliability engineering, DevOps, or cloud infrastructure roles.

If you are interested in this opportunity, please apply now with your updated CV in Microsoft Word/PDF format.

Disclaimer
Notwithstanding any guidelines given on the level of experience sought, we will consider candidates from outside this range if they can demonstrate the necessary competencies.
Square One is acting as both an employment agency and an employment business, and is an equal opportunities recruitment business. Square One embraces diversity and will treat everyone equally. Please see our website for our full diversity statement.