Summary: This DevOps/Site Reliability Engineer role involves managing cloud infrastructure with a focus on automation, observability, and system performance. The engineer will design scalable systems, implement monitoring tools, and foster a culture of reliability engineering. The position is remote, on a 12-month contract. The ideal candidate has extensive experience in DevOps or SRE roles, preferably at product-led or security-focused tech companies.
Key Responsibilities:
- Design, build and maintain scalable infrastructure using Infrastructure as Code (IaC)
- Implement observability and monitoring systems using tools like Grafana
- Support and improve incident response and escalation processes
- Drive automation across the development and deployment pipelines
- Work closely with AI and platform engineering teams to ensure system reliability and security
- Develop and scale systems to support both B2C and B2B product lines
- Champion DevOps best practices and SRE principles
Key Skills:
- Extensive experience working in a DevOps or SRE role, preferably within product-led or security-focused tech companies
- Expert-level knowledge of AWS and managing cloud services at scale
- Strong experience with IaC tools such as Terraform or CloudFormation
- Proven track record setting up observability, monitoring and alerting systems – ideally with Grafana or similar
- Incident response and systems reliability experience in high-scale environments
- Proficiency in building automation tools and optimising CI/CD pipelines
- Security-focused mindset with knowledge of platform hardening and secure provisioning
- Excellent communication and collaboration skills across engineering functions
Salary (Rate): undetermined
City: London Area
Country: United Kingdom
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Location: UK Remote
Rate: £500
Outside IR35
Contract Length: 12 Months
Start Date: Immediate
We are looking for a skilled DevOps/Site Reliability Engineer with a strong background in cloud infrastructure and a passion for automation, observability, and high-performance systems. You'll be instrumental in provisioning and securing scalable systems, ensuring uptime and performance, and driving a culture of reliability engineering across the business.
This is a unique opportunity to shape DevOps practices within a mission-driven, AI-led organisation impacting mental health at scale. Apply now to be considered.