Summary: The Lead TechOps & SRE Engineer (AWS Cloud) is a strategic and hands-on role focused on leading cloud infrastructure, DevOps practices, and operational excellence initiatives. This position requires deep expertise in AWS and involves designing scalable architectures, improving automation, and ensuring system reliability. The engineer will also mentor a team and collaborate across various departments to contribute to the long-term cloud strategy. The role is fully remote and contract to hire.
Salary (Rate): £66,000 yearly
City: undetermined
Country: undetermined
Working Arrangements: remote
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
Location: 100% REMOTE
Employment mode: Contract to hire
Department: Technology / Engineering
We are seeking a highly experienced Lead TechOps & SRE Engineer with deep expertise in AWS Cloud to lead our cloud infrastructure, DevOps practices, reliability engineering, and operational excellence initiatives. This role is both strategic and hands-on, with responsibility for designing scalable architectures, improving automation, ensuring system reliability, and leading the TechOps team.
Key Responsibilities:
- Architect and manage secure, scalable, and highly available infrastructure on AWS.
- Design multi-account AWS environments using AWS Organizations.
- Implement VPC architecture, IAM policies, networking, and security best practices.
- Oversee EC2, ECS/EKS, Lambda, RDS, S3, CloudFront, and related AWS services.
- Optimize AWS cost management and resource utilization.
Site Reliability & Production Operations:
- Implement Site Reliability Engineering (SRE) best practices.
- Define SLIs, SLOs, and error budgets.
- Manage monitoring and alerting (CloudWatch, Datadog, Prometheus, Grafana).
- Lead incident response, root cause analysis (RCA), and postmortems.
- Ensure 24/7 uptime and operational resilience.
Security & Compliance:
- Implement IAM best practices and least-privilege access controls.
- Manage secrets and key management (AWS KMS, Secrets Manager).
- Conduct vulnerability management and patching.
- Support compliance initiatives (SOC 2, ISO 27001, GDPR as applicable).
- Lead disaster recovery planning and backup strategies.
Leadership & Strategy:
- Lead and mentor a team of DevOps/TechOps engineers.
- Establish operational KPIs and performance benchmarks.
- Manage on-call rotations and escalation processes.
- Collaborate with Engineering, Product, Security, and Data teams.
- Contribute to long-term infrastructure strategy and cloud roadmap.
Key Skills:
- Bachelor's degree in Computer Science, Engineering, or equivalent experience.
- 10+ years in DevOps, Cloud Engineering, or Infrastructure roles.
- 5+ years leading SRE technical teams.
- Strong hands-on experience with AWS services (EC2, EKS, RDS, S3, IAM, VPC, Lambda).
- Deep knowledge of networking, Linux systems, and distributed systems.
- Experience with Infrastructure-as-Code (Terraform or CloudFormation).
- Strong scripting skills (Python, Bash, or similar).
- Experience with containerization (Docker) and Kubernetes (EKS preferred).
- Strong architectural thinking
- Hands-on technical leadership
- Crisis and incident management
- Strategic planning and execution
- Excellent cross-functional communication