Negotiable
Bristol, Leeds, or Halifax (Hybrid 3 days onsite per week)
Summary: This Google Cloud SRE role calls for hands-on experience with Google Cloud Platform and Site Reliability Engineering to improve the reliability of cloud infrastructure. It requires expertise in Dynatrace, Terraform, and CI/CD tooling. The candidate will be responsible for incident management, troubleshooting, and automation that reduces toil. The role is hybrid, with three days on-site each week in Bristol, Leeds, or Halifax.
Key Responsibilities:
- Utilize Google Cloud Platform (GCP) for hands-on engineering tasks.
- Implement Site Reliability Engineering (SRE) practices.
- Manage Dynatrace for instrumentation, dashboards, and SLO-based alerting.
- Develop Infrastructure as Code using Terraform.
- Oversee CI/CD processes with tools like Jenkins, Azure DevOps, and GitHub.
- Administer production Kubernetes clusters.
- Automate DevOps processes to reduce toil.
- Write scripts in Python, Groovy, Bash, and PowerShell.
- Ensure cloud security, networking, and API management.
- Handle incident management and troubleshooting effectively.
Key Skills:
- Hands-on experience with Google Cloud Platform (GCP).
- Certifications in relevant technologies preferred.
- Proficiency in Site Reliability Engineering (SRE).
- Experience with Dynatrace for monitoring and alerting.
- Knowledge of Terraform for Infrastructure as Code.
- Familiarity with CI/CD tools like Jenkins, Azure DevOps, and GitHub.
- Experience in managing Kubernetes production clusters.
- Strong scripting skills in Python, Groovy, Bash, and PowerShell.
- Understanding of cloud security, networking, and APIs.
- Ability to manage incidents and troubleshoot effectively.
Salary (Rate): undetermined
City: Bristol, Leeds, or Halifax
Country: undetermined
Working Arrangements: hybrid
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
Core Technical Skills
Google Cloud Platform (GCP) - hands-on experience; certifications preferred
Site Reliability Engineering (SRE)
Dynatrace - instrumentation, dashboards, SLO-based alerting
Terraform - Infrastructure as Code (modular, maintainable)
CI/CD - Jenkins, Azure DevOps, GitHub
Kubernetes - production cluster administration
DevOps - automation, toil reduction
Scripting - Python, Groovy, Bash, PowerShell
Cloud Security, Networking & APIs
Incident Management & Troubleshooting
Job Title Keywords
Site Reliability Engineer (SRE), Cloud SRE, Google Cloud SRE
DevOps Engineer with Observability
Cloud Platform Engineer, Observability Engineer