Cloud Site Reliability Engineer - Azure AWS (34084)

Posted 2 days ago by 1754720189

Negotiable

Outside IR35

Remote

USA

Summary: The Cloud Site Reliability Engineer role focuses on enhancing infrastructure through SRE best practices in AWS and Azure environments. The position involves managing critical services, improving observability, and fostering automation to elevate the developer experience. The engineer will also take ownership of IAM governance and promote operational excellence while collaborating with developers and researchers. This role is remote and classified as outside IR35.

Key Responsibilities:

Oversee the design and improvement of infrastructure using SRE best practices, including IaC, recovery automation, and systems that detect and resolve issues independently.
Manage and fine-tune critical services across both cloud and on-prem environments: Kubernetes clusters, CI/CD pipelines, artifact registries, and custom workloads.
Enhance observability through intelligent logging, metrics, tracing, and alerting, ensuring systems are transparent and actionable in real time.
Champion automation by eliminating repetitive tasks, from deployment workflows to security audits, through scripting and tooling.
Elevate the developer experience for 80+ engineers and researchers by streamlining secure, reliable workflows across hybrid and cloud-native platforms.
Take ownership of IAM governance across platforms like Azure AD and AWS IAM. Implement lifecycle automation, auditing, and access controls.
Foster a culture of operational excellence with strong practices around security, incident management, and resilience engineering.
Act as a trusted partner to developers and researchers, enabling their speed and innovation without compromising stability.

Key Skills:

Experience in Site Reliability Engineering, DevOps, or Systems Engineering within fast-paced, technically demanding environments.
Strong background in Linux systems and cloud infrastructure, with hands-on experience in AWS (primary) and Azure environments.
Solid command of Kubernetes and container orchestration in production environments.
Expertise in Infrastructure as Code tools such as Ansible; building reproducible, scalable infrastructure is second nature to you.
Deep experience in observability and incident response: you know how to set up effective monitoring, handle incidents, and lead blameless post-mortems.
A security-first mindset, especially when it comes to protecting distributed systems and developer workflows.
Proven ability to support and optimize CI/CD pipelines, container image builds, and artifact lifecycle management.
Strong communication and collaboration skills. You build trust across teams and advocate for thoughtful, scalable solutions.
Bonus if you've worked with event-driven architectures using technologies like Kafka.

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

