SRE Engineer with ELK

Posted Today by Ampstek

Negotiable
Birmingham, England, United Kingdom

Summary: The Site Reliability Engineer (SRE) ensures the reliability, scalability, and performance of cloud-native applications, drawing on strong expertise in Docker, Kubernetes, and the ELK stack. The SRE collaborates with development and operations teams to design and maintain highly available systems, automate operational tasks, and improve CI/CD processes. Key responsibilities include monitoring system performance, troubleshooting production issues, and participating in incident response. The ideal candidate has a proven background in SRE, DevOps, or Platform Engineering, with hands-on experience in these technologies.

Key Responsibilities:

  • Design, implement, and maintain highly available and scalable systems.
  • Manage containerized applications using Docker and orchestrate them with Kubernetes.
  • Monitor system performance, availability, and reliability using logging and monitoring tools.
  • Implement and maintain centralized logging solutions using the ELK stack (Elasticsearch, Logstash, Kibana).
  • Troubleshoot production issues, perform root cause analysis, and drive permanent fixes.
  • Automate operational tasks and improve system reliability through scripting and tooling.
  • Collaborate with development teams to improve CI/CD pipelines and deployment processes.
  • Participate in on-call rotations and incident response activities.
  • Define and track SLIs, SLOs, and SLAs to improve service reliability.

Key Skills:

  • Proven experience as an SRE / DevOps / Platform Engineer.
  • Strong hands-on experience with Docker and Kubernetes (deployment, scaling, troubleshooting).
  • Working knowledge of the ELK stack for log aggregation, search, and visualization.
  • Experience with Linux/Unix systems and networking fundamentals.
  • Familiarity with cloud platforms such as AWS, Azure, or GCP.
  • Experience with CI/CD tools (e.g., Jenkins, GitHub Actions, GitLab CI).
  • Scripting experience in Python, Bash, or similar languages.
  • Understanding of monitoring and alerting tools (Prometheus, Grafana, etc.).

Salary (Rate): undetermined

City: Birmingham

Country: United Kingdom

Working Arrangements: undetermined

IR35 Status: undetermined

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong experience in Docker and Kubernetes, and working knowledge of the ELK stack. The ideal candidate will be responsible for ensuring the reliability, scalability, performance, and monitoring of cloud-native applications while collaborating closely with development and operations teams.
