Site Reliability Engineer/Cloud Engineer 100% Remote

Posted Today by 1765682897

Negotiable
Outside
Remote
USA

Summary: The Site Reliability Engineer/Cloud Engineer is responsible for managing support operations and site reliability engineering tasks to ensure optimal system performance, availability, and resiliency. This role involves leading a team of engineers, monitoring system metrics, and collaborating with cross-functional teams to implement improvements. The engineer will also develop incident response protocols and drive automation initiatives to enhance operational efficiency. Strong technical expertise and leadership skills are essential for success in this position.

Key Responsibilities:

  • Manage a team of support engineers and SREs to provide technical support and address system issues promptly.
  • Monitor system performance and reliability metrics, identifying areas for improvement and implementing solutions.
  • Collaborate with cross-functional teams to optimize application performance and enhance system reliability.
  • Develop and maintain incident response procedures and protocols to minimize system downtime.
  • Conduct regular audits and assessments to ensure compliance with industry standards and best practices.
  • Lead the implementation of automation tools and processes to streamline support operations and enhance efficiency.
  • Provide technical expertise and guidance to team members, promoting a culture of continuous learning and development.

Key Skills:

  • Proficiency in site reliability engineering (SRE) principles and practices.
  • Strong background in system administration, networking, and cloud computing.
  • Experience with monitoring tools such as Prometheus, Grafana, and the ELK stack.
  • Knowledge of containerization technologies like Docker and Kubernetes.
  • Ability to troubleshoot complex technical issues and perform root cause analysis.
  • Excellent communication skills and ability to work collaboratively in a team environment.
  • Strong project management and leadership skills to drive initiatives and deliver results efficiently.
  • Certifications in relevant areas such as AWS Certified DevOps Engineer or Google Professional Cloud DevOps Engineer are a plus.

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Site Reliability Engineer/Cloud Engineer

Location: US Remote


Job Summary
The Support Lead (SRE) is responsible for overseeing the support operations and site reliability engineering tasks, ensuring the effective functioning of systems and applications. The primary goal is to enhance system performance, availability, and resiliency.

Key Responsibilities
1. Manage a team of support engineers and SREs to provide technical support and address system issues promptly.
2. Monitor system performance and reliability metrics, identifying areas for improvement and implementing solutions.
3. Collaborate with cross-functional teams to optimize application performance and enhance system reliability.
4. Develop and maintain incident response procedures and protocols to minimize system downtime.
5. Conduct regular audits and assessments to ensure compliance with industry standards and best practices.
6. Lead the implementation of automation tools and processes to streamline support operations and enhance efficiency.
7. Provide technical expertise and guidance to team members, promoting a culture of continuous learning and development.

Skill Requirements
1. Proficiency in site reliability engineering (SRE) principles and practices.
2. Strong background in system administration, networking, and cloud computing.
3. Experience with monitoring tools such as Prometheus, Grafana, and the ELK stack.
4. Knowledge of containerization technologies like Docker and Kubernetes.
5. Ability to troubleshoot complex technical issues and perform root cause analysis.
6. Excellent communication skills and ability to work collaboratively in a team environment.
7. Strong project management and leadership skills to drive initiatives and deliver results efficiently.
8. Certifications in relevant areas such as AWS Certified DevOps Engineer or Google Professional Cloud DevOps Engineer are a plus.