Negotiable
Outside
Remote
USA
Summary: The Site Reliability Engineer role focuses on enhancing system reliability and scalability through effective troubleshooting and automation within cloud environments, particularly Azure. The position requires collaboration with cross-functional teams to improve architecture and service quality while adapting to business needs. A strong emphasis is placed on continuous learning and innovation in technology to drive team efficiency and effectiveness. Financial acumen is also valued for identifying cost-effective solutions.
Key Responsibilities:
- Strong troubleshooting skills coupled with making data-driven decisions during incidents to improve time to detect and resolve issues.
- Strong understanding of cloud computing platforms (e.g. Azure) and cloud-native setups (e.g. AKS, serverless).
- Ability to work with cross-functional Development, QA, and Operations teams to understand the underlying architecture and help improve its reliability and scalability.
- Networking knowledge and in-depth understanding of network concepts.
- Lead by example to drive the team with the latest innovation and tools to ensure the team stays current with the latest technologies and implementation patterns.
- Able to identify areas for efficiencies within the team and upskill self and staff when necessary.
- Conduct disaster recovery and controlled failure testing to improve resiliency.
- Develop solutions to enhance TechOps and product teams' capabilities and visibility for the respective applications.
Key Skills:
- Strong troubleshooting skills and data-driven decision-making.
- Understanding of cloud computing platforms, particularly Azure.
- Ability to communicate effectively with stakeholders, product teams, and internal teams.
- Financial acumen for identifying cost-effective service solutions.
- Proficiency in problem-solving and critical thinking.
- Knowledge of Firmwide Technology Governance, processes, and policies.
- Ability to adjust quickly to business needs while ensuring quality.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Detailed Description From Employer:
Position: Site Reliability Engineer
Location: US and Canada - Remote
Duration: Long Term Contract
Please share the resume at
Qualifications
Strong troubleshooting skills coupled with making data-driven decisions during incidents, to improve time to detect and resolve issues
Strong understanding of cloud computing platforms (e.g. Azure) and cloud-native setups (AKS, serverless, etc.)
A "can do" attitude is necessary, combined with a deep belief that everything can be automated, and systems must always be functional
Ability to work with cross-functional Development, QA and Operations teams to understand the underlying architecture, and help improve its reliability and scalability
Networking knowledge and in-depth understanding of network concepts
Deep understanding of the platform's use case and its application
Able to communicate effectively with stakeholders, product teams and internal teams
Lead by example to drive the team with the latest innovation and tools to ensure the team stays current with the latest technologies and implementation patterns
Able to adjust quickly to business needs while ensuring quality
Able to identify areas for efficiencies within the team and upskill self and staff when necessary
Financial acumen is valuable for identifying cost-effective service solutions
Proficiency in problem-solving and critical thinking to address complex service issues and ensure service quality and efficiency
Apply knowledge to comply with the Firmwide Technology Governance, processes and policies
Conduct disaster recovery and controlled failure testing to improve resiliency
Champion and adhere to the Evolved PwC Professional
Responsible for learning the latest technologies as they relate to the applications being supported
Develop solutions to enhance TechOps and product teams' capabilities and visibility for the respective applications