Negotiable
Outside
Remote
USA
Summary: The ITSM Service Delivery / SRE Consultant is responsible for ensuring end-to-end service availability and service level management across various environments, including on-premises and cloud. This role involves monitoring infrastructure health, collaborating with multiple teams to align on service levels, and developing service availability plans. The consultant will also present metrics and trends to both technical and executive audiences while embedding availability practices into management workflows.
Key Responsibilities:
- Serve as the primary point of accountability for end-to-end service availability and service level management, spanning on-premises, cloud, and third-party integrations.
- Monitor critical infrastructure and application health, leveraging (and in some cases, creating) advanced analytics and real-time dashboards to detect early warning signs and eliminate single points of failure.
- Partner with Architects, DevOps, SRE, and Application teams to drive awareness of, and alignment with, Service Level and Availability Management, and to move toward a unified IT Operations model across the enterprise.
- Develop and maintain Service Availability Plans, incorporating business priorities, technical dependencies, and risk mitigation strategies.
- Own and evolve metrics for service uptime, reliability, MTTR/MTTI, and user-impacting events. Present trends and recommendations to both technical staff and executive leadership.
- Embed availability practices into Change Management, Release, and Problem Management workflows, ensuring risks are surfaced and planned for up front.
Key Skills:
- Bachelor's degree or equivalent practical experience in IT, Computer Science, Engineering, or a related field.
- 10+ years of hands-on experience in IT Operations, SRE, or Availability Management within enterprise-scale environments.
- Proven track record managing Service Level Management/Availability Management and in-depth experience implementing these services in alignment with ITIL.
- Deep understanding of IT infrastructure (compute, storage, network), cloud platforms (AWS/Azure), and modern application architectures (microservices, containerization).
- Proven experience with ITIL/ITSM best practices around Availability and Service Level Management, also in relation to Incident Management.
- Experience with monitoring, alerting, and analytics tools (e.g., ServiceNow, PagerDuty, PowerBI, Datadog, Splunk).
- Exceptional written and verbal communication skills; able to translate technical details for senior leaders and non-technical stakeholders.
- Analytical mindset: able to spot trends, correlate data, and identify improvement opportunities independently.
- Executive presence and the confidence to lead discussions, challenge assumptions, and drive decisions in high-visibility scenarios.
- Programming/scripting ability (Python, PowerShell, etc.) is a plus.
- Must be able to work independently with little oversight and progress quickly.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Job Title: ITSM Service Delivery / SRE Consultant
Location: Remote
Mandatory:
- ITIL, AWS/Azure, or related certifications. Candidates holding these certifications will be preferred and prioritized.
- Experience with automation and orchestration tools.
- Familiarity with DevOps/DevSecOps, SRE, and Monitoring/Observability platforms.