Rate: Negotiable | IR35: Outside | Working arrangement: Remote | Country: USA
Summary: The Zabbix Architect role focuses on designing, implementing, and optimizing enterprise-grade monitoring solutions using Zabbix. The ideal candidate will possess extensive hands-on experience with Zabbix architecture and integration with external platforms. This position requires strong technical leadership and the ability to develop scalable monitoring strategies across cross-functional teams. The role is remote and classified as outside IR35.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
We are looking for a Zabbix Architect with deep expertise in designing, implementing, and optimizing enterprise-grade monitoring solutions. The ideal candidate will have extensive hands-on experience with Zabbix architecture and a proven ability to integrate Zabbix with external platforms, tools, and enterprise systems. This role requires strong technical leadership, solution design skills, and the ability to equip cross-functional teams with scalable monitoring strategies.
Key Responsibilities:
Architecture & Design
- Lead the design and deployment of Zabbix architecture to support enterprise-scale monitoring (multi-datacenter, hybrid cloud, and containerized environments).
- Define monitoring strategies and data collection standards for servers, applications, networks, databases, and cloud resources.
- Architect highly available and scalable Zabbix clusters (proxy, distributed nodes, failover, HA/DR strategies).
Implementation & Integration
- Deploy and configure Zabbix server, proxies, agents, templates, and custom checks.
- Integrate Zabbix with third-party systems (ServiceNow, Jira, Ansible, Puppet, Grafana, Slack, Microsoft Teams, email/SMS gateways, etc.).
- Develop and maintain custom integrations using Zabbix API, webhooks, and scripts (Python, Bash, or PowerShell).
- Automate provisioning and configuration of Zabbix using IaC tools (Terraform, Ansible, Puppet, or Chef).
Optimization & Governance
- Establish best practices for thresholds, alerts, dashboards, and escalation policies.
- Tune Zabbix performance for large-scale monitoring (e.g., DB optimization, history/trends housekeeping, partitioning).
- Manage security, role-based access control (RBAC), and audit requirements within Zabbix.
- Ensure the monitoring solution aligns with compliance and operational governance standards.
Collaboration & Leadership
- Work closely with DevOps, SRE, Security, and Infrastructure teams to define monitoring KPIs and SLAs.
- Mentor and provide training to engineers and administrators on Zabbix usage and integrations.
- Act as the subject matter expert (SME) for Zabbix, advising leadership and stakeholders on roadmap, capabilities, and enhancements.
Qualifications:
- Proven experience as a Zabbix Architect / Senior Engineer / SME in large-scale enterprise environments.
- Strong understanding of Zabbix components, architecture, and scaling strategies (HA, clustering, proxies, etc.).
- Hands-on experience with Zabbix API, scripting, and integration development.
- Expertise in monitoring Linux, Windows, cloud (AWS, Azure, Google Cloud Platform), containers (Kubernetes, Docker), and databases.
- Proficiency with automation/configuration tools (Terraform, Ansible, Puppet, Chef).
- Strong knowledge of networking, SNMP, IPMI, JMX, and cloud monitoring protocols.
- Familiarity with ITSM/ITOM tools (ServiceNow, Remedy, Jira Service Management).
- Excellent communication skills and the ability to collaborate across multiple technical teams.
Preferred:
- Zabbix Certified Specialist / Professional.
- Experience integrating Zabbix with Grafana dashboards for advanced visualization.
- Prior experience designing migration strategies from other monitoring tools (Nagios, Prometheus, SolarWinds, Datadog, etc.) to Zabbix.
- Experience in multi-tenant monitoring environments.