Negotiable
Summary: The Senior OpenStack DevOps Engineer role involves designing and implementing multi-tenant OpenStack infrastructure while automating the provisioning and lifecycle management of OpenStack services. The position requires extensive experience in operating large-scale OpenStack environments and proficiency in Infrastructure-as-Code tools. The engineer will also contribute to upstream OpenStack projects and ensure high availability and reliability of the platform. This is a remote position based in the USA.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Senior OpenStack DevOps Engineer
California-based client
Remote
Core Skills:
- 8-9+ years of experience operating and automating large-scale OpenStack cloud environments, preferably in community-driven or upstream-contributing teams
- Expert-level proficiency in Infrastructure-as-Code with Ansible, Terraform, or Pulumi
- Strong hands-on knowledge of Neutron, Octavia, Manila, Cinder, Nova, Ironic, and Glance
- Strong Linux (RHEL/CentOS/Ubuntu) systems engineering background with advanced scripting in Python, Bash, or Go
- Fluency with Git, CI/CD pipelines, and automated test frameworks
- Strong understanding of L2/L3 networking, SDN overlays (VXLAN, Geneve), load balancing (Octavia), and software-defined storage (SDS) and storage protocols (iSCSI, NFS, Ceph)
- Demonstrated success building or maintaining multi-region or high-availability OpenStack clusters
- Experience with container technologies (e.g., Docker) and orchestration tools (e.g., Kubernetes) is a plus
- Ability to write technical documentation and contribute to community wikis or knowledge bases
Preferred Qualifications:
- Contributions to upstream OpenStack codebases or participation in SIGs/WGs
- Familiarity with RBAC, Keystone federation, Barbican (secrets management), and Ceilometer/Gnocchi/Aodh (telemetry and alarming)
- Understanding of security best practices for tenant isolation, microsegmentation, and compliance (e.g., CIS, NIST)
- Background in telco, edge cloud, or large enterprise infrastructure environments
- Experience building and maintaining automated test environments for OpenStack upgrades and validation (e.g., Tempest, Rally, or custom test harnesses)
- Bachelor's degree in Computer Science, IT, Engineering, or a related field preferred; equivalent experience and relevant industry certifications will also be considered
Key Responsibilities:
- Design and implement multi-tenant OpenStack infrastructure aligned with open community standards, including network service isolation and tenant-aware resource scheduling
- Automate the provisioning, lifecycle management, and configuration of OpenStack services and supporting components using Ansible, Terraform, or Pulumi
- Extend and maintain integrations for core OpenStack services:
  - Neutron: Custom network plugins/drivers, L2/L3 routing, BGP, DHCP, and tenant network segmentation (VLAN, VXLAN, Geneve).
  - Octavia: HA load balancing with amphora and provider drivers, TLS offloading, Layer 4/7 routing.
  - Manila: Share drivers for NFS/CIFS, tenant access control, storage backends.
  - Cinder: Block storage provisioning, volume snapshots, multi-attach, backend tuning.
  - Nova: Compute resource scheduling, NUMA/CPU pinning, SR-IOV and PCI passthrough.
  - Ironic: Bare metal provisioning workflows, BIOS/IPMI automation, PXE and UEFI boot configuration.
  - Glance: Image lifecycle management, backing store optimization, image caching strategies.
- Contribute to upstream OpenStack projects when needed (bug fixes, driver enhancements, documentation)
- Implement continuous delivery pipelines for OpenStack updates, including patch management, service upgrade testing, and rollback procedures
- Develop automated monitoring, alerting, and healing mechanisms using GitOps principles and observability stacks (e.g., Prometheus, Loki, Grafana)
- Harden services for high availability, disaster recovery, and scale-out operations
- Perform deep-dive troubleshooting and performance analysis of OpenStack services across control and data planes
- Participate in on-call rotation, incident response, and root cause analysis for platform reliability issues