Negotiable
Outside
Remote
USA
Summary: The Sr. AI DevOps Engineer role focuses on developing an AI-driven toolkit to enhance the efficiency of technical support teams. This position requires expertise in Kubernetes and virtualization, with a strong emphasis on integrating AI technologies. The role is fully remote, targeting candidates residing in the US, specifically in CST or EST time zones. The ideal candidate will be self-motivated and capable of translating complex systems into assistive tools for support enablement.
Key Responsibilities:
- Design and implement an AI-powered toolkit integrated with Large Language Models (LLMs) and Retrieval Augmented Generation (RAG).
- Empower technical support teams across various areas including Kubernetes fundamentals, virtualization, mixed workload management, and resource optimization.
- Develop automation solutions for provisioning, scaling, and orchestration in web-scale environments.
- Conduct real-time log analysis and system patching for troubleshooting and updates.
- Create custom configurations for networking and storage tailored to enterprise deployments.
Key Skills:
- Deep experience in Kubernetes, OpenShift 4.18, and virtualization stacks.
- Proficiency in designing AI/LLM-based tools, particularly RAG frameworks.
- Strong understanding of support enablement systems and documentation standards.
- Self-motivated and comfortable with remote contract work.
- Ability to translate complex systems into AI-powered assistive tools.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Role: Sr. AI DevOps Engineer
Location: Remote (Must reside in the US in CST or EST)
Why Join Us?
Be at the forefront of AI and infrastructure enablement. In this role, you'll help build a next-gen AI-driven toolkit that transforms how technical support teams operate by streamlining onboarding, reducing dependency on traditional documentation, and enabling faster troubleshooting at scale.
Top Benefits:
- 100% remote flexibility
- Long-term stability in a cutting-edge contract role
- Work at the intersection of AI, virtualization, and enterprise-scale automation
- Contribute to tools used by global technical support teams
Key Responsibilities:
Design and implement an AI-powered toolkit integrated with Large Language Models (LLMs) and Retrieval Augmented Generation (RAG), built to support hybrid IT environments. This comprehensive toolkit will empower technical support teams across areas including:
- Kubernetes Fundamentals: Pods, deployments, namespaces, eviction, and network policies
- Virtualization: OpenShift Virtualization, KVM, and VM orchestration
- Mixed Workload Management: Best practices for managing VMs and containers together
- Resource Optimization: Memory, CPU, and storage allocation across workloads
- Networking & Integration: Kubernetes networking, user-defined networks (UDNs), and F5 Service Proxy for Kubernetes (SPK)
- Persistent Storage & Fault Tolerance: Storage tuning, replication, and HA strategies
- Live Migration: Seamless VM/database migration with uptime focus
- Automation: Provisioning, scaling, and orchestration in web-scale environments
- AI/LLM Tooling: Contextual retrieval, chatbot pipelines, and automation flows
- Troubleshooting & Updates: Real-time log analysis and system patching
- Custom Configurations: Tailored networking/storage for enterprise deployments
- High Availability: Resilient design for fault-tolerant multi-platform systems
Key Skills:
- Deep experience in Kubernetes, OpenShift 4.18, and virtualization stacks
- Proficiency in designing AI/LLM-based tools, particularly RAG frameworks
- Strong understanding of support enablement systems and documentation standards
- Self-motivated, autonomous, and comfortable with remote contract work
- Capable of translating complex systems into AI-powered assistive tools
Role Details:
- Remote-first: Flexibility to work from any location within the US (CST or EST time zones)
- Engagement Type: Long-term contract with consistent deliverables
- Impact Focus: Enable rapid onboarding and support scalability for global teams
Ready to lead innovation at the crossroads of AI and cloud infrastructure? Apply now and help build the toolkit that's redefining support operations in hybrid environments.