Summary: The HashiCorp Vault Consultant will conduct architecture reviews and risk and gap analyses, and will assess observability and health metrics for Vault and its dependent services. The consultant will also review logging and traceability so that incidents can be managed effectively, and will prioritize quick wins and scaling paths for improvement. The position is remote. The work is aimed at improving security, resilience, and operational efficiency within the organization.
Key Responsibilities:
- Assess current Vault architecture including HA, storage backend, and network boundaries within AWS/Azure.
- Identify gaps in security policies, authentication methods, and operational ownership.
- Review metrics, health checks, and alerting for Vault/Consul and dependent services.
- Review audit logs, application logs, and any distributed tracing to support incident management and compliance.
- Prioritize low-effort, high-impact fixes and develop a roadmap for larger changes.
Key Skills:
- Experience with HashiCorp Vault and its architecture.
- Strong understanding of security policies and authentication methods.
- Knowledge of observability metrics and health checks.
- Experience with logging, traceability, and incident management.
- Ability to prioritize and implement quick wins and scaling strategies.
Salary (Rate): £80/hr
City: undetermined
Country: undetermined
Working Arrangements: remote
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
Architecture review:
Assess the current Vault deployment (HA, storage backend, replication/federation if used, network boundaries) and how it sits in the current AWS/Azure footprint
Risk and gap analysis:
Identify gaps in security (policies, auth methods, secrets lifecycle), resilience (backup, restore, DR), and operational ownership
Observability and health:
Review metrics, health checks, and alerting tied to Vault/Consul and dependent services; align signals with real failure modes and SLO-minded thresholds
Logging and traceability:
Review audit and application logs (and any distributed tracing) so that incidents support clear root cause analysis (RCA), with correlation IDs across PAM/Vault paths where relevant and retention that meets compliance without drowning operators
Quick wins and scaling paths:
Prioritize low-effort, high-impact fixes (configuration, monitoring, runbooks) and a short roadmap for larger changes (capacity, topology, or platform evolution) without boiling the ocean