Summary: The Application Cloud Engineer role involves maintaining cloud-based applications and infrastructure on AWS, requiring collaboration with development, operations, and security teams to ensure optimal performance and security. The position is remote and necessitates the ability to pass a background screening for Public Trust clearance. The ideal candidate should possess extensive experience with AWS and cloud deployment strategies, along with proficiency in various scripting languages. Responsibilities include managing AWS infrastructure, troubleshooting production systems, and implementing enhancements to containerized environments.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Application - Cloud Engineer
- Location: Remote
- Eastern, Central and Mountain time zones.
- Citizenship is required. Candidates must be able to pass an enhanced background screen (criminal, financial, drug) for Public Trust clearance.
Description:
The client is seeking a skilled and motivated Application Cloud Engineer to join its team. The ideal candidate will be responsible for maintaining cloud-based applications and infrastructure on AWS. You will work closely with development, operations and security teams to ensure the scalability, performance and security of cloud applications.
Responsibilities:
- Provision and manage AWS infrastructure as code (IaC) with tools such as Terraform and CloudFormation
- Monitor and troubleshoot production systems using AWS CloudWatch and other observability tools
- Collaborate with developers to containerize and deploy applications using ECS and Lambda
- Deploy applications across multiple environments (dev, staging, prod) and ensure consistency and stability
- Monitor deployments and system health using CloudWatch and other tools
- Implement rollback strategies and manage version control during deployments
- Troubleshoot and resolve deployment issues and improve pipeline performance and reliability
- Proficient with Python, Bash, YAML/JSON, Node.js, Lambda functions
- Perform daily health checks using the AWS CLI or scheduled Lambda scripts, and log/report the results
- Document deployment processes and infrastructure architecture
- Familiarity with image registries like Amazon ECR and CI/CD pipelines for container deployment
- Collaborate with development and DevOps teams to ensure applications are stateless and fault-tolerant
- Implement enhancements to containerized environments on ECS, focusing on scalability, performance and observability
- Enhance container orchestration strategies, including auto-scaling, rolling deployments and upgrades
- Support feature branch testing, merge request validation and artifact promotion workflows
- Ensure pipeline security and compliance through automated code scanning and approval gates
- Remediate OS-level, container and dependency vulnerabilities
- Orchestrate failover and restoration of ECS/EKS services, Lambda functions, databases and other infrastructure components
- Test and document regional failover playbooks and recovery runbooks
- Ensure compliance with RTO (Recovery Time Objective) and RPO (Recovery Point Objective) requirements
- Participate in on-call rotations to support 24/7 production systems and respond to incidents as they arise
- Diagnose and resolve production issues related to cloud services, container orchestration, databases and CI/CD pipelines
- Follow and improve incident response playbooks, escalation procedures and communication workflows
- Automate common operational tasks and improve alert accuracy to reduce on-call fatigue
- Log incidents, changes, and operational metrics in tracking systems
Required Qualifications:
- BA/BS in IT, Computer Science or related field (or equivalent work experience may be accepted in lieu of the degree)
- 5+ years of IT experience. 2+ years of hands-on experience with AWS and cloud-based deployment strategies
- Proficient in scripting languages like Python, Bash and Node.js
- Hands-on experience with CI/CD tools (GitHub, GitLab, Kubernetes, DevOps, CI)
- Knowledge of disaster recovery planning and implementation
- AWS or relevant Cloud certifications (AWS DevOps Engineer, Solutions Architect Associate)
- Solid understanding of cloud architecture principles, autoscaling strategies and load balancing
- Proficient with monitoring, alerting and logging tools
- Strong written and verbal communication skills for technical and non-technical stakeholders
- Excellent analytical and problem-solving skills
- Must be able to obtain and maintain a Public Trust clearance
Preferred Qualifications:
- Familiarity with container orchestration (Docker, ECS, Kubernetes)
- Knowledge of ITIL practice or incident management frameworks