Summary: The GenAI Data Automation Engineer will design and implement AI-driven automation solutions within AWS and Azure hybrid environments. This role focuses on building scalable data pipelines and automations that integrate cloud services and Generative AI for analytics and customer engagement. The ideal candidate will possess a mission-focused mindset and apply critical thinking to solve technical challenges effectively.
Key Responsibilities:
- Design and maintain data pipelines in AWS using S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB, and Step Functions.
- Develop ETL/ELT processes to move data from multiple data systems including DynamoDB and SQL Server (AWS) and between AWS and Azure SQL systems.
- Integrate AWS Connect and Nice inContact CRM data into the enterprise data pipeline for analytics and operational reporting.
- Engineer and enhance ingestion pipelines with Apache Spark, Flume, and Kafka for real-time and batch processing into Apache Solr and AWS OpenSearch platforms.
- Leverage Generative AI services and frameworks (AWS Bedrock, Amazon Q, Azure OpenAI, Hugging Face, LangChain) to create automated processes for vector generation and embedding from unstructured data to support Generative AI models.
- Automate data quality checks, metadata tagging, and lineage tracking.
- Enhance ingestion/ETL with LLM-assisted transformation and anomaly detection.
- Build conversational BI interfaces that allow natural language access to Solr and SQL data.
- Develop AI-powered copilots for pipeline monitoring and automated troubleshooting.
- Implement SQL Server stored procedures, indexing, query optimization, profiling, and execution plan tuning to maximize performance.
- Apply CI/CD best practices using GitHub, Jenkins, or Azure DevOps for both data pipelines and GenAI model integration.
- Ensure security and compliance through IAM, KMS encryption, VPC isolation, RBAC, and firewalls.
- Support Agile DevOps processes with sprint-based delivery of pipeline and AI-enabled features.
Key Skills:
- BS in Computer Science or a related field with 2+ years of data engineering and automation experience.
- Hands-on experience with LLMs and Generative AI frameworks using AWS Bedrock, Azure OpenAI, or open-source platforms.
- Hands-on experience with SQL, SSIS, Python, Spark, Bash, PowerShell, and AWS/Azure CLIs.
- Experience with AWS services such as S3, RDS/SQL Server, Glue, Lambda, EMR, and DynamoDB.
- Familiarity with Apache Flume, Kafka, and Solr for large-scale data ingestion and search.
- Experience integrating REST API calls into data pipelines and workflows.
- Familiarity with JIRA, GitHub / Azure DevOps / Jenkins for SDLC and CI/CD automation.
- Strong troubleshooting and performance optimization skills in SQL, Spark, or other data engineering solutions.
- Experience operationalizing Generative AI (GenAI Ops) pipelines, including model deployment, monitoring, retraining, and lifecycle management for LLMs and AI-enabled data workflows.
- Good communication and presentation skills.
- Ability to obtain Public Trust clearance.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
GenAI Data Automation Engineer
Our client is seeking a GenAI Data Automation Engineer to design and implement innovative, AI-driven automation solutions across AWS and Azure hybrid environments. You will be responsible for building intelligent, scalable data pipelines and automations that integrate cloud services, enterprise tools, and Generative AI to support mission-critical analytics, reporting, and customer engagement platforms. The ideal candidate is mission-focused and delivery-oriented, and applies critical thinking to create innovative functions and solve technical issues.
In this role, you will:
- Design and maintain data pipelines in AWS using S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB, and Step Functions.
- Develop ETL/ELT processes to move data from multiple data systems including DynamoDB and SQL Server (AWS) and between AWS and Azure SQL systems.
- Integrate AWS Connect and Nice inContact CRM data into the enterprise data pipeline for analytics and operational reporting.
- Engineer and enhance ingestion pipelines with Apache Spark, Flume, and Kafka for real-time and batch processing into Apache Solr and AWS OpenSearch platforms.
- Leverage Generative AI services and frameworks (AWS Bedrock, Amazon Q, Azure OpenAI, Hugging Face, LangChain) to:
  - Create automated processes for vector generation and embedding from unstructured data to support Generative AI models.
  - Automate data quality checks, metadata tagging, and lineage tracking.
  - Enhance ingestion/ETL with LLM-assisted transformation and anomaly detection.
  - Build conversational BI interfaces that allow natural language access to Solr and SQL data.
  - Develop AI-powered copilots for pipeline monitoring and automated troubleshooting.
- Implement SQL Server stored procedures, indexing, query optimization, profiling, and execution plan tuning to maximize performance.
- Apply CI/CD best practices using GitHub, Jenkins, or Azure DevOps for both data pipelines and GenAI model integration.
- Ensure security and compliance through IAM, KMS encryption, VPC isolation, RBAC, and firewalls.
- Support Agile DevOps processes with sprint-based delivery of pipeline and AI-enabled features.
Required Qualifications:
- BS in Computer Science or a related field with 2+ years of data engineering and automation experience.
- Hands-on experience with LLMs and Generative AI frameworks using AWS Bedrock, Azure OpenAI, or open-source platforms.
- Hands-on experience with SQL, SSIS, Python, Spark, Bash, PowerShell, and AWS/Azure CLIs.
- Experience with AWS services such as S3, RDS/SQL Server, Glue, Lambda, EMR, and DynamoDB.
- Familiarity with Apache Flume, Kafka, and Solr for large-scale data ingestion and search.
- Experience integrating REST API calls into data pipelines and workflows.
- Familiarity with JIRA, GitHub / Azure DevOps / Jenkins for SDLC and CI/CD automation.
- Strong troubleshooting and performance optimization skills in SQL, Spark, or other data engineering solutions.
- Experience operationalizing Generative AI (GenAI Ops) pipelines, including model deployment, monitoring, retraining, and lifecycle management for LLMs and AI-enabled data workflows.
- Good communication and presentation skills.
- US citizenship and ability to obtain Public Trust clearance.