Data Engineer - W2 Only

Rate: Negotiable
IR35 Status: Outside IR35
Location: Remote (USA)

Summary: The Data Engineer role focuses on leveraging AWS and Databricks technologies to build and optimize data pipelines and workflows. The position requires expertise in data integration, automation, and monitoring to ensure efficient data processing and management. Candidates should possess strong analytical skills and a solid understanding of DevOps principles. This role is remote and classified as outside IR35.

Key Responsibilities:

  • Build and maintain data pipelines using core AWS services such as EC2, S3, RDS, Lambda, and VPC.
  • Develop data processing jobs with AWS data services Glue and EMR.
  • Migrate databases to AWS using AWS Database Migration Service (DMS).
  • Apply Change Data Capture (CDC) techniques to replicate changes from source to target databases.
  • Follow AWS security best practices, including IAM and encryption.
  • Process and analyze big data with PySpark and Spark SQL.
  • Build reliable, scalable data pipelines using Delta Live Tables.
  • Use Databricks Notebooks for data analysis and set up Databricks Workflows.
  • Integrate Databricks with AWS services.
  • Build and maintain CI/CD pipelines using Azure Pipelines.
  • Manage version control with Azure Repos and Git.
  • Automate build, test, and deployment processes with PowerShell, Bash, or Python scripting.
  • Manage AWS and Azure infrastructure with Terraform.
  • Integrate on-premises data with AWS and Databricks.
  • Optimize AWS services and Databricks for performance and cost efficiency.
  • Set up monitoring and logging with AWS CloudWatch.

Key Skills:

  • Proficiency with AWS core services (EC2, S3, RDS, Lambda, VPC).
  • Experience with AWS Glue and EMR.
  • Knowledge of AWS DMS.
  • Understanding of CDC techniques.
  • Knowledge of AWS security best practices.
  • Strong analytical skills in PySpark & Spark SQL.
  • Expertise in Delta Live Tables.
  • Experience with Databricks Notebooks and Workflows.
  • Integration experience with AWS services.
  • CI/CD pipeline experience with Azure Pipelines.
  • Version control proficiency with Azure Repos and Git.
  • Scripting skills in PowerShell, Bash, or Python.
  • Experience with Terraform.
  • Data integration skills.
  • Performance and cost optimization, plus monitoring and logging with AWS CloudWatch.

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: Other

Detailed Description From Employer:

AWS (Amazon Web Services):

    1. Core Services: Proficiency with core AWS services like EC2, S3, RDS, Lambda, and VPC.
    2. Data Services: Experience with AWS data services such as Glue and EMR.
    3. AWS DMS: Knowledge of AWS Database Migration Service (DMS) for migrating databases to AWS.
    4. CDC: Understanding of Change Data Capture (CDC) techniques to capture and replicate changes from source databases to target databases (see the sketch after this list).
    5. Security: Understanding of AWS security best practices, IAM, and encryption.
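
To make the DMS/CDC items concrete, here is a minimal sketch (an illustration, not part of the employer's description) that uses boto3 to check the status of a full-load-plus-CDC replication task; the region and task identifier are hypothetical placeholders.

```python
"""Check the status of an assumed AWS DMS full-load + CDC replication task."""
import boto3

REGION = "us-east-1"              # assumed region
TASK_ID = "onprem-to-aws-cdc"     # hypothetical replication task identifier

dms = boto3.client("dms", region_name=REGION)

# Look up the replication task by its identifier.
resp = dms.describe_replication_tasks(
    Filters=[{"Name": "replication-task-id", "Values": [TASK_ID]}],
    WithoutSettings=True,
)

for task in resp["ReplicationTasks"]:
    stats = task.get("ReplicationTaskStats", {})
    print(
        f"{task['ReplicationTaskIdentifier']}: status={task['Status']}, "
        f"type={task['MigrationType']}, "
        f"full_load_progress={stats.get('FullLoadProgressPercent', 0)}%, "
        f"tables_loaded={stats.get('TablesLoaded', 0)}"
    )
```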

Databricks:

    1. PySpark & Spark SQL: Strong analytical skills in PySpark & Spark SQL for big data processing and analysis.
    2. Delta Live Tables: Expertise in using Delta Live Tables for building reliable and scalable data pipelines (see the sketch after this list).
    3. Notebooks: Strong utilization of Databricks Notebooks for data analysis.
    4. Workflows: Setting up and monitoring Databricks Workflows.
    5. Data Integration: Experience integrating Databricks with AWS services.
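
The following is a minimal Delta Live Tables sketch, assuming a Databricks DLT pipeline where the global `spark` session is provided: a bronze table ingested with Auto Loader and a silver table cleaned with PySpark, guarded by a basic expectation. The S3 path, table names, and columns are illustrative placeholders.

```python
# Minimal DLT sketch: bronze ingestion with Auto Loader, silver cleanup with PySpark.
# Runs inside a Databricks DLT pipeline, where `spark` is provided automatically.
import dlt
from pyspark.sql import functions as F

RAW_PATH = "s3://example-bucket/raw/orders/"   # hypothetical landing zone


@dlt.table(comment="Raw orders ingested incrementally with Auto Loader.")
def orders_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load(RAW_PATH)
    )


@dlt.table(comment="Cleaned orders with typed columns and a basic quality check.")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
def orders_silver():
    return (
        dlt.read_stream("orders_bronze")
        .withColumn("order_ts", F.to_timestamp("order_ts"))
        .withColumn("amount", F.col("amount").cast("double"))
    )
```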

DevOps Principles:

    1. CI/CD Pipelines: Building and maintaining CI/CD pipelines using Azure Pipelines.
    2. Version Control: Proficiency with Azure Repos and Git for version control.
    3. Automation: Scripting and automation using PowerShell, Bash, or Python; automating the build, test, and deployment processes (see the sketch after this list).
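
As an illustration of the automation item, here is a hedged Python sketch of the kind of script an Azure Pipelines job could invoke: run the unit tests and, only if they pass, push a pre-built artifact to S3. The artifact path and bucket name are hypothetical.

```python
"""Run unit tests and, on success, upload an assumed build artifact to S3."""
import subprocess
import sys

import boto3

ARTIFACT = "dist/etl_jobs-0.1.0-py3-none-any.whl"   # assumed build output
BUCKET = "example-deploy-bucket"                    # hypothetical bucket


def main() -> int:
    # 1. Test: fail the pipeline step if any unit test fails.
    tests = subprocess.run([sys.executable, "-m", "pytest", "tests/", "-q"])
    if tests.returncode != 0:
        print("Unit tests failed; aborting deployment.")
        return tests.returncode

    # 2. Deploy: upload the artifact for downstream jobs to pick up.
    key = f"releases/{ARTIFACT.split('/')[-1]}"
    boto3.client("s3").upload_file(ARTIFACT, BUCKET, key)
    print(f"Uploaded {ARTIFACT} to s3://{BUCKET}/{key}")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```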

Infrastructure as Code (IaC):

    1. Terraform: Experience with Terraform for managing AWS and Azure infrastructure (see the sketch after this list).
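
For context, a minimal sketch of driving the Terraform CLI from a Python pipeline step follows; it assumes Terraform is on PATH, and the configuration directory name is a placeholder.

```python
"""Drive a Terraform init/plan/apply workflow from Python (e.g., in a CI/CD step)."""
import subprocess

TF_DIR = "infra/aws"   # hypothetical Terraform configuration directory


def tf(*args: str) -> None:
    # Run a Terraform command and raise if it fails, so the pipeline step fails too.
    subprocess.run(["terraform", *args], cwd=TF_DIR, check=True)


tf("init", "-input=false")
tf("plan", "-input=false", "-out=tfplan")
tf("apply", "-input=false", "tfplan")
```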

On-Prem Integration with AWS:

    1. Integrating on-prem data with AWS and Databricks.
    2. Thoroughly testing and validating the data to ensure it has been transferred correctly and is fully functional (see the validation sketch after this list).
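
One way to picture the validation step is the minimal sketch below, which compares row counts between an on-prem table read over JDBC and the migrated Delta table; the JDBC URL, credentials, and table names are placeholders, not real endpoints.

```python
"""Post-migration check: compare row counts between an on-prem source and the Delta target."""
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Count rows in the on-prem source table over JDBC (placeholder connection details).
source_count = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://onprem-host:1433;databaseName=sales")  # assumed host
    .option("dbtable", "dbo.orders")
    .option("user", "svc_readonly")             # placeholder credentials;
    .option("password", "<from-secret-scope>")  # in practice, pull from a secret store
    .load()
    .count()
)

# Count rows in the migrated Delta table (assumed Unity Catalog path).
target_count = spark.read.table("lakehouse.sales.orders").count()

if source_count != target_count:
    raise ValueError(f"Row count mismatch: source={source_count}, target={target_count}")
print(f"Validation passed: {source_count} rows in both source and target.")
```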

Optimization and Monitoring:

    1. Optimization: Optimize AWS services and Databricks for performance and cost-efficiency.
    2. Monitoring: Proficiency in setting up monitoring and logging using tools like AWS CloudWatch to track the performance and health of the complete data flow (see the sketch below).
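
To illustrate the monitoring item, here is a small sketch that publishes a custom pipeline metric to CloudWatch so an alarm can track the health of the end-to-end flow; the namespace, metric, and dimension names are illustrative, not values defined by the employer.

```python
"""Publish a custom pipeline health metric to AWS CloudWatch."""
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region


def report_rows_processed(pipeline: str, rows: int) -> None:
    # Emit a custom metric; a CloudWatch alarm on this metric can flag stalled pipelines.
    cloudwatch.put_metric_data(
        Namespace="DataPlatform/Pipelines",   # hypothetical namespace
        MetricData=[{
            "MetricName": "RowsProcessed",
            "Dimensions": [{"Name": "Pipeline", "Value": pipeline}],
            "Value": float(rows),
            "Unit": "Count",
        }],
    )


report_rows_processed("orders_silver", 125_000)
```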