Data Engineer AWS

Posted 5 days ago by Alphayotta

Negotiable
London Area, United Kingdom

Summary: The Data Engineer role focuses on building and maintaining data pipelines using PySpark and Python, with the specific task of migrating Jupyter Notebooks to an in-house PySpark framework for AWS Glue jobs. The position requires strong documentation skills and the ability to perform reconciliation checks to ensure data integrity. Candidates should have a solid background in data engineering concepts and experience with AWS and modern data warehouse platforms.

Key Responsibilities:

  • Build PySpark and Python data pipelines.
  • Write and maintain documentation of technical architecture.
  • Identify areas for quick wins to improve the experience of end users.
  • Migrate an existing Jupyter Notebook from OmniAI to the firm’s in-house PySpark framework for orchestrating AWS Glue jobs, within an estimated three months.
  • Ensure the final output of the new pipeline matches that of the existing pipeline.
  • Perform reconciliation checks; identify and resolve any differences (a sketch of such a check follows this list).
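
For reference, the reconciliation check can be sketched in PySpark as follows. This is a minimal illustration only, assuming both pipelines write Parquet to S3; the bucket paths and the app name are hypothetical placeholders, not the firm’s actual locations.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("pipeline-reconciliation").getOrCreate()

    # Hypothetical output locations for the legacy and migrated pipelines.
    legacy_df = spark.read.parquet("s3://example-bucket/legacy-pipeline/output/")
    new_df = spark.read.parquet("s3://example-bucket/glue-pipeline/output/")

    # Cheapest signal first: compare row counts.
    legacy_count, new_count = legacy_df.count(), new_df.count()
    if legacy_count != new_count:
        print(f"Row counts differ: legacy={legacy_count}, new={new_count}")

    # Row-level diff in both directions; both empty means the outputs match.
    only_in_legacy = legacy_df.exceptAll(new_df)
    only_in_new = new_df.exceptAll(legacy_df)

    if only_in_legacy.count() or only_in_new.count():
        print("Rows only in legacy output:")
        only_in_legacy.show(20, truncate=False)
        print("Rows only in new output:")
        only_in_new.show(20, truncate=False)
    else:
        print("Outputs match exactly")

Note that exceptAll keeps duplicate rows in the comparison, which matters for reconciliation: a plain subtract() deduplicates and could silently hide repeated records.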

Key Skills:

  • Formal training or certification in data engineering concepts and 3+ years of applied experience.
  • Proficiency in one or more programming languages such as Python and Java.
  • Ability to design and implement scalable data pipelines for batch and real-time data processing.
  • Experience with AWS.
  • Experience working with modern data warehouse platforms like Amazon Redshift.
  • Experience in developing, debugging, and maintaining code in a large corporate environment.
  • Overall knowledge of the Software Development Life Cycle.
  • Solid understanding of agile methodologies and of practices such as CI/CD, application resiliency, and security.
  • Certifications in relevant technologies or platforms, such as AWS Certified Data Engineer - Associate, can be advantageous.
  • Relevant industry experience, preferably in a data engineering role.

Salary (Rate): undetermined

City: London Area

Country: United Kingdom

Working Arrangements: undetermined

IR35 Status: undetermined

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Job Responsibilities

  • Build PySpark and Python data pipelines.
  • Write and maintain documentation of technical architecture.
  • Identify areas for quick wins to improve the experience of end users.
  • Migrate an existing Jupyter Notebook from OmniAI to the firm’s in-house PySpark framework for orchestrating AWS Glue jobs, within an estimated three months (see the Glue job sketch after this list).
  • Ensure the final output of the new pipeline matches that of the existing pipeline.
  • Perform reconciliation checks; identify and resolve any differences.
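
For orientation, a migrated notebook typically lands in something like the following AWS Glue job skeleton. This is a generic sketch of standard Glue PySpark boilerplate, not the firm’s in-house framework; the database, table, and S3 path names are placeholders.

    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    # Standard Glue job bootstrap: the job name arrives as a job argument.
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    spark = glue_context.spark_session

    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read a source table registered in the Glue Data Catalog (placeholder names).
    source = glue_context.create_dynamic_frame.from_catalog(
        database="example_db", table_name="example_table"
    )

    # The notebook's transformation logic moves here, expressed on a DataFrame.
    df = source.toDF().dropDuplicates()

    # Write the result back to S3 as Parquet for the reconciliation step.
    df.write.mode("overwrite").parquet("s3://example-bucket/glue-pipeline/output/")

    job.commit()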

Required Qualifications, Capabilities, and Skills

  • Formal training or certification in data engineering concepts and 3+ years of applied experience.
  • Proficiency in one or more programming languages such as Python and Java.
  • Ability to design and implement scalable data pipelines for batch and real-time data processing.
  • Experience with AWS.
  • Experience working with modern data warehouse platforms like Amazon Redshift.
  • Experience in developing, debugging, and maintaining code in a large corporate environment.
  • Overall knowledge of the Software Development Life Cycle.
  • Solid understanding of agile methodologies and of practices such as CI/CD, application resiliency, and security.

Preferred Qualifications, Capabilities, and Skills

  • Certifications in relevant technologies or platforms, such as AWS Certified Data Engineer - Associate, can be advantageous.
  • Relevant industry experience, preferably in a data engineering role.