Data Engineer (Python, Spark, Pandas, SQL, GX Open Source Module)

Posted 3 days ago by UNITECH on LinkedIn

Negotiable
Edinburgh, Scotland, United Kingdom

Summary: The Data Engineer role focuses on using Python, Spark, Pandas, SQL, and the Great Expectations Open Source Module to build and maintain data pipelines. The position requires ensuring data quality and supporting business intelligence needs within a dynamic team environment. The ideal candidate will be responsible for optimizing data processing and implementing validation checks. This role is central to data-driven innovation within the organization.

Key Responsibilities:

  • Design, develop, and optimize scalable ETL/ELT pipelines using Python and Apache Spark.
  • Work with Pandas to process, clean, and transform large datasets efficiently.
  • Write and optimize complex SQL queries for data extraction, transformation, and analysis.
  • Implement data validation and quality checks using the Great Expectations (GX) Open Source Module.
  • Monitor and troubleshoot data pipeline performance issues, ensuring high availability and reliability.

Key Skills:

  • Strong proficiency in Python, with experience in writing efficient and scalable code for data processing.
  • Hands-on experience with Apache Spark for large-scale data processing.
  • Expertise in Pandas for data manipulation and transformation.
  • Solid understanding of SQL and relational database concepts, with experience in query optimization.
  • Experience working with the Great Expectations (GX) Open Source Module for data validation and quality assurance.
  • Familiarity with cloud-based data platforms (AWS, Azure, GCP) is a plus.
  • Strong problem-solving skills and the ability to work in a fast-paced, collaborative environment.

Salary (Rate): undetermined

City: Edinburgh

Country: United Kingdom

Working Arrangements: undetermined

IR35 Status: undetermined

Seniority Level: undetermined

Industry: IT