Senior Data Engineer (Distributed Data Processing)

Posted Today by Xcede

Negotiable
Undetermined
Remote
Great Work, England, United Kingdom

Summary: The Senior Data Engineer role focuses on distributed data processing within a data-intensive SaaS platform in a regulated industry. This hands-on position requires expertise in Spark-based pipelines and Python engineering, with responsibilities including the design and optimization of large-scale data workflows. The role also involves mentoring other engineers and collaborating on client-facing projects. It is not related to machine learning or data science.

Salary (Rate): Undetermined

City: Great Work

Country: United Kingdom

Working Arrangements: Remote

IR35 Status: Undetermined

Seniority Level: Senior

Industry: IT

Detailed Description From Employer:

Senior Data Engineer (Distributed Data Processing) | UK (O/IR35), Belgium, Netherlands or Germany (B2B) | Fully Remote

We’re looking for a Senior Data Engineer to join a data-intensive SaaS platform operating in a complex, regulated industry. This is a hands-on senior IC role focused on distributed data processing, Spark-based pipelines, and Python-heavy engineering. You’ll be working on large-scale batch data workflows that power pricing, forecasting, and operational decision-making systems. The role requires strong engineering judgement, the ability to operate autonomously, and the confidence to mentor others while delivering under tight timelines. This is not an ML, Data Science, or GenAI role.

What You’ll Be Doing

  • Design, build, and evolve large-scale distributed data pipelines using Spark / PySpark.
  • Develop production-grade Python data workflows that implement complex business logic.
  • Work with Databricks for job execution, orchestration, and optimisation.
  • Own and optimise cloud-based data infrastructure (AWS preferred, Azure also relevant).
  • Optimise data workloads for performance, reliability, and cost.
  • Collaborate with engineers, domain specialists, and delivery teams on client-facing projects.
  • Take ownership of technical initiatives and lead by example within the team.
  • Support and mentor other engineers.

Must-Have Experience

  • Proven experience as a Senior Data Engineer.
  • Strong Python software engineering foundation.
  • Hands-on Spark experience in production (PySpark essential).
  • Real-world experience using Databricks for data pipelines (Spark depth matters most).
  • Experience with large-scale or parallel data processing.
  • Ownership of cloud infrastructure (AWS and/or Azure).
  • Comfortable operating with senior-level autonomy and responsibility.
  • Experience mentoring or supporting other engineers.

Nice-to-Have Experience

  • Experience working with time-series data.
  • Background in utilities, energy, or other data-heavy regulated industries.
  • Exposure to streaming technologies (Kafka, event-driven systems), though the role is primarily batch-focused.