Summary: The Senior Data Engineer will lead the full-scale build of a next-generation data platform using Databricks and Apache Spark. The role involves migrating an existing on-prem Data Lake to a Databricks-based architecture while building ETL pipelines in PySpark and implementing platform best practices. The engineer will collaborate with teams across trading and finance to ensure a successful platform rollout. Strong experience delivering Spark/Databricks implementations and a DevOps-first approach are essential for this position.
Key Responsibilities:
- Lead the full-scale build of a data platform using Databricks and Apache Spark.
- Migrate existing on-prem Data Lake to a Databricks-based architecture.
- Build ETL pipelines in PySpark.
- Implement platform best practices.
- Collaborate with teams across trading and finance.
- Set up data platforms with a DevOps-first approach (IaC, CI/CD, automation).
- Work with AWS-based environments and private cloud storage.
Key Skills:
- Strong experience delivering Spark/Databricks implementations in a lead or senior role.
- Solid hands-on background in PySpark, Spark, and Python.
- Experience setting up data platforms with a DevOps-first approach.
- Exposure to AWS-based environments.
- Familiarity with financial/trading systems and regulated industries.
Salary (Rate): Negotiable
City: London
Country: United Kingdom
Working Arrangements: Hybrid
IR35 Status: Undetermined
Seniority Level: Senior
Industry: IT
Senior Data Engineer – Spark & Databricks Platform Build
Location: London (2 days a week on site)
Contract: 6 months (initial term)
Interview process: 2 stages
twentyAI’s client is building the next generation of its data platform, with Databricks and Apache Spark at the core. A PoC is already in place; now they are looking for someone who has done this before to lead the full-scale build and help roll the platform out across the organisation. The project involves migrating the existing on-prem Data Lake to a Databricks-based architecture while continuing to leverage private cloud storage. You’ll be laying the foundations: building ETL pipelines in PySpark, implementing platform best practices, and working closely with teams across trading and finance.
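To give a concrete flavour of this kind of work, here is a minimal PySpark ETL sketch: read raw trade data, apply light cleansing, and write it out as a partitioned Delta table. The paths, table, and column names are illustrative assumptions, not details from the posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("trades-etl").getOrCreate()

# Read from the raw zone of the lake (hypothetical path).
raw = spark.read.parquet("/mnt/datalake/raw/trades")

# Light cleansing: dedupe on an assumed business key, derive a partition
# column, and drop rows missing a required field.
cleaned = (
    raw.dropDuplicates(["trade_id"])
       .withColumn("trade_date", F.to_date("trade_ts"))
       .filter(F.col("notional").isNotNull())
)

# Write to a Delta table, partitioned by trade date (illustrative target).
(cleaned.write
        .format("delta")
        .mode("overwrite")
        .partitionBy("trade_date")
        .saveAsTable("finance.trades_clean"))
```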
Profile:
- Strong experience delivering Spark/Databricks implementations in a lead or senior role
- Solid hands-on background in PySpark, Spark, and Python
- Experience setting up data platforms with a DevOps-first approach (IaC, CI/CD, automation); a sketch of the CI/CD angle follows this list
- Exposure to AWS-based environments
- Familiarity with financial/trading systems and working in regulated industries (e.g., banking, commodities)
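To illustrate the DevOps-first point above: on a platform like this, a CI pipeline would typically run automated tests over the PySpark transformations on every change. A minimal pytest sketch under that assumption follows; the transform and all names are hypothetical.

```python
import pytest
from pyspark.sql import SparkSession


def dedupe_trades(df):
    # Hypothetical transform under test: keep one row per trade_id.
    return df.dropDuplicates(["trade_id"])


@pytest.fixture(scope="session")
def spark():
    # Local Spark session so the test runs in CI without a cluster.
    return SparkSession.builder.master("local[1]").appName("ci-tests").getOrCreate()


def test_dedupe_trades_removes_duplicates(spark):
    df = spark.createDataFrame(
        [("T1", 100.0), ("T1", 100.0), ("T2", 250.0)],
        ["trade_id", "notional"],
    )
    result = dedupe_trades(df)
    assert result.count() == 2
    assert {row.trade_id for row in result.collect()} == {"T1", "T2"}
```

Run with `pytest` as a CI step; the same pattern extends to data-quality checks against Delta tables.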
Tech Environment:
Primary Platform: Databricks, Apache Spark
Other Tech: dbt, Airflow, Python, PySpark
Cloud: AWS (preferred), private cloud storage
Data Sources: Financial/trading systems