£630 per day
Inside IR35
Remote
England, UK
Summary: The PySpark Data Engineer will develop and optimise PySpark batch pipelines that process Parquet data and use Delta Lake for all input/output, implementing validation and billing logic directly in code, tuning performance, and integrating with orchestrators and CI/CD pipelines. The role is remote and inside IR35, with a contract of two months or longer. Candidates should ideally hold CTC clearance and have relevant experience with Azure technologies.
Key Responsibilities:
- Develop and optimise PySpark batch pipelines that process Parquet data and use Delta Lake for all I/O, applying validation, enrichment, and billing calculation logic directly in PySpark code.
- Build reliable PySpark jobs that read/write Delta tables on ADLS Gen2.
- Implement in-code validations (schema, null/range/value checks, referential lookups), routing rejects to dedicated Delta "quarantine" tables (see the sketch after this list).
- Design and implement billing logic: tariff/charge models, tiered pricing, pro-rata handling, VAT/discounts, adjustments, and full auditability.
- Externalise billing and validation rules via versioned JSON configs, ensuring deterministic, idempotent re-runs.
- Optimise Delta operations (MERGE, OPTIMIZE, Z-ORDER, VACUUM) and incremental/CDC merges into Azure SQL.
- Tune performance (partitioning, caching, broadcast joins) and maintain robust retries, checkpoints, and structured logging.
- Integrate with orchestrators (ADF or Container App Orchestrator) and CI/CD pipelines (GitHub Actions).
- Operate securely within private-network Azure environments (Managed Identity, RBAC, Private Endpoints).
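For illustration, the following is a minimal PySpark sketch of the validate, quarantine, and bill pattern described above. The storage paths, column names, and tariff-config schema are hypothetical placeholders, not details taken from this engagement.

```python
# Minimal sketch: read Parquet, validate in code, quarantine rejects,
# price usage from an externalised JSON rule set, write Delta.
# All paths, columns, and the config schema are hypothetical.
import json

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("billing-batch").getOrCreate()

# Versioned rule config, e.g.:
# {"version": "v3", "vat": 0.2,
#  "tiers": [{"up_to": 100, "rate": 0.12}, {"up_to": null, "rate": 0.09}]}
with open("config/billing_rules_v3.json") as fh:
    rules = json.load(fh)
tier1, tier2 = rules["tiers"][0], rules["tiers"][1]

raw = spark.read.parquet(
    "abfss://landing@account.dfs.core.windows.net/usage/").cache()  # scanned twice below

# In-code validation; the isNotNull guards keep the predicate two-valued,
# so every row lands in exactly one of the two branches.
checks = (F.col("meter_id").isNotNull()
          & F.col("units").isNotNull()
          & (F.col("units") >= 0))
valid = raw.filter(checks)
raw.filter(~checks).withColumn("rejected_at", F.current_timestamp()) \
   .write.format("delta").mode("append") \
   .save("abfss://curated@account.dfs.core.windows.net/quarantine/usage/")

# Referential lookup as a broadcast join against a small tariff dimension.
tariffs = spark.read.format("delta").load(
    "abfss://curated@account.dfs.core.windows.net/dim/tariffs/")
enriched = valid.join(F.broadcast(tariffs), "tariff_code", "left")

# Two-tier pricing driven entirely by the config; stamping the rules
# version on each row keeps re-runs auditable.
net = (F.when(F.col("units") <= tier1["up_to"],
              F.col("units") * tier1["rate"])
        .otherwise(tier1["up_to"] * tier1["rate"]
                   + (F.col("units") - tier1["up_to"]) * tier2["rate"]))
billed = (enriched
          .withColumn("net_charge", net)
          .withColumn("vat", F.col("net_charge") * F.lit(rules["vat"]))
          .withColumn("rules_version", F.lit(rules["version"])))

billed.write.format("delta").mode("append").save(
    "abfss://curated@account.dfs.core.windows.net/billing/charges/")
```

An append-only write is shown for brevity; the idempotent re-run requirement above would more likely be met with a keyed MERGE, as sketched after the Key Skills list.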
Key Skills:
- PySpark with Delta Lake (structured APIs, MERGE, schema evolution); a MERGE and table-maintenance sketch follows this list.
- Solid knowledge of Azure Synapse Spark pools or Databricks, ADLS Gen2, and Azure SQL.
- Strong engineering discipline: observability, retries, cost and performance optimisation.
- Great Expectations (for supplementary data quality checks).
- Familiarity with ADF orchestration and containerised Spark workloads.
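In the same spirit, here is a minimal sketch of the idempotent Delta MERGE and routine table maintenance (OPTIMIZE, Z-ORDER, VACUUM) named above, assuming delta-spark is configured on the cluster; the table paths and merge keys are again hypothetical.

```python
# Minimal sketch: keyed MERGE for safe re-runs, then compaction and
# retention housekeeping. Paths and key columns are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("charges-merge").getOrCreate()

target = DeltaTable.forPath(
    spark, "abfss://curated@account.dfs.core.windows.net/billing/charges/")
updates = spark.read.format("delta").load(
    "abfss://curated@account.dfs.core.windows.net/billing/charges_staging/")

# Matching rows are updated in place and new rows inserted, so replaying
# the same batch cannot double-bill an account/period.
(target.alias("t")
 .merge(updates.alias("s"),
        "t.account_id = s.account_id AND t.billing_period = s.billing_period")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())

# Compact small files and co-locate rows on the most common filter key,
# then remove files outside Delta's default 7-day retention window.
target.optimize().executeZOrderBy("account_id")
target.vacuum(168)
```

Keying the MERGE on account and billing period is what makes a re-run deterministic; OPTIMIZE/Z-ORDER and VACUUM are routine maintenance steps rather than per-batch operations in most setups.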
Salary (Rate): £630 daily
City: undetermined
Country: UK
Working Arrangements: remote
IR35 Status: inside IR35
Seniority Level: undetermined
Industry: IT
Detailed Description From Employer:
PySpark Data Engineer - 2 months+ - £600-630pd - Inside IR35 - Remote
Ideally looking for someone who is CTC (Counter Terrorist Check) cleared.
Damia Group Limited acts as an employment agency for permanent recruitment and employment business for the supply of temporary workers. By applying for this job you accept our Data Protection Policy which can be found on our website.
Please note that no terminology in this advert is intended to discriminate on the grounds of a person's gender, marital status, race, religion, colour, age, disability or sexual orientation. Every candidate will be assessed only in accordance with their merits, qualifications and ability to perform the duties of the job.
Damia Group is acting as an Employment Business in relation to this vacancy and in accordance with the Conduct Regulations 2003.