Summary: This Databricks Data Engineer role centers on developing and optimizing large-scale data engineering solutions on the Databricks Data Intelligence Platform. The engineer will focus on workflow orchestration, performance optimization, and data governance, using PySpark, Delta Lake, and Azure services, and will collaborate with cloud architects and data analysts to design end-to-end workflows for analytics and machine learning. The position requires a strong technical background and experience managing data pipelines and governance practices.
Salary (Rate): undetermined
City: undetermined
Country: United Kingdom
Working Arrangements: undetermined
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
We are looking for a Databricks Data Engineer with strong expertise in developing and optimizing large-scale data engineering solutions within the Databricks Data Intelligence Platform. The ideal candidate will have practical experience in workflow orchestration, performance optimization, and data governance, alongside broad proficiency in PySpark, Delta Lake, and Azure services.
Key Responsibilities:
- Design, build, and maintain robust data pipelines using Databricks notebooks, Jobs, and Workflows for batch and streaming data processing.
- Optimize Spark and Delta Lake performance on Databricks clusters through efficient cluster configuration, adaptive query execution, and caching strategies.
- Conduct performance testing and cluster tuning to ensure cost-efficient and high-performing workloads.
- Implement data quality, lineage tracking, and access control policies aligned with Databricks Unity Catalog and data governance best practices (a grant example follows this list).
- Develop PySpark applications for ETL, data transformation, and analytical use cases, adhering to modular and reusable design principles.
- Create and manage Delta Lake tables with a focus on ACID compliance, schema evolution, and time travel for versioned data management (this and the previous item are sketched in the first example after this list).
- Integrate Databricks solutions with Azure services including Azure Data Lake Storage, Key Vault, and Azure Functions.
- Collaborate with cloud architects and data analysts to design end-to-end workflows supporting analytics, machine learning, and reporting use cases.
- Support CI/CD deployment of Databricks assets using Azure DevOps or similar automation frameworks.
- Maintain detailed technical documentation on architecture, performance benchmarks, and governance configurations.
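To make the PySpark and Delta Lake items above concrete, here is a minimal sketch, assuming a Databricks notebook where `spark` is preconfigured and Delta is the default table format; the ADLS path, catalog objects, and column names are hypothetical:

```python
# Batch ETL into a Delta table with schema evolution, then a time-travel
# read. Assumes a Databricks runtime; all names below are illustrative.
from pyspark.sql import functions as F

# Extract: raw JSON from a (hypothetical) ADLS Gen2 landing zone.
raw = spark.read.json("abfss://landing@examplelake.dfs.core.windows.net/orders/")

# Transform: small, composable steps keep the pipeline modular and testable.
cleaned = (
    raw.filter(F.col("order_id").isNotNull())
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: append to a governed table; mergeSchema permits additive schema
# evolution when the source gains new columns.
(cleaned.write.format("delta")
        .mode("append")
        .option("mergeSchema", "true")
        .saveAsTable("main.sales.orders"))

# Time travel: read the table as of an earlier version, e.g. for audits
# or reproducible backfills; Delta's transaction log keeps this ACID-safe.
v0 = spark.read.option("versionAsOf", 0).table("main.sales.orders")
```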
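Similarly, the Unity Catalog governance item might translate into grants like these; the `data_analysts` group and the securables are assumptions, and in practice such grants are usually managed as code:

```python
# Illustrative Unity Catalog access-control grants (hypothetical names).
spark.sql("GRANT USE CATALOG ON CATALOG main TO `data_analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.sales TO `data_analysts`")
spark.sql("GRANT SELECT ON TABLE main.sales.orders TO `data_analysts`")
# Unity Catalog records lineage automatically for queries on governed
# tables, so no extra code is needed for lineage tracking itself.
```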
Required Skills and Experience:
- In-depth knowledge of Databricks Data Intelligence Platform and multi-cloud ecosystem integration.
- Experience configuring, scheduling, and monitoring Databricks Jobs and Workflows.
- Strong proficiency in PySpark, including advanced data transformation, schema management, and optimization techniques.
- Solid understanding of Delta Lake architecture, transactional processing, and incremental data pipeline design.
- Proven ability to conduct Spark performance tuning and cluster optimization based on workload profiles (see the tuning sketch after this list).
- Experience implementing fine-grained data governance with Unity Catalog, access policies, and data lineage tracking.
- Hands-on experience with Azure Cloud components such as Data Lake Storage (Gen2), Key Vault, and Azure Functions.
- Familiarity with CI/CD frameworks for Databricks asset deployment and environment automation.
- Strong analytical and troubleshooting skills in distributed data environments.
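As a rough illustration of the tuning skill above: adaptive query execution and shuffle settings are session configuration, caching serves repeated reads, and file compaction speeds Delta scans. Table and column names are hypothetical, and AQE is already on by default in recent Databricks runtimes:

```python
# Common Spark/Delta tuning knobs (illustrative; verify against the
# runtime in use, since defaults differ between versions).
spark.conf.set("spark.sql.adaptive.enabled", "true")                     # adaptive query execution
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")  # merge small shuffle partitions
# "auto" is Databricks-specific; open-source Spark expects an integer here.
spark.conf.set("spark.sql.shuffle.partitions", "auto")

# Cache a DataFrame that several downstream steps re-read.
hot = spark.table("main.sales.orders").where("order_date >= '2024-01-01'")
hot.cache()
hot.count()  # action to materialize the cache

# Compact small files and co-locate rows on a frequent filter column.
spark.sql("OPTIMIZE main.sales.orders ZORDER BY (customer_id)")
```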
Preferred Qualifications:
- Experience supporting enterprise-scale Databricks environments with multiple workspaces and governed catalogs.
- Knowledge of Azure Synapse, Power BI, or related analytics services.
- Understanding of cost optimization strategies for data compute on Databricks clusters (a sample cluster spec follows below).
- Excellent problem-solving skills, technical communication, and cross-functional collaboration abilities.
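On the cost-optimization point, one common lever is the job cluster definition itself: autoscaling caps cluster size while Azure spot capacity with on-demand fallback lowers compute cost. Below is a sketch of a Jobs API `new_cluster` payload expressed as a Python dict; the runtime version, VM size, and worker counts are examples to adapt per workload:

```python
# Sketch of a cost-aware Databricks job cluster spec (example values only).
job_cluster = {
    "spark_version": "15.4.x-scala2.12",                 # example LTS runtime
    "node_type_id": "Standard_D4ds_v5",                  # example Azure VM size
    "autoscale": {"min_workers": 2, "max_workers": 8},   # scale with load, cap spend
    "azure_attributes": {
        "first_on_demand": 1,                            # keep the driver on-demand
        "availability": "SPOT_WITH_FALLBACK_AZURE",      # spot workers with fallback
    },
}
```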