Databricks Engineer

Rate: Negotiable
IR35 Status: Outside
Working Arrangements: Remote
Country: USA

Summary: The Databricks Engineer role involves designing, building, and operating a Data & AI platform based on the Medallion Architecture. The engineer will orchestrate complex data workflows and scalable ELT pipelines, integrating data from various enterprise systems to support machine learning and analytics. This position is critical for ensuring seamless data flow and operational excellence across the organization. The role requires hands-on experience with Databricks, Delta Lake, and Apache Spark for large-scale data engineering.

Key Responsibilities:

  • Design, implement, and optimize end-to-end data pipelines on Databricks, following the Medallion Architecture principles.
  • Build robust and scalable ETL/ELT pipelines using Apache Spark and Delta Lake.
  • Operationalize Databricks Workflows for orchestration and pipeline automation.
  • Connect and ingest data from enterprise systems using APIs and other integration frameworks.
  • Develop data quality checks and integrate monitoring tools to ensure data integrity.
  • Enforce data security best practices and implement compliance measures.
  • Enable data scientists by delivering high-quality data sets for model training.
  • Architect and manage data lakes and optimize data storage for performance.
  • Maintain technical documentation and provide training on the Databricks platform.
  • Submit weekly progress reports and track deliverables against roadmap milestones.

Key Skills:

  • Hands-on experience with Databricks, Delta Lake, and Apache Spark.
  • Deep understanding of ELT pipeline development and orchestration.
  • Experience implementing Medallion Architecture and data versioning.
  • Strong proficiency in SQL, Python, or Scala.
  • Proven experience integrating enterprise platforms into centralized data platforms.
  • Familiarity with data governance and metadata management tools.
  • Experience with Databricks Unity Catalog and MLOps tools.
  • Knowledge of cloud platforms like Azure or AWS.
  • Understanding of data warehouse design and schema modeling.

Salary (Rate): Negotiable

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Databricks Engineer

We are seeking a Databricks Engineer to design, build, and operate a Data & AI platform with a strong foundation in the Medallion Architecture (raw/bronze, curated/silver, and mart/gold layers). This platform will orchestrate complex data workflows and scalable ELT pipelines to integrate data from enterprise systems such as PeopleSoft, D2L, and Salesforce, delivering high-quality, governed data for machine learning, AI/BI, and analytics at scale.

You will play a critical role in engineering the infrastructure and workflows that enable seamless data flow across the enterprise, ensure operational excellence, and provide the backbone for strategic decision-making, predictive modeling, and innovation.

Responsibilities:

1. Data & AI Platform Engineering (Databricks-Centric):

Design, implement, and optimize end-to-end data pipelines on Databricks, following the Medallion Architecture principles.

Build robust and scalable ETL/ELT pipelines using Apache Spark and Delta Lake to transform raw (bronze) data into trusted curated (silver) and analytics-ready (gold) data layers.

Operationalize Databricks Workflows for orchestration, dependency management, and pipeline automation.

Apply schema evolution and data versioning to support agile data development.
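
As a rough illustration of the bronze-to-silver transformation and schema evolution described above, the sketch below uses PySpark and Delta Lake; the catalog, table, and column names are hypothetical.

```python
# Minimal bronze -> silver sketch with PySpark and Delta Lake.
# Catalog, table, and column names are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # returns the active session on Databricks

bronze = spark.read.table("lakehouse.bronze.student_events")

silver = (
    bronze
    .dropDuplicates(["event_id"])                      # basic de-duplication
    .filter(F.col("event_ts").isNotNull())             # drop rows missing a timestamp
    .withColumn("event_date", F.to_date("event_ts"))   # derived column for partition pruning
)

(
    silver.write
    .format("delta")
    .mode("overwrite")
    .option("mergeSchema", "true")                     # additive schema evolution
    .saveAsTable("lakehouse.silver.student_events")
)
```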

2. Platform Integration & Data Ingestion:

Connect and ingest data from enterprise systems such as PeopleSoft, D2L, and Salesforce using APIs, JDBC, or other integration frameworks.

Implement connectors and ingestion frameworks that accommodate structured, semi-structured, and unstructured data.

Design standardized data ingestion processes with automated error handling, retries, and alerting.
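
A minimal sketch of a JDBC ingestion step with the automated retry behaviour described above might look as follows; the connection details, table names, and alerting hook are placeholders, and credentials would normally come from a Databricks secret scope rather than being hard-coded.

```python
# Illustrative JDBC ingestion into the bronze layer with simple retry handling.
# Connection details, table names, and the alerting hook are placeholders.
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def ingest_jdbc_table(jdbc_url: str, source_table: str, target_table: str, max_retries: int = 3) -> None:
    for attempt in range(1, max_retries + 1):
        try:
            df = (
                spark.read.format("jdbc")
                .option("url", jdbc_url)
                .option("dbtable", source_table)
                .option("fetchsize", 10_000)
                .load()
            )
            df.write.format("delta").mode("append").saveAsTable(target_table)
            return
        except Exception:
            if attempt == max_retries:
                # Final failure: re-raise so the Workflow marks the task as failed
                # and the configured alert (email/webhook) fires.
                raise
            time.sleep(30 * attempt)  # simple linear back-off before retrying

# Example invocation (all values hypothetical):
# ingest_jdbc_table("jdbc:oracle:thin:@//ps-host:1521/PSPROD", "PS_STDNT_ENRL", "lakehouse.bronze.ps_stdnt_enrl")
```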

3. Data Quality, Monitoring, and Governance:

Develop data quality checks, validation rules, and anomaly detection mechanisms to ensure data integrity across all layers.

Integrate monitoring and observability tools (e.g., Databricks metrics, Grafana) to track ETL performance, latency, and failures.

Implement Unity Catalog or equivalent tools for centralized metadata management, data lineage, and governance policy enforcement.
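
For example, a lightweight data quality gate along the lines described here could run as the final task of a Databricks Workflow; the table name and rules below are illustrative, and features such as Delta Live Tables expectations could serve the same purpose.

```python
# Simple data quality gate intended to run as the last task of a refresh job.
# The table name and rules are illustrative; real checks would be configuration-driven.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.read.table("lakehouse.silver.student_events")

total_rows = df.count()
null_keys = df.filter(F.col("event_id").isNull()).count()
duplicate_keys = total_rows - df.dropDuplicates(["event_id"]).count()

# Failing the task here fails the Databricks Workflow run, which prevents downstream
# gold-layer jobs from consuming data that violates integrity rules.
assert null_keys == 0, f"{null_keys} rows have a null event_id"
assert duplicate_keys == 0, f"{duplicate_keys} duplicate event_id values found"
```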

4. Security, Privacy, and Compliance:

Enforce data security best practices including row-level security, encryption at rest/in transit, and fine-grained access control via Unity Catalog.

Design and implement data masking, tokenization, and anonymization for compliance with privacy regulations (e.g., GDPR, FERPA).

Work with security teams to audit and certify compliance controls.
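
As one hedged illustration of the masking and access-control measures above, the sketch below hashes a direct identifier before publishing a gold table and applies a Unity Catalog grant; table, column, and group names are placeholders, and real deployments would also consider Unity Catalog row filters and column masks.

```python
# Hashes a direct identifier before publishing a gold table and restricts access via
# Unity Catalog. Table, column, and group names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

students = spark.read.table("lakehouse.silver.students")

masked = (
    students
    .withColumn("student_id_hash", F.sha2(F.col("student_id").cast("string"), 256))
    .drop("student_id")  # the raw identifier never reaches the gold layer
)

masked.write.format("delta").mode("overwrite").saveAsTable("lakehouse.gold.students_masked")

# Fine-grained access control through Unity Catalog (SQL issued from the same job).
spark.sql("GRANT SELECT ON TABLE lakehouse.gold.students_masked TO `data-analysts`")
```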

5. AI/ML-Ready Data Foundation:

Enable data scientists by delivering high-quality, feature-rich data sets for model training and inference.

Support AIOps/MLOps lifecycle workflows using MLflow for experiment tracking, model registry, and deployment within Databricks.

Collaborate with AI/ML teams to create reusable feature stores and training pipelines.
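
A minimal MLflow tracking example of the kind referenced above is sketched below; the experiment path, model, and synthetic data are purely illustrative.

```python
# Minimal MLflow experiment-tracking sketch. On Databricks, runs logged this way appear
# in the workspace experiment UI, and the logged model can be promoted to the registry.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

mlflow.set_experiment("/Shared/student-retention-demo")  # hypothetical experiment path

with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")  # artifact that can be registered later
```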

6. Cloud Data Architecture and Storage:

Architect and manage data lakes on Azure Data Lake Storage (ADLS) or Amazon S3, and design ingestion pipelines to feed the bronze layer.

Build data marts and warehousing solutions using platforms like Databricks.

Optimize data storage and access patterns for performance and cost-efficiency.
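
As a small illustration of the storage optimization mentioned above, the following commands compact and co-locate data in a Delta table; the table and ZORDER columns are hypothetical and would be chosen from actual query patterns.

```python
# Illustrative storage-layout tuning for a gold-layer Delta table. The table and the
# ZORDER columns are placeholders; the right keys depend on how the data is queried.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows that are frequently filtered together.
spark.sql("OPTIMIZE lakehouse.gold.enrollment_facts ZORDER BY (term_id, student_id_hash)")

# Remove data files no longer referenced by the table, subject to the retention window.
spark.sql("VACUUM lakehouse.gold.enrollment_facts")
```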

7. Documentation & Enablement:

Maintain technical documentation, architecture diagrams, data dictionaries, and runbooks for all pipelines and components.

Provide training and enablement sessions to internal stakeholders on the Databricks platform, Medallion Architecture, and data governance practices.

Conduct code reviews and promote reusable patterns and frameworks across teams.

8. Reporting and Accountability:

Submit a weekly schedule of hours worked and progress reports outlining completed tasks, upcoming plans, and blockers.

Track deliverables against roadmap milestones and communicate risks or dependencies.

Required Qualifications:

Hands-on experience with Databricks, Delta Lake, and Apache Spark for large-scale data engineering.

Deep understanding of ELT pipeline development, orchestration, and monitoring in cloud-native environments.

Experience implementing Medallion Architecture (Bronze/Silver/Gold) and working with data versioning and schema enforcement in enterprise-grade environments.

Strong proficiency in SQL, Python, or Scala for data transformations and workflow logic.

Proven experience integrating enterprise platforms (e.g., PeopleSoft, Salesforce, D2L) into centralized data platforms.

Familiarity with data governance, lineage tracking, and metadata management tools.

Preferred Qualifications:

Experience with Databricks Unity Catalog for metadata management and access control.

Experience deploying ML models at scale using MLflow or similar MLOps tools.

Familiarity with cloud platforms like Azure or AWS, including storage, security, and networking aspects.

Knowledge of data warehouse design and star/snowflake schema modeling.