Summary: The Senior Data Engineer leads the implementation of modern data solutions on Databricks to support research and operations at Tevogen Bio. The role requires extensive Big Data Engineering experience, particularly with Apache Spark and Databricks, to design and manage scalable batch and streaming data pipelines. It also involves collaborating with data engineers, analysts, and scientists, and advising on cloud data architecture and CI/CD best practices. This is a remote position focused on optimizing data processes and ensuring data governance.
Key Responsibilities:
- Lead implementation of modern data solutions on Databricks to support Tevogen Bio's research and operations.
- Design and manage scalable batch and streaming data pipelines using Delta Lake, declarative pipelines, and Autoloader for real-time ingestion.
- Implement data governance, access control, and lineage using Unity Catalog.
- Optimize ETL/ELT processes and Spark workloads for high performance.
- Collaborate with data engineers, analysts, and scientists to ensure business and technical alignment.
- Provide guidance on cloud data architecture and CI/CD best practices.
Key Skills:
- 8-10 years in Big Data Engineering with Apache Spark; 3-4 years hands-on with Databricks.
- Expertise in Delta Lake, declarative pipelines, batch/streaming data integration, and Autoloader.
- Proficient in PySpark, SQL, Spark optimization, and data modeling.
- Experience in cloud platforms and DevOps/CI/CD practices.
- Strong communication and stakeholder management skills.
Salary (Rate): Negotiable
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Position: Senior Data Engineer - Databricks/Databricks Lead
Remote
Exp: 12+yrs