Summary: The Lead Data Platform Engineer role is a hybrid split of 60% administration and 40% development/support, aimed at scaling data and DataOps infrastructure. The position requires expertise in technologies such as Databricks, Apache Spark, and AWS, with a strong emphasis on solving complex data challenges. The ideal candidate will have hands-on experience in Python and AWS and will contribute to mission-critical data pipelines and integrations. This role also includes providing technical leadership and mentorship to junior developers and third-party teams.
Key Responsibilities:
- Design, develop, and maintain scalable ETL pipelines and integration frameworks (a minimal illustrative sketch follows this list).
- Administer and optimize Databricks and Apache Spark environments for data engineering workloads.
- Build and manage data workflows using AWS services such as Lambda, Glue, Redshift, SageMaker, and S3.
- Support and troubleshoot DataOps pipelines, ensuring reliability and performance across environments.
- Automate platform operations using Python, PySpark, and infrastructure-as-code tools.
- Collaborate with cross-functional teams to support data ingestion, transformation, and deployment.
- Provide technical leadership and mentorship to junior developers and third-party teams.
- Create and maintain technical documentation and training materials.
- Troubleshoot recurring issues and implement long-term resolutions.
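To make the ETL work described above concrete, here is a minimal, illustrative PySpark sketch of the kind of pipeline the role covers: reading raw JSON from S3, applying basic cleansing, and writing a partitioned Delta table. The bucket paths, column names, and job name are hypothetical placeholders, not part of the posting.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical job: raw orders land in S3 as JSON and are curated into Delta.
spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read the raw landing-zone data (placeholder path)
raw = spark.read.json("s3://example-landing-zone/orders/")

# Transform: de-duplicate, enforce types, and drop invalid rows
orders = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount") > 0)
)

# Load: append to a Delta table partitioned by order date (placeholder path)
(
    orders.withColumn("order_date", F.to_date("order_ts"))
          .write.format("delta")
          .mode("append")
          .partitionBy("order_date")
          .save("s3://example-curated-zone/orders_delta/")
)

On Databricks, logic like this would typically run as a scheduled job; outside Databricks it assumes the delta-spark package is available on the cluster.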
Key Skills:
- Bachelor's or Master's degree in Computer Science or related field.
- 5+ years of experience in data engineering or platform administration.
- 3+ years of experience in integration framework development with a strong emphasis on Databricks, AWS, and ETL.
- Strong programming skills in Python and PySpark.
- Expertise in Databricks, Apache Spark, and Delta Lake.
- Proficiency in AWS CloudOps and cloud security, including configuration, deployment, and monitoring.
- Experience in real-time streaming pipeline development, distributed data processing, and data orchestration (illustrated in the streaming sketch after this list).
- Knowledge of various data processing techniques and technologies including Kafka, Neo4j, MongoDB, PostgreSQL, and more.
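As an illustration of the real-time streaming and Kafka experience listed above, the sketch below shows a minimal Spark Structured Streaming job that consumes JSON events from Kafka and appends them to a Delta table. The broker address, topic, schema, and paths are hypothetical placeholders, and it assumes the Kafka connector and Delta Lake are available on the cluster (as they are on Databricks).

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Hypothetical event payload
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("value", DoubleType()),
])

# Read the raw Kafka stream (the value column arrives as bytes)
stream = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker-1:9092")  # placeholder broker
         .option("subscribe", "events")                        # placeholder topic
         .load()
)

# Parse the JSON payload into typed columns
events = (
    stream.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
          .select("e.*")
)

# Append to a Delta table, with checkpointing for fault-tolerant progress tracking
query = (
    events.writeStream.format("delta")
          .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
          .outputMode("append")
          .start("s3://example-bucket/delta/events/")
)
query.awaitTermination()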
Salary (Rate): negotiable
City: undetermined
Country: USA
Working Arrangements: hybrid
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Lead Data Platform Engineer
Location: Miramar, Dallas or Remote
We are looking for a Sr. Data Platform Engineer who thrives in a hybrid role (60% administration, 40% development/support) to help us scale our data and DataOps infrastructure. You'll work with cutting-edge technologies like Databricks, Apache Spark, Delta Lake, AWS CloudOps, and cloud security, while supporting mission-critical data pipelines and integrations. If you're a hands-on engineer with strong Python skills, deep AWS experience, and a knack for solving complex data challenges, we want to hear from you.
Key Responsibilities
- Design, develop, and maintain scalable ETL pipelines and integration frameworks.
- Administer and optimize Databricks and Apache Spark environments for data engineering workloads.
- Build and manage data workflows using AWS services such as Lambda, Glue, Redshift, SageMaker, and S3.
- Support and troubleshoot DataOps pipelines, ensuring reliability and performance across environments.
- Automate platform operations using Python, PySpark, and infrastructure-as-code tools.
- Collaborate with cross-functional teams to support data ingestion, transformation, and deployment.
- Provide technical leadership and mentorship to junior developers and third-party teams.
- Create and maintain technical documentation and training materials.
- Troubleshoot recurring issues and implement long-term resolutions.
Minimum Qualifications
- Bachelor's or Master's degree in Computer Science or related field.
- 5+ years of experience in data engineering or platform administration.
- 3+ years of experience in integration framework development with a strong emphasis on Databricks, AWS, and ETL.
Required Technical Skills
- Strong programming skills in Python and PySpark.
- Expertise in Databricks, Apache Spark, and Delta Lake.
- Proficiency in AWS CloudOps and cloud security, including configuration, deployment, and monitoring.
- Meta-driven, real-time streaming pipeline development; distributed data processing; data orchestration (see the orchestration sketch below); unstructured, semi-structured, and structured data processing; flexible data modeling and semantic engineering; knowledge graphs; data-as-a-service; data observability; quality frameworks; data syndication; data fabric development; data marketplace development; cognitive search engine development; master data management; data governance; and data migration.
- Experience with specific tech stacks including: Kafka, Databricks, Delta Lake, Python, Pandas, Spark, PySpark, Airflow, Neo4j, GraphDB, MongoDB, PostgreSQL, OWL, Python functions, New Relic, Grafana, OpenLineage, Apache Atlas, Databricks Unity Catalog, DLT, Great Expectations, and Databricks Delta Sharing.
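Since the stack above includes Airflow and data orchestration, here is a minimal, illustrative Airflow 2.x DAG showing how an ETL step and a downstream data-quality check might be chained. The DAG id, schedule, and task bodies are hypothetical stubs, not taken from the posting.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_etl(**context):
    # Placeholder: trigger the Spark/Databricks ETL job here
    print("ETL step triggered")


def run_quality_checks(**context):
    # Placeholder: run data-quality checks (e.g. a Great Expectations suite)
    print("Quality checks triggered")


with DAG(
    dag_id="orders_pipeline",        # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    etl = PythonOperator(task_id="run_etl", python_callable=run_etl)
    quality = PythonOperator(task_id="run_quality_checks", python_callable=run_quality_checks)

    etl >> quality  # quality checks run only after the ETL step succeeds

In practice the ETL task would typically submit a Databricks job run (for example via the Databricks provider for Airflow) rather than a Python stub.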