Summary: The Data Engineer role at Whitehall Resources involves designing and implementing efficient ETL processes using Python and Databricks, while ensuring data accuracy and consistency. The position requires collaboration with cross-functional teams and ownership of the end-to-end engineering lifecycle. Candidates should have extensive experience in developing data pipelines and working with cloud services, particularly Databricks and Snowflake. This role is classified as inside IR35, necessitating the use of an FCSA Accredited Umbrella Company.
Salary (Rate): negotiable
City: Scotland
Country: UK
Working Arrangements: undetermined
IR35 Status: inside IR35
Seniority Level: undetermined
Industry: IT
Data Engineer
Whitehall Resources are currently looking for a Data Engineer.
This role will be inside IR35, so you will be required to use an FCSA Accredited Umbrella Company.
Key Requirements:
- Collaborating with cross-functional teams to understand data requirements and to design efficient, scalable, and reliable ETL processes using Python and Databricks (see the ETL sketch after this list).
- Developing and deploying ETL jobs that extract data from various sources and transform it to meet business needs.
- Taking ownership of the end-to-end engineering lifecycle, including data extraction, cleansing, transformation, and loading, ensuring accuracy and consistency.
- Creating and managing data pipelines, ensuring proper error handling, monitoring, and performance optimization.
- Working in an agile environment, participating in sprint planning, daily stand-ups, and retrospectives.
- Conducting code reviews, providing constructive feedback, and enforcing coding standards to maintain high code quality.
- Developing and maintaining tooling and automation scripts to streamline repetitive tasks.
- Implementing unit, integration, and other testing methodologies to ensure the reliability of the ETL processes (see the test sketch after this list).
- Utilizing REST APIs and other integration techniques to connect various data sources (see the REST extraction sketch after this list).
- Maintaining documentation, including data flow diagrams, technical specifications, and processes.
- Designing and implementing tailored data solutions to meet customer needs and use cases, spanning from streaming to data lakes, analytics, and beyond within a dynamically evolving technical stack.
- Collaborating seamlessly across diverse technical stacks, including Databricks, Snowflake, etc.
- Developing various components in Python as part of a unified data pipeline framework.
- Contributing towards the establishment of best practices for the optimal and efficient usage of data across various on-prem and cloud platforms.
- Assisting with the testing and deployment of our data pipeline framework utilizing standard testing frameworks and CI/CD tooling.
- Monitoring the performance of queries and data loads, and performing tuning as necessary.
- Providing assistance and guidance during QA and UAT phases to quickly confirm the validity of potential issues, and to determine the root cause and best resolution of verified issues.
- Adhering to Agile practices throughout the solution development process.
- Designing, building, and deploying databases and data stores to support organizational requirements.
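As an illustration of the ETL work described above, here is a minimal PySpark sketch of an extract-transform-load job with basic error handling. The table name (raw_orders), the column names, and the output path are hypothetical placeholders, and the Delta write assumes a Databricks-style environment.

# Minimal ETL sketch; raw_orders, the column names, and the output
# path are hypothetical placeholders for illustration only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

def run_etl() -> None:
    # Extract: read raw data registered in the metastore.
    raw = spark.read.table("raw_orders")

    # Transform: cleanse and standardise to meet business needs.
    clean = (
        raw.dropDuplicates(["order_id"])
           .withColumn("order_date", F.to_date("order_date"))
           .filter(F.col("amount") > 0)
    )

    # Load: write to a curated zone; Delta is the usual format on Databricks.
    clean.write.format("delta").mode("overwrite").save("/mnt/lake/orders_clean")

try:
    run_etl()
except Exception as exc:
    # Basic error-handling hook; a production pipeline would log and alert here.
    print(f"ETL job failed: {exc}")
    raise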
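The testing bullet above refers to reliability checks of this kind; below is a minimal pytest sketch of unit-testing a transformation. The function under test, clean_amounts, is a hypothetical example, not part of any actual codebase.

import pandas as pd

def clean_amounts(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transform: drop rows with non-positive amounts."""
    return df[df["amount"] > 0].copy()

def test_clean_amounts_drops_non_positive_rows():
    df = pd.DataFrame({"amount": [10.45, -5.0, 0.0]})
    result = clean_amounts(df)
    assert len(result) == 1              # only the positive row survives
    assert (result["amount"] > 0).all()  # no non-positive amounts remain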
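Finally, a sketch of extracting records from a REST source with the requests library; the endpoint path, pagination scheme, and bearer-token authentication are all assumptions for illustration and would need to match the real API's contract.

import requests

def fetch_records(base_url: str, token: str) -> list[dict]:
    # Hypothetical paginated endpoint; adjust to the real API's contract.
    records, page = [], 1
    while True:
        resp = requests.get(
            f"{base_url}/records",
            params={"page": page},
            headers={"Authorization": f"Bearer {token}"},
            timeout=30,
        )
        resp.raise_for_status()  # surface HTTP errors rather than ignore them
        batch = resp.json()
        if not batch:            # empty page signals the end of the data
            break
        records.extend(batch)
        page += 1
    return records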
Key Experience:
- 4+ years of experience developing data pipelines and data warehousing solutions using Python and libraries such as Pandas, NumPy, PySpark, etc.
- 3+ years of hands-on experience with cloud services, especially Databricks, for building and managing scalable data pipelines.
- 3+ years of proficiency in working with Snowflake or similar cloud-based data warehousing solutions.
- 3+ years of experience in data development and solutions in highly complex data environments with large data volumes.
- Solid understanding of ETL principles, data modelling, data warehousing concepts, and data integration best practices.
- Familiarity with agile methodologies and the ability to work collaboratively in a fast-paced, dynamic environment.
- Experience with code versioning tools (eg, Git).
- Knowledge of Linux operating systems.
- Familiarity with REST APIs and integration techniques.
- Familiarity with data visualization tools and libraries (eg, Power BI).
- Background in database administration or performance tuning.
- Familiarity with data orchestration tools, such as Apache Airflow (see the DAG sketch after this list).
- Previous exposure to big data technologies (eg, Hadoop, Spark) for large data processing.
- Strong analytical skills, including a thorough understanding of how to interpret customer business requirements and translate them into technical designs and solutions.
- Strong communication skills, both verbal and written; capable of collaborating effectively across a variety of IT and business groups, regions, and roles, and of interacting effectively with all levels.
- Self-starter with a proven ability to manage multiple concurrent projects with minimal supervision; able to manage a complex, ever-changing priority list and resolve conflicts between competing priorities.
- Strong problem-solving skills, with the ability to identify where focus is needed and bring clarity to business objectives, requirements, and priorities.
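For the orchestration point above, here is a minimal Apache Airflow sketch of a daily pipeline (Airflow 2.x API assumed); the dag_id and the task callables are hypothetical placeholders.

from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("extract step")    # placeholder task body

def transform():
    print("transform step")  # placeholder task body

def load():
    print("load step")       # placeholder task body

with DAG(
    dag_id="daily_orders_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",      # Airflow 2.x; newer versions also accept schedule=
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_extract >> t_transform >> t_load  # run the three tasks in sequence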
All of our opportunities require that applicants are eligible to work in the specified country/location, unless otherwise stated in the job description.
Whitehall Resources are an equal opportunities employer and value a diverse and inclusive working environment. All qualified applicants will receive consideration for employment without regard to race, religion, gender identity or expression, sexual orientation, national origin, pregnancy, disability, age, veteran status, or other characteristics.