DataStage Engineer (Google Cloud Platform)

Posted 1 day ago by 1751356036

Negotiable
Outside
Remote
USA

Summary: The DataStage Engineer role focuses on building data pipelines on Google Cloud Platform within a healthcare context, migrating legacy DataStage ETL jobs to a modern stack centered on SnapLogic. Candidates are expected to have hands-on experience developing ETL processes from scratch rather than only maintaining existing workflows. The position emphasizes proactive communication and problem-solving skills, as well as familiarity with data warehousing and orchestration tools. This is a remote 4-6 month contract-to-hire position.

Key Responsibilities:

  • Analyze existing ETL pipelines and jobs, migrating them to a modern stack including SnapLogic, Python, Spark, and Dataflow.
  • Develop new data ingestion and ETL pipelines from scratch, primarily using SnapLogic, along with Python, SQL, Dataflow, and Spark.
  • Support data modeling efforts without owning them; this requires some experience in data modeling and data warehousing fundamentals.
  • Utilize Airflow or Cloud Composer for orchestration and development of new DAGs.
  • Communicate proactively and solve problems, contributing suggestions and asking questions.
  • Work with a variety of technologies in the environment; familiarity with Kafka, Java, Apache Beam, or Alteryx is a plus.

Key Skills:

  • Experience in DataStage and SnapLogic.
  • Proficiency in Python, SQL, Dataflow, and Spark.
  • Knowledge of data warehousing and Google BigQuery.
  • Experience with Airflow or Cloud Composer for orchestration.
  • Strong communication and problem-solving skills.
  • Familiarity with additional technologies such as Kafka, Java, Apache Beam, and Alteryx is a plus.

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

DataStage Engineer (Google Cloud Platform), remote, 4-6 month contract-to-hire
DataStage, Google Cloud Platform, and SnapLogic required
Healthcare background preferred

Pain points: candidates who do not have the correct focus area (building data pipelines) and candidates who have not built something from scratch, having only maintained existing data workflows or processes in production.

  • 2 Developers need experience in DataStage; the team has legacy ETL pipelines that it is migrating. This role will analyze the existing ETL pipelines and jobs and replace them in the modern stack: SnapLogic, Python, some Spark, some Dataflow.
  • These 2 Developers need DataStage experience specifically (the team has not been successful in upskilling engineers from Informatica, Talend, etc.).
  • Experience in SnapLogic is strongly preferred; they have found this to be more amenable to upskilling.
  • Needs experience in Airflow or Cloud Composer orchestration, including development of new DAGs from scratch.
  • Development of data ingestion and ETL pipelines from scratch, using SnapLogic primarily for data pipelines and integrations, but also Python, SQL, Dataflow, and Spark.
  • Needs experience in data warehousing and Google BigQuery.
  • Not responsible for building out visualizations; another team handles those.
  • Will be supporting data modeling, but not owning. Should have some experience in data modeling, data warehousing fundamentals.
  • Understanding of analytics as a whole: how data moves from source to warehouse, to the semantic or reporting layer and models, and on to reporting/BI. The hands-on focus, however, will be on building data pipelines and orchestration.
  • Proactive communicators, inquisitive people, problem-solvers, unafraid to make suggestions, ask questions. "Order taker" and "heads down" types of Engineers will not be a culture fit for the team.
  • They have a list of other technologies in their environment, used in smaller amounts or more dispersed; any would be a "nice to have", e.g. Kafka, Java, Apache Beam, Alteryx, etc.