ITSS-Big Data/Hadoop/Splunk Tester

Posted Today

Negotiable
Outside
Remote
USA

Summary: The ITSS-Big Data/Hadoop/Splunk Tester role at NTT DATA involves validating data pipelines and ensuring data integrity within large-scale distributed systems. The position requires expertise in Databricks, PySpark, and SQL, along with experience testing ETL workflows and validating data quality. This is a remote position with a focus on collaboration with data engineers and analysts. The ideal candidate will have extensive experience in data testing and automation in cloud environments.

Salary (Rate): undetermined

City: Louisville

Country: United States

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

ITSS-Big Data/Hadoop/Splunk Tester - 25-06453
100% Remote
6-Month Duration
W2 or C2C

NTT DATA strives to hire exceptional, innovative and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptable, and forward-thinking organization, apply now.

We are currently seeking an ITSS-Big Data/Hadoop/Splunk Tester to join our team in Louisville, Kentucky (US-KY), United States (US).

Job Description:
We are seeking an experienced Data Tester with strong expertise in Databricks, PySpark, and Big Data ecosystems. The ideal candidate will have a solid background in testing data pipelines, ETL workflows, and analytical data models, ensuring data integrity, accuracy, and performance across large-scale distributed systems.

This role requires hands-on experience with Databricks, Spark-based data processing, and strong SQL validation skills, along with familiarity with data lake / Delta Lake testing, automation, and cloud environments (AWS, Azure, or Google Cloud Platform).
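
For context, the kind of Delta Lake check this work involves might look like the following minimal PySpark sketch of a time-travel schema comparison. The table path is hypothetical, and this is one common pattern rather than the employer's actual test framework.

    # Minimal sketch of a Delta Lake schema-evolution check using time travel.
    # Assumes a Databricks or delta-spark environment; the path is hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    table_path = "/mnt/datalake/curated/orders"  # hypothetical table location

    # Current state of the table.
    current_df = spark.read.format("delta").load(table_path)

    # The same table as of an earlier version, via Delta time travel.
    previous_df = (
        spark.read.format("delta")
        .option("versionAsOf", 0)
        .load(table_path)
    )

    # Columns added since version 0 indicate schema evolution to verify.
    added = set(current_df.columns) - set(previous_df.columns)
    print(f"Columns added since version 0: {sorted(added)}")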

Key Responsibilities:

  • Validate end-to-end data pipelines developed in Databricks and PySpark, including data ingestion, transformation, and loading processes.
  • Develop and execute test plans, test cases, and automated scripts for validating ETL jobs and data quality across multiple stages.
  • Conduct data validation, reconciliation, and regression testing using SQL, Python, and PySpark DataFrame APIs (an illustrative reconciliation sketch follows this list).
  • Verify data transformations, aggregations, and schema consistency across raw, curated, and presentation layers.
  • Test Delta Lake tables for schema evolution, partitioning, versioning, and performance.
  • Collaborate with data engineers, analysts, and DevOps teams to ensure high-quality data delivery across the environment.
  • Analyze Databricks job logs, Spark execution plans, and cluster metrics to identify and troubleshoot issues.
  • Automate repetitive test scenarios and validations using Python / PySpark frameworks.
  • Participate in Agile/Scrum ceremonies, contributing to sprint planning, estimations, and defect triage.
  • Maintain clear documentation for test scenarios, execution reports, and data lineage verification.
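
As referenced above, a reconciliation check of this kind could be sketched as follows in PySpark. The table names and the "amount" column are hypothetical; this shows one common pattern, not a prescribed implementation.

    # Minimal sketch of a raw-to-curated reconciliation check.
    # Table names and the "amount" column are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    raw_df = spark.table("raw.orders")          # hypothetical source table
    curated_df = spark.table("curated.orders")  # hypothetical target table

    # Row-count reconciliation between layers.
    raw_count, curated_count = raw_df.count(), curated_df.count()
    assert raw_count == curated_count, (
        f"Row-count mismatch: raw={raw_count}, curated={curated_count}"
    )

    # Aggregate reconciliation on a numeric measure: totals should agree
    # if no rows were dropped or double-loaded during transformation.
    raw_total = raw_df.agg(F.sum("amount")).first()[0]
    curated_total = curated_df.agg(F.sum("amount")).first()[0]
    assert raw_total == curated_total, "Aggregate drift between raw and curated"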

Required Qualifications:

  • 8+ years of overall experience in data testing / QA within large-scale enterprise data environments.
  • 5+ years of experience in testing ETL / Big Data pipelines, validating data transformations, and ensuring data integrity.
  • 4+ years of hands-on experience with Databricks, including notebook execution, job scheduling, and workspace management.
  • 4+ years of experience in PySpark (DataFrame APIs, UDFs, transformations, joins, and data validation logic).
  • 5+ years of strong proficiency in SQL (joins, aggregations, window functions, and analytical queries) for validating complex datasets.
  • 3+ years of experience with Delta Lake or data lake testing (schema evolution, ACID transactions, time travel, partition validation).
  • 3+ years of experience in Python scripting for automation and data validation tasks.
  • 3+ years of experience with cloud-based data platforms (Azure Data Lake, AWS S3, or Google Cloud Platform BigQuery).
  • 2+ years of experience in test automation for data pipelines using tools like pytest, PySpark test frameworks, or custom Python utilities (a minimal pytest sketch follows this list).
  • 4+ years of experience with data warehousing concepts, data modeling (Star/Snowflake), and data quality frameworks.
  • 4+ years of experience with Agile / SAFe methodologies, including story-based QA and sprint deliverables.
  • 6+ years of experience applying analytical and debugging skills to identify data mismatches, performance issues, and pipeline failures.
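
For illustration, pytest-based automation of a data-quality check could be structured along these lines. The local SparkSession fixture and the table name are assumptions for the sketch, not the team's actual framework.

    # Minimal pytest sketch for data-quality checks on a curated table.
    # The local SparkSession and the table name are hypothetical.
    import pytest
    from pyspark.sql import SparkSession

    @pytest.fixture(scope="session")
    def spark():
        return (
            SparkSession.builder.master("local[2]")
            .appName("dq-tests")
            .getOrCreate()
        )

    def test_no_null_primary_keys(spark):
        df = spark.table("curated.orders")  # hypothetical table
        null_keys = df.filter(df["order_id"].isNull()).count()
        assert null_keys == 0, f"{null_keys} rows have a NULL order_id"

    def test_primary_keys_are_unique(spark):
        df = spark.table("curated.orders")
        assert df.count() == df.select("order_id").distinct().count()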

Preferred Qualifications:

  • Experience with CI/CD for Databricks or data testing (GitHub Actions, Jenkins, Azure DevOps).
  • Exposure to BI validation (Power BI, Tableau, Looker) for verifying downstream reports.
  • Knowledge of REST APIs for metadata validation or system integration testing.
  • Familiarity with big data tools like Hive, Spark SQL, Snowflake, and Airflow.
  • Cloud certifications (e.g., Microsoft Azure Data Engineer Associate or AWS Big Data Specialty) are a plus.

About NTT DATA:

Where required by law, NTT DATA provides a reasonable range of compensation for specific roles. The starting hourly range for this remote role is $40 to $49. This range reflects the minimum and maximum target compensation for the position across all US locations. Actual compensation will depend on several factors, including the candidate's actual work location, relevant experience, technical skills, and other qualifications. This position may also be eligible for incentive compensation based on individual and/or company performance.
This position is eligible for company benefits that will depend on the nature of the role offered. Company benefits may include medical, dental, and vision insurance, a flexible spending or health savings account, life and AD&D insurance, short- and long-term disability coverage, paid time off, employee assistance, participation in a 401k program with company match, and additional voluntary or legally required benefits.

NTT DATA is a $30 billion trusted global innovator of business and technology services. We serve 75% of the Fortune Global 100 and are committed to helping clients innovate, optimize and transform for long-term success. As a Global Top Employer, we have diverse experts in more than 50 countries and a robust partner ecosystem of established and start-up companies. Our services include business and technology consulting, data and artificial intelligence, industry solutions, as well as the development, implementation and management of applications, infrastructure and connectivity. We are one of the leading providers of digital and AI infrastructure in the world. NTT DATA is a part of NTT Group, which invests over $3.6 billion each year in R&D to help organizations and society move confidently and sustainably into the digital future. Visit us at us.nttdata.com

NTT DATA endeavors to make its website accessible to any and all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact us. This contact information is for accommodation requests only and cannot be used to inquire about the status of applications. NTT DATA is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status. For our EEO Policy Statement, please click here. If you'd like more information on your EEO rights under the law, please click here. For Pay Transparency information, please click here.