BigID Developer (Python + NLP SpaCy)

Posted 2 weeks ago by 1750858377

Negotiable
Outside
Hybrid
USA

Summary: We are looking for a BigID Developer with 6-8 years of experience to join our data privacy and discovery team. The candidate should have expertise in the BigID platform, Regular Expressions, Python scripting, and NLP frameworks like SpaCy. This role involves automating data classification, building intelligent data models, and ensuring compliance with privacy regulations. The position can be performed remotely or in a hybrid setup.

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: hybrid

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Job Title: BigID Developer (RegEx & Python)
Location: San Ramon, CA / Remote
Duration: 3+ Months
Exp. Level: 6-8 years

Position Summary:

We are seeking a highly skilled BigID Developer with 6-8 years of experience to join our data privacy and discovery team. The ideal candidate will have expertise in the BigID platform, Regular Expressions (RegEx), and Python scripting, plus hands-on experience with NLP frameworks such as SpaCy. You will play a key role in automating data classification, building intelligent data models, and ensuring compliance with global privacy regulations.


Key Responsibilities:

  • Implement and customize BigID for enterprise-wide data discovery, classification, and privacy enforcement.
  • Create and optimize RegEx patterns for custom data identification rules.
  • Develop and integrate Python scripts for automation, data parsing, and API workflows.
  • Leverage SpaCy NLP models for intelligent entity recognition, sensitive data detection, and unstructured data classification.
  • Extend BigID capabilities using custom logic and advanced data classification techniques.
  • Connect and configure diverse data sources (databases, data lakes, SaaS, file systems) in BigID for scan orchestration.
  • Collaborate with data governance and security teams to align with compliance frameworks (GDPR, CCPA, HIPAA).
  • Monitor BigID job performance, resolve system issues, and fine-tune NLP-based classification logic.
  • Prepare detailed documentation, workflows, and operational handbooks.

Required Skills & Experience:

  • 6-8 years of professional experience in data engineering, data governance, or privacy engineering roles.
  • 2+ years of hands-on experience with BigID implementation, policy configuration, and data source integration.
  • Strong command of Regular Expressions (RegEx) for sensitive data discovery patterns.
  • Advanced Python scripting skills including data manipulation, API integration, and automation.
  • Proficiency with SpaCy or similar NLP tools (e.g., NLTK, Transformers) for entity recognition and unstructured data processing.
  • Familiarity with REST APIs, JSON, and data ingestion pipelines.
  • Experience working with structured/unstructured data across cloud and on-prem platforms (e.g., AWS S3, Azure Blob, Google Cloud Platform, SQL/NoSQL databases).

Nice to Have:

  • Experience with BigID App Framework or BigID Studio for building custom connectors or workflows.
  • Exposure to AI/ML-driven data classification or custom NLP models.
  • Cloud platform certifications or hands-on experience (AWS, Azure, Google Cloud Platform).
  • Working knowledge of IAM tools, data security policies, and privacy-enhancing technologies.

Education & Certification:

  • Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field.
  • Preferred:
    • BigID Certified Professional (if available)
    • Python certifications (PCEP, PCAP)
    • Data privacy certifications (e.g., CIPP/US, CIPT)