Negotiable
Outside
Remote
USA
Summary: The Dataiku Administrator role involves managing and optimizing the Dataiku DSS platform within an AWS cloud environment. The candidate will be responsible for ensuring the stability, performance, and scalability of analytics infrastructure, which supports various data science and business intelligence initiatives. Key tasks include platform administration, Spark integration, cloud infrastructure management, and providing technical support to users. The position requires strong expertise in Apache Spark and AWS services, along with collaboration with data teams and IT.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
# Remote - US (EST/CST timezones)
Job Summary:
We are seeking a Dataiku Administrator to manage and optimize our Dataiku DSS platform deployed in an AWS cloud environment. The ideal candidate will have hands-on experience with Apache Spark, cloud-native architecture, and enterprise data governance. This role is critical to ensuring the stability, performance, and scalability of our analytics infrastructure, which supports a wide range of data science and business intelligence initiatives.
Key Responsibilities:
Platform Administration:
- Manage and maintain the Dataiku DSS platform in AWS, including upgrades, patching, and configuration.
- Monitor system health, performance, and resource utilization across Spark clusters and Dataiku nodes.
- Implement and maintain user access controls, roles, and permissions in alignment with SOX and other compliance requirements.
Spark Integration & Optimization:
- Configure and tune Spark execution environments for optimal performance within Dataiku workflows.
- Troubleshoot Spark-related job failures and performance bottlenecks.
- Collaborate with data engineers and scientists to optimize Spark recipes and pipelines.
Cloud Infrastructure Management:
- Work closely with DevOps and Cloud Engineering teams to manage EC2 instances, S3 buckets, IAM roles, and networking components.
Automation & Monitoring:
- Develop and maintain automation scripts for platform monitoring, user provisioning, and job scheduling.
- Integrate with logging and alerting tools (e.g., CloudWatch, ELK, Prometheus) to proactively detect and resolve issues.
Collaboration & Support:
- Provide technical support and training to Dataiku users across departments.
- Act as a liaison between data teams and IT to ensure platform alignment with organizational goals.
Required Qualifications:
- Strong expertise in Apache Spark, including performance tuning and troubleshooting.
- Hands-on experience with AWS services (EC2, S3, IAM, CloudFormation, etc.).
- Experience with Kubernetes, EMR, or other Spark orchestration tools.
- Proficiency in Python, Bash, and/or other scripting languages.
- Familiarity with CI/CD pipelines and infrastructure-as-code tools (e.g., Terraform, GitLab CI).
Preferred Qualifications:
- Experience administering Dataiku DSS in a production environment, or a Dataiku certification (Administrator or Advanced Designer).
- Knowledge of Dataiku plugin development and API integration.
- Experience supporting enterprise-scale analytics platforms.
- Understanding of data governance, security, and compliance frameworks (e.g., SOX, GDPR).