HPC Google Cloud Platform INFRA Consultant

Posted 5 days ago by 1757581934

Negotiable
Outside
Remote
USA

Summary: The role of Google Cloud Platform Infra SME focuses on leveraging hands-on experience with High-Performance Computing (HPC) within the Google Cloud environment. The position requires expertise in various Google Cloud products and technologies, particularly in infrastructure management and optimization. Candidates must possess a Google Cloud Professional Architect Certification and demonstrate proficiency in tools such as Terraform and GKE. The role is remote and emphasizes practical experience with large-scale GKE clusters and machine learning products.

Key Responsibilities:

  • Apply the expertise behind the Google Cloud Professional Architect Certification to guide infrastructure projects.
  • Implement and manage HPC solutions on Google Cloud Platform.
  • Optimize and troubleshoot large GKE clusters with thousands of nodes.
  • Work with GPU/TPU hardware and ML-specific Google Cloud products.
  • Develop infrastructure as code (IaC) using Terraform.
  • Contribute hands-on to multiple Google Cloud Platform projects.

Key Skills:

  • Google Cloud Professional Architect Certification.
  • Hands-on experience with HPC and Google Cloud Platform.
  • Proficiency in Terraform and GKE.
  • Knowledge of networking and storage solutions.
  • Experience with Python and libraries such as NumPy, pandas, PyTorch, and JAX.
  • Familiarity with Nvidia and/or Google TPU hardware.
  • Experience with ML-specific Google Cloud products like Parallelstore and Hyperdisk ML.

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Title: Google Cloud Platform Infra SME (HPC Experience)

Location: Remote, USA

Google Cloud Platform Infra SME with hands-on HPC experience:

Google Cloud Professional Architect Certification - Mandatory

Google Cloud Platform SME with hands-on HPC experience (with Infra)

GPU / TPU Experience

Hands-on familiarity with IaC is a must

Terraform

GKE

Networking

Storage

Python

Library familiarity with: NumPy/pandas/PyTorch/JAX, including optimization

Experience with Nvidia and/or Google TPU hardware in GCE and GKE

ML-specific Google Cloud Platform products: Parallelstore, Hyperdisk ML, TCP Direct

Troubleshooting and optimization of large (1000s of nodes) GKE clusters

Prior experience working with Google PSO is a plus. Hands-on experience on more than 3 Google Cloud Platform projects involving the infrastructure background described above.