Cloud Engineer

Posted 7 days ago by Totaljobs

£77 Per hour
Inside IR35
Hybrid
Manchester (M1)

Summary: World Wide Technology UK is seeking a hands-on Cloud Engineer with expertise in HPC and AI/ML performance workloads on Google Cloud Platform (GCP). The role involves benchmarking, optimizing, and validating performance across advanced accelerator platforms such as NVIDIA GPUs, AMD GPUs, and Google TPUs. This is a contract position with a duration of 6 months, requiring a hybrid working arrangement in Manchester. The role is classified as inside IR35.

Key Responsibilities:

  • Design and execute HPC & AI performance benchmarks (training, inference, scientific workloads)
  • Provision and optimize GPU/TPU-based infrastructure on GCP (A3/A4, TPU pods)
  • Analyze performance across frameworks (PyTorch, TensorFlow, JAX, CUDA, ROCm)
  • Identify system bottlenecks (compute, memory, network, I/O)
  • Build automation tools for benchmarking and reporting
  • Collaborate with teams to align workloads with optimal architecture

Key Skills:

  • Strong experience with GCP (Compute Engine, GKE, Storage, Networking)
  • Hands-on with NVIDIA (CUDA/NCCL), AMD (ROCm), and TPUs (XLA/JAX/TF)
  • Solid knowledge of HPC concepts (MPI, RDMA, InfiniBand, Slurm/Kubernetes)
  • Experience with performance benchmarks (MLPerf, HPL, NCCL, STREAM)
  • Proficiency in Python, Bash, and IaC tools (Terraform/Ansible)
  • Ability to analyze output from profiling tools (Nsight, TensorBoard, PyTorch Profiler)

Salary (Rate): £77 per hour

City: Manchester

Country: United Kingdom

Working Arrangements: hybrid

IR35 Status: inside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

World Wide Technology (WWT) is a global technology integrator and supply chain solutions provider. Through our culture of innovation, we inspire, build, and deliver business results, from idea to outcome.

World Wide Technology UK is looking for a hands-on Cloud Engineer with strong expertise in HPC and AI/ML performance workloads on Google Cloud Platform (GCP). The role focuses on benchmarking, optimizing, and validating performance across advanced accelerator platforms including NVIDIA GPUs, AMD GPUs, and Google TPUs.

This is a contract role, inside IR35.

Role: HPC AI Cloud Engineer. Contract duration: 6 months.

Location: Manchester, United Kingdom (hybrid, 2-3 days a week onsite)

Candidates will be required to pass background checks before commencing the contract.

Candidates must be eligible to live and work in the specified work location. Some occasional travel may be required. Only successful candidates will be contacted.

EQUAL OPPORTUNITIES

World Wide Technology is committed to equal opportunities and actively seeks applications from all sectors of the community irrespective of sex, race, colour, nationality, ethnic or national origin, disability, marital status, sexual orientation, having responsibility for dependents, age, religion/beliefs, or any other reason which cannot be shown to be justified.