Golang Developer

Posted 2 days ago by 1761975713

Negotiable

Outside

Remote

USA

Summary: The role of a Golang Developer involves developing and maintaining an inference platform for large language models, with a focus on cloud and distributed services. Candidates should possess strong Golang skills and experience with modern cloud environments, as well as familiarity with large language models and GPU technologies. The position requires effective communication skills for collaboration with team members and documentation purposes. This is a remote position based in the USA, classified as outside IR35.

Key Responsibilities:

Develop and maintain an inference platform for serving large language models optimized for various GPU platforms.
Work on complex AI and cloud engineering projects through the entire product development lifecycle (PDLC).
Build tooling and observability to monitor system health, and build auto-tuning capabilities.
Build benchmarking frameworks to test model serving performance for system and infrastructure tuning.
Build native cross-platform inference support across NVIDIA and AMD GPUs for various model architectures.
Contribute to open source inference engines to enhance performance on DigitalOcean cloud.

Key Skills:

Proficiency in Golang for building scalable and performant backend services.
Deep experience in modern cloud environments and distributed systems.
Experience with Large Language Models (LLMs) and hosting them for inference.
Strong verbal and written communication skills.
Experience with benchmarking tools for evaluating LLM inference.
Familiarity with LLM performance metrics.
Experience with inference engines like vLLM, SGLang, and Modular Max.
Familiarity with distributed inference serving frameworks.
Experience with AMD and NVIDIA GPUs and related software.
Knowledge of distributed inference optimization techniques.

Salary (Rate): undetermined

City: undetermined

Country: USA

Working Arrangements: remote

IR35 Status: outside IR35

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

Role: Golang Developer

Location: Remote

Job Description:

We are looking for developers with general cloud services and distributed services experience, with LLM experience as a secondary skill. GPU experience is now low on the list of preferred skills: Dedicated Inference Service

Required Skills:

Proficiency in Golang for building scalable and performant backend services.
Deep experience building services in modern cloud environments on distributed systems, e.g., containerization (Kubernetes, Docker), infrastructure as code, CI/CD pipelines, APIs, authentication and authorization, data storage, deployment, logging, monitoring, and alerting.
Experience working with Large Language Models (LLMs), particularly hosting them to run inference.
Strong verbal and written communication skills. Your job will involve communicating with local and remote colleagues about technical subjects and writing detailed documentation.

Preferred Skills:

Experience with building or using benchmarking tools for evaluating LLM inference across various model, engine, and GPU combinations.
Familiarity with LLM performance metrics such as prefill throughput, decode throughput, time per output token (TPOT), and time to first token (TTFT).
Experience with one or more inference engines, e.g., vLLM, SGLang, or Modular Max.
Familiarity with one or more distributed inference serving frameworks, e.g., llm-d, NVIDIA Dynamo, or Ray Serve.
Experience with AMD and NVIDIA GPUs, using software such as CUDA, ROCm, AITER, NCCL, and RCCL.
Knowledge of distributed inference optimization techniques such as tensor/data parallelism, KV cache optimizations, and smart routing.

What You'll Be Working On:

Develop and maintain an inference platform for serving large language models optimized for the various GPU platforms they will be run on.
Work on complex AI and cloud engineering projects through the entire product development lifecycle (PDLC): ideation, product definition, experimentation, prototyping, development, testing, release, and operations.
Build tooling and observability to monitor system health, and build auto-tuning capabilities.
Build benchmarking frameworks to test model serving performance to guide system and infrastructure tuning efforts.
Build native cross-platform inference support across NVIDIA and AMD GPUs for a variety of model architectures.
Contribute to open source inference engines to make them perform better on DigitalOcean cloud.
