Summary: As a Software Engineering Evaluator, you will play a crucial role in enhancing large language models (LLMs) by creating and refining high-quality code datasets. You will collaborate with AI researchers and engineers to improve model performance on software engineering tasks. The position requires a strong background in software engineering and full-stack development, and includes evaluating AI-generated code and designing automated verification systems.
Key Responsibilities:
- Curate and create high-quality code datasets for AI model training
- Write, review, and correct code in:
  - Python
  - JavaScript (including ReactJS)
  - C/C++
  - Java
  - Rust
  - Go
- Evaluate AI-generated code for:
  - Efficiency
  - Scalability
  - Reliability
- Collaborate with cross-functional teams to improve AI-driven coding solutions
- Build agents/tools to:
  - Verify code correctness
  - Detect error patterns
- Analyze and evaluate AI capabilities across the software development lifecycle:
  - Prototyping
  - Architecture design
  - API design
  - Production deployment
  - Monitoring & maintenance
- Design automated verification systems for software engineering tasks
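By way of illustration, an automated verification system of the kind described above often reduces to running candidate code against known-good assertions in an isolated process. The sketch below is a minimal, hypothetical example of that pattern; the function name `verify_solution` and the subprocess-based sandboxing are assumptions for illustration, not part of this role's specification.

```python
# Illustrative sketch: check AI-generated code against test assertions
# by running it in a separate interpreter process. All names here are
# hypothetical examples, not a prescribed design.
import os
import subprocess
import sys
import tempfile
import textwrap


def verify_solution(code: str, test_code: str, timeout: float = 5.0) -> bool:
    """Run candidate `code` plus `test_code` assertions in a child Python
    process; return True only if every assertion passes within the timeout."""
    program = textwrap.dedent(code) + "\n" + textwrap.dedent(test_code) + "\n"
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", delete=False
    ) as f:
        f.write(program)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path], capture_output=True, timeout=timeout
        )
        # Non-zero exit means an assertion or runtime error fired.
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        # Runaway or hanging code counts as a failure.
        return False
    finally:
        os.unlink(path)


# Example: a correct and an incorrect candidate for the same task.
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
tests = "assert add(2, 3) == 5"
print(verify_solution(good, tests))  # True
print(verify_solution(bad, tests))   # False
```

Running each candidate in a child process keeps the verifier itself safe from crashes or infinite loops in generated code, which is why subprocess isolation is a common first step before heavier sandboxing.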
Key Skills:
- 5+ years of software engineering experience
- Strong full-stack development experience
- Expertise in building scalable, production-grade systems
- Deep understanding of:
  - Software architecture
  - System design
  - Debugging and code quality
- Strong communication skills (written & verbal)
Salary (Rate): £80
City: undetermined
Country: undetermined
Working Arrangements: remote
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
Job Description:
Role Overview
As a Software Engineering Evaluator, you will contribute to training and improving large language models (LLMs) by creating, evaluating, and refining high-quality code datasets. You'll collaborate with AI researchers and engineers to enhance model performance across software engineering tasks.
Preferred Skills
- Experience working with AI/ML or LLM-based systems
- Familiarity with benchmarking and evaluating AI models
- Experience building developer tools or code analysis systems