Negotiable
Outside
Remote
USA
Summary: This role is a part-time, fully remote position for Professional Software Engineers to support an AI innovation initiative. The job involves evaluating and improving AI tasks related to software engineering through model competency assessment and feedback. Candidates should have recent experience in their field; familiarity with large language models (LLMs) is preferred. The position requires a commitment of 5 hours per day, 2 days a week, for 10 weeks.
Key Responsibilities:
- Create challenging SWE-related prompts based on visuals, datasets, or model outputs.
- Evaluate AI-generated responses using provided rubrics.
- Develop or improve rubrics to ensure they reflect accurate reasoning for complex tasks.
- Review and stress-test model outputs for accuracy and logical consistency within your domain.
- Provide clear, structured feedback based on your subject-matter knowledge.
Key Skills:
- Bachelor's degree or higher.
- Experience in a related field.
- Experience using large language models (LLMs) is ideal.
- Recent work experience in the actual field rather than AI training.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
This is a 10-week, part-time job: 5 hours per day, 2 days a week.
Job description
Company is partnering with an AI innovation initiative seeking Professional Software Engineers to support the evaluation and improvement of software engineering-focused AI tasks.
In this role, you'll use your engineering and systems-level reasoning to assess model competency in real-world technical domains, including debugging, architecture decisions, and system behavior analysis.
What You'll Do
- Create challenging SWE-related prompts based on visuals, datasets, or model outputs.
- Evaluate AI-generated responses using provided rubrics.
- Develop or improve rubrics to ensure they reflect accurate reasoning for complex tasks.
- Review and stress-test model outputs for accuracy and logical consistency within your domain.
- Provide clear, structured feedback based on your subject-matter knowledge.
Ideal candidate profile
The ideal candidate has most recently been working in their actual field rather than in AI training. Experience working with LLMs is a plus.
Daily tasks
Design problems and evaluate AI prompts based on your domain expertise. Provide thorough feedback.
Required skills
A bachelor's degree or higher and experience in a related field.