Negotiable
Undetermined
Undetermined
London Area, United Kingdom
Summary: The role involves supporting the design and execution of an adversarial testing framework for AI models, particularly in high-risk environments. Candidates must have experience in safety evaluation and testing frameworks for generative models. The position is a three-month contract with potential for extension, reporting to a strategic lead. The role requires a combination of technical expertise in AI safety and hands-on testing capabilities.
Key Responsibilities:
- Define adversarial prompts and develop a taxonomy of attack types and failure modes.
- Translate real user behavior into structured attack vectors and define evaluation methodologies.
- Augment existing prompt libraries with adversarial variants and design systematic prompt mutation strategies.
- Run adversarial tests against selected model versions and classify failure types.
- Produce formal testing reports and present findings to technical and policy audiences.
Key Skills:
- Strong background in AI safety, red teaming, or adversarial ML.
- Experience testing LLMs and generative models.
- Familiarity with prompt injection, jailbreaks, and boundary attacks.
- Understanding of multimodal models (text, image, video).
- Ability to design rigorous testing methodologies and perform failure analysis.
- Clear communication skills across technical and policy stakeholders.
Salary (Rate): undetermined
City: London Area
Country: United Kingdom
Working Arrangements: undetermined
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
Company Description T3 partners with organizations deploying production AI systems in high-risk environments where failures can have significant regulatory, operational, or safety implications. With a team instrumental in shaping global AI standards and governance frameworks, T3 provides AI assurance services to major Big Tech companies and complex enterprises. This is a three month contract with the opportunity to extend. Candidates must have direct experience working with frontier labs or large technology companies on safety evaluation, red teaming, or adversarial testing. Experience designing and operationalising testing frameworks for production grade generative models is essential.
Role Description Support the design and execution of a structured adversarial testing framework across LLM, image, and video generation models. Responsible for developing the SOP, adversarial methodology, prompt expansion strategy, and delivery of formal testing reports aligned to client policy. It requires deep safety domain expertise combined with hands on testing capability. The role will report to a strategic lead.
Core Responsibilities
- Adversarial Testing Framework Design
- Define what constitutes truly adversarial prompts
- Develop taxonomy of attack types and failure modes
- Translate real user behaviour into structured attack vectors
- Define evaluation methodology across LLM, image, and video models
- Create severity and risk classification frameworks
- Test Set Development & Expansion
- Augment existing prompt libraries with adversarial variants
- Design systematic prompt mutation strategies
- Develop model specific adversarial patterns
- Define coverage metrics for risk domains
- Identify blind spots in current test libraries
- Evaluation & Execution
- Run adversarial tests against selected model versions
- Classify and analyse failure types
- Map outputs against internal policy requirements
- Produce structured evaluation findings
- Support reuse of client internal evaluation platform
- Reporting & Stakeholder Communication
- Produce formal testing reports
- Present findings to technical and policy audiences
- Clearly distinguish methodology from execution
- Define remediation pathways and improvement loops
Required Skills & Experience
- Technical
- Strong background in AI safety, red teaming, or adversarial ML
- Experience testing LLMs and generative models
- Familiarity with prompt injection, jailbreaks, and boundary attacks
- Understanding of multimodal models (text, image, video)
- Experience defining structured evaluation frameworks
- Knowledge of benchmark design and failure taxonomy creation
- Domain Knowledge
- Responsible AI principles
- Risk classification frameworks
- Safety policy interpretation
- Alignment evaluation concepts
- Model evaluation lifecycle
- Analytical
- Ability to design rigorous testing methodology
- Strong failure analysis capability
- Quantitative and qualitative evaluation skills
- Ability to convert abstract risks into concrete test cases
- Soft Skills
- Comfortable operating in ambiguous scope environments
- Clear communicator across technical and policy stakeholders
- Able to work without creating single point dependency risk
- Structured thinker
Ideal Background
- AI safety researcher
- Red teaming lead for generative AI systems
- AI evaluation specialist
- Experience in frontier or production generative systems
- Experience with model benchmarking and structured evaluation labs