Red Team Safety Classifier Evaluation AI Trainer, $55–$65/hour

Remote · USA · Full-time

Project Overview:

Join a growing community of professionals advancing the next wave of AI. As an AI Trainer, you’ll play a hands-on role by analyzing and providing feedback on data to improve LLM performance, helping ensure that the next generation of AI technology is accurate and trustworthy. We are seeking a skilled AI Safety Evaluator / Red Team Prompt Engineer to work as a project consultant in our AI Labor Marketplace. This is not a full-time employment position; you will be engaged as an expert project consultant on a contract basis.

Location: U.S.-based experts only
Engagement: Part-time, project-based expert evaluation work
Work Type: Remote

Project Summary:

A fast-paced AI safety evaluation sprint focused on adversarial prompt generation and safety classification. Contributors will create and assess high-difficulty, edge-case scenarios, applying structured labeling, severity scoring, and policy-based reasoning to improve model safety performance.

Consultant Engagement Terms:

This is a project-based consultant role. Consultants will be paid on a per-project basis; hourly rates are estimates based on anticipated completion time. Consultants control their own schedule, provide their own tools, and may simultaneously provide services to other vendors/employers (subject to those vendors’ allowances).

Responsibilities:

Contributors will:

Design adversarial prompts that expose edge cases in AI safety systems
Apply structured safety classifications, including category and severity
Write concise, policy-grounded rationales for decisions
Review and validate peer submissions for accuracy and quality
Identify ambiguous or difficult-to-classify scenarios
Maintain consistency across high-volume evaluation tasks

Expected Outcomes:

High-quality adversarial examples suitable for model evaluation
Accurate and consistent safety labels and severity ratings
Clear, defensible rationales aligned with policy guidelines
Reliable QA feedback improving dataset quality

Qualifications:

Experience in AI safety, LLM evaluation, red teaming, or trust & safety
Strong prompt engineering and analytical reasoning skills
Familiarity with safety taxonomies and policy-based classification
Ability to work independently and maintain high-quality output
Prior experience with annotation or evaluation platforms preferred

