Research Scientist, AI Agent Robustness
Scale AI
AI Agent | Full-time | Hybrid | San Francisco, CA | $180,000 - $300,000
About this role
Scale AI is hiring a Research Scientist to focus on AI Agent Robustness within Scale Labs. You will develop evaluation methodologies, red-teaming frameworks, and safety benchmarks for autonomous AI agents deployed across enterprise environments.
This role sits at the intersection of AI safety and applied research. You will design experiments to stress-test AI agent behavior, identify failure modes in agentic workflows, and build systematic approaches to measure and improve agent reliability.
You will collaborate with cross-functional teams to publish findings and advance the field of AI agent evaluation.
Requirements
- PhD in Computer Science, ML, or a related field
- 3+ years of research experience in AI safety or evaluation
- Publications at top ML conferences
- Experience with LLM evaluation and red-teaming
- Strong Python skills
- Familiarity with agentic AI frameworks
About Scale AI
The data platform for AI, providing high-quality training data and evaluation.
Location: San Francisco, CA