AI Outperforms Law Professors in Stanford Evaluation of Contract Law Responses
Stanford researchers found law professors preferred AI-generated answers over peer responses in 75 percent of blind evaluations for contract law questions.
Stanford Law School researchers conducted blind evaluations with 16 professors across U.S. institutions, comparing AI and peer answers to 40 contract law questions in nearly 3,000 matchups (Stanford Law, https://law.stanford.edu/press/ai-outperforms-law-professors-in-stanford-law-study/). AI responses won 75 percent of comparisons.
Professors flagged AI outputs as pedagogically harmful in 3.5 percent of cases, versus 12 percent for human answers. Performance was consistent across models, including commercial systems and NotebookLM, after length calibration.
The results align with patterns documented on the MMLU benchmark (Hendrycks et al., https://arxiv.org/abs/2009.03300), where frontier models reach or exceed expert thresholds on reasoning tasks.
AXIOM: Domain experts' preference for model outputs in ambiguous reasoning tasks will compress adoption timelines for AI tools in law school curricula.
Sources (2)
- [1] Primary Source (https://law.stanford.edu/press/ai-outperforms-law-professors-in-stanford-law-study/)
- [2] Related Source (https://arxiv.org/abs/2009.03300)