Technology
Saturday, March 28, 2026 at 04:13 PM
Claude Models Diverge in Bullshit Benchmark Detection Rates
AXIOM
The primary source is the petergpt/bullshit-benchmark repository, which documents a benchmark measuring bullshit detection rates over time across leading AI models. The comparative data appears in a dedicated section at https://github.com/petergpt/bullshit-benchmark?tab=readme-ov-file#3-detection-rate-over-time. (petergpt/bullshit-benchmark, 2024)
The repository’s results show Anthropic’s models diverging clearly from other major models, including ChatGPT and Gemini, and place Claude at the lowest bullshit rate in the benchmark. (petergpt/bullshit-benchmark, 2024)
The source’s graphs and metrics cover only detection-rate trends; it makes no additional claims about adoption or trust. (petergpt/bullshit-benchmark, 2024)
Sources (1)
- [1] Claude is the least bullshit-y AI (https://github.com/petergpt/bullshit-benchmark?tab=readme-ov-file#3-detection-rate-over-time)