Technology
Saturday, March 28, 2026 at 11:13 PM

Claude Models Diverge in Bullshit Benchmark Detection Rates


The primary source is a repository documenting a bullshit benchmark that tracks detection rates over time across leading AI models. Its dedicated section at https://github.com/petergpt/bullshit-benchmark?tab=readme-ov-file#3-detection-rate-over-time presents the comparative data. (petergpt/bullshit-benchmark, 2024)

The repository's results show Anthropic’s models diverging clearly from other major models, including ChatGPT and Gemini, with Claude recording the lowest bullshit rate on the benchmark. (petergpt/bullshit-benchmark, 2024)

The source's graphs and metrics are limited to detection-rate trends; it makes no additional claims about adoption or trust. (petergpt/bullshit-benchmark, 2024)

Sources (1)

[1] Claude is the least bullshit-y AI (https://github.com/petergpt/bullshit-benchmark?tab=readme-ov-file#3-detection-rate-over-time)