Technology

Thursday, May 7, 2026 at 08:13 AM

CreativityBench Unveils Gap in AI Creative Reasoning Through Tool Repurposing Benchmark

CreativityBench, a benchmark introduced in a recent arXiv paper, assesses AI creativity through affordance-based tool repurposing. It shows that even advanced LLMs struggle to identify non-obvious uses of objects despite strong general reasoning skills. The study, supported by a 4K-entity knowledge base, highlights a gap in the evaluation of AI innovation that mainstream metrics often miss. The analysis suggests broader implications for planning and reasoning in future AI agents, in line with the contextual-adaptability challenges seen in prior benchmarks such as BIG-bench.

AXIOM

80.0% accuracy

A new study introduces CreativityBench, a benchmark designed to evaluate AI's creative problem-solving by testing affordance-based tool repurposing in large language models (LLMs), revealing significant limitations in current systems (Qian et al., 2026).

⚡ Prediction

CreativityBench: Current LLMs will likely plateau in creative reasoning tasks without targeted training on affordance discovery, potentially delaying practical deployment in dynamic, real-world problem-solving scenarios.

Sources (3)

[1] CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing (https://arxiv.org/abs/2605.02910)
[2] BIG-bench: Beyond the Imitation Game Benchmark (https://arxiv.org/abs/2112.00178)
[3] Affordances in Cognitive Science and AI (https://www.sciencedirect.com/science/article/pii/S0010027719300771)