LLM Reasoning Is Primarily Latent, Not Surface CoT
A position paper finds the strongest evidence for latent-state trajectories as the core of LLM reasoning, urging the field to de-emphasize surface CoT and to prioritize architecture over prompting.
LLM reasoning is primarily mediated by latent-state trajectories rather than explicit chain-of-thought, according to the position paper arXiv:2604.15726. The work separates three variables, surface traces, latent states, and serial compute, then reorganizes empirical results from Wei et al. 2022, Chain-of-Thought Prompting (arxiv.org/abs/2201.11903), and Elhage et al. 2022, Toy Models of Superposition (transformer-circuits.pub/2022/toy-model), to show that H1 holds over H2 and H0 in audited exemplars. The original CoT literature and its coverage confounded increased inference compute with faithful surface reasoning, missing that latent interventions can alter outcomes without changing the generated tokens. Mechanistic surveys indicate that superposition in activations encodes reasoning paths that are invisible in text outputs. The recommendations shift the default object of study to latent dynamics, require benchmark designs that factorize the three variables, and point to an architectural focus on latent capacity over prompt engineering.
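The claim that latent interventions can alter outcomes without changing generated tokens can be illustrated with a toy sketch. This is not the paper's method: it is a hypothetical two-layer model (weights `W_in`, `W_out` are made up) whose surface CoT string is fixed while its answer is computed from a latent vector, so patching the latent flips the answer while the emitted trace stays identical.

```python
# Toy illustration (assumed setup, not from arXiv:2604.15726): the surface
# CoT is decoupled from the latent state that actually determines the answer.
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(4, 8))   # hypothetical input -> latent weights
W_out = rng.normal(size=(8, 2))  # hypothetical latent -> answer weights

def forward(x, latent_patch=None):
    """Return (surface_cot, answer); the CoT text never reads the latent."""
    latent = np.tanh(x @ W_in)
    if latent_patch is not None:
        latent = latent_patch                 # intervene on the latent state
    cot = "step 1: inspect input; step 2: decide"  # fixed surface trace
    answer = int(np.argmax(latent @ W_out))
    return cot, answer

x = rng.normal(size=4)
cot_a, ans_a = forward(x)
cot_b, ans_b = forward(x, latent_patch=-np.tanh(x @ W_in))  # flip the latent
# cot_a == cot_b, yet ans_a != ans_b: the outcome changed under a latent
# intervention while the generated tokens did not.
```

With two output logits, negating the latent reverses their ordering, so the answer flips deterministically while the surface trace is byte-for-byte unchanged, which is exactly the confound the paper's factorized benchmark designs are meant to expose.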
AXIOM: LLM reasoning forms mainly inside latent states, while chain-of-thought is often a byproduct; this favors architectural upgrades to latent processing over further prompt-engineering refinements.
Sources (3)
- [1] LLM Reasoning Is Latent, Not the Chain of Thought (https://arxiv.org/abs/2604.15726)
- [2] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (https://arxiv.org/abs/2201.11903)
- [3] Toy Models of Superposition (https://transformer-circuits.pub/2022/toy-model/index.html)