LLM Agents Model Counterparties but Fail to Convert Inferences into Strategic Bargaining Gains
LLMs accurately model negotiation preferences but do not reliably use that information to secure better outcomes, as agreements remain anchored to initial offers rather than utility structures.
The paper "Counterparty Modeling is Not Strategy: The Limits of LLM Negotiators" (Cosentino et al., arXiv:2605.16575, 2026) shows that agents in multi-attribute bargaining accurately infer counterparty preferences from early reasoning traces, yet rarely link those inferences to advantageous counteroffers on their own high-value items. Turn-level data indicate that sellers accommodate more readily overall, and that informed parties in asymmetric settings concede on weakly compensated dimensions. Final agreements track surface opening anchors rather than revealed utility weights.
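To make the gap between anchor-driven and weight-driven agreements concrete, the following is a minimal sketch of a two-issue bargain. It is not the paper's setup: the issue names, the utility weights, the opening offers, and the helper functions (`utility`, `anchor_midpoint`, `weight_aware_trade`) are all hypothetical, and the additive utility model is an assumption chosen for simplicity.

```python
# Hypothetical two-issue bargain. All weights and offers are illustrative
# assumptions, not values taken from the paper.
ISSUES = ["price", "delivery"]

# Each party's utility weights over the issues (each set sums to 1).
BUYER_WEIGHTS = {"price": 0.8, "delivery": 0.2}
SELLER_WEIGHTS = {"price": 0.3, "delivery": 0.7}


def utility(weights, shares):
    """Weighted additive utility; shares[i] is this party's share of issue i in [0, 1]."""
    return sum(weights[i] * shares[i] for i in ISSUES)


def seller_shares(buyer_shares):
    """Each issue is split between the two parties, so the seller gets the remainder."""
    return {i: 1.0 - buyer_shares[i] for i in ISSUES}


def anchor_midpoint(buyer_open, seller_open):
    """Anchor-driven outcome: split the difference between opening offers on every issue."""
    return {i: (buyer_open[i] + seller_open[i]) / 2 for i in ISSUES}


def weight_aware_trade(buyer_weights, seller_weights):
    """Simple logroll: award each issue fully to whichever party weights it more."""
    return {i: 1.0 if buyer_weights[i] > seller_weights[i] else 0.0 for i in ISSUES}


# Opening offers, expressed as the buyer's share of each issue (assumed values).
buyer_open = {"price": 0.9, "delivery": 0.9}
seller_open = {"price": 0.1, "delivery": 0.1}

for label, deal in [
    ("anchor midpoint", anchor_midpoint(buyer_open, seller_open)),
    ("weight-aware trade", weight_aware_trade(BUYER_WEIGHTS, SELLER_WEIGHTS)),
]:
    buyer_u = utility(BUYER_WEIGHTS, deal)
    seller_u = utility(SELLER_WEIGHTS, seller_shares(deal))
    print(f"{label}: buyer={buyer_u:.2f}, seller={seller_u:.2f}")
```

Under these assumed numbers, splitting the difference leaves both parties worse off than trading each issue to the side that values it more. An agent that has inferred the counterparty's weights but still converges on the midpoint is leaving that value unclaimed.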
Explicit instructions to state concession-for-reciprocity trades before offers produce turns that superficially resemble strategic reasoning, but aggregate efficiency of reached deals remains unchanged. This pattern holds across tested LLM agents and points to a separation between preference modeling and the iterative optimization required for advantage.
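One way to read "aggregate efficiency" is the share of the maximum achievable joint utility that a reached deal captures. The sketch below illustrates that reading only; the paper's actual metric is not specified here, and the issue names, valuations, and function names are hypothetical.

```python
from itertools import product

# Hypothetical setup: each issue is resolved wholly in favor of the buyer ("B")
# or the seller ("S"). Valuations are illustrative assumptions, not paper data.
BUYER_VALUES = {"price": 6, "warranty": 3, "delivery": 1}
SELLER_VALUES = {"price": 2, "warranty": 4, "delivery": 5}
ISSUES = list(BUYER_VALUES)


def joint_utility(deal):
    """Sum of the value each issue yields to the party it was resolved for."""
    return sum(
        BUYER_VALUES[issue] if winner == "B" else SELLER_VALUES[issue]
        for issue, winner in deal.items()
    )


def max_joint_utility():
    """Brute-force the best joint utility over every possible deal."""
    return max(
        joint_utility(dict(zip(ISSUES, winners)))
        for winners in product("BS", repeat=len(ISSUES))
    )


def efficiency(deal):
    """Fraction of the maximum joint utility that a reached deal captures."""
    return joint_utility(deal) / max_joint_utility()


# A deal that ignores who values what, versus one that matches issues to values.
mismatched = {"price": "S", "warranty": "B", "delivery": "S"}
matched = {"price": "B", "warranty": "S", "delivery": "S"}
print(f"mismatched deal efficiency: {efficiency(mismatched):.2f}")
print(f"matched deal efficiency: {efficiency(matched):.2f}")
```

Under this definition, a prompt that makes turns sound more strategic would only count as an improvement if the efficiency of the final deals rose, which is the effect the paper reports as absent.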
Related analyses of LLM performance in sequential decision tasks (e.g., arXiv:2402.01092 on multi-turn game play) similarly find accurate state tracking without consistent exploitation of payoff asymmetries, reinforcing that current architectures prioritize local response coherence over global utility maximization.
LLM Negotiator: Agents track counterparty utilities yet default to anchor-driven concessions instead of optimizing across turns.
Sources (2)
- [1] Primary Source (https://arxiv.org/abs/2605.16575)
- [2] Related Source (https://arxiv.org/abs/2402.01092)