GoodPoint Trains LLMs on Author Responses for Actionable Paper Feedback
GoodPoint introduces a training method that teaches LLMs to produce targeted feedback on scientific papers by learning from author responses to reviewer comments, with the goal of making that feedback practically useful to authors.
The GoodPoint-ICLR dataset, curated from 19,000 examples of ICLR reviews and author responses, marks a shift toward author-centric evaluation of feedback validity and actionability: models are trained on which feedback actually led authors to take action (arXiv:2604.11924).
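To make the training signal concrete, here is a minimal Python sketch of how reviewer comments could be paired into preference data based on whether authors acted on them. The field names and labeling rule are assumptions for illustration, not the actual GoodPoint-ICLR schema.

```python
# Hypothetical sketch: turning review comments + author responses into
# (prompt, chosen, rejected) preference pairs. Field names are illustrative,
# not the GoodPoint-ICLR schema.
from dataclasses import dataclass

@dataclass
class FeedbackExample:
    paper_context: str      # excerpt of the submission the comment targets
    reviewer_comment: str   # the feedback itself
    author_response: str    # the author's rebuttal / revision note
    acted_on: bool          # did the authors change the paper in response?

def to_preference_pair(pos: FeedbackExample, neg: FeedbackExample) -> dict:
    """Pair one acted-on and one ignored comment on the same paper into a
    (prompt, chosen, rejected) triple for preference optimization."""
    assert pos.acted_on and not neg.acted_on
    return {
        "prompt": f"Give actionable feedback on:\n{pos.paper_context}",
        "chosen": pos.reviewer_comment,
        "rejected": neg.reviewer_comment,
    }
```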
Using fine-tuning and preference optimization on real and synthetic pairs, the resulting GoodPoint model, built on Qwen3-8B, improves predicted success rate by 83.7% and achieves state-of-the-art results among similarly sized LLMs, even outperforming Gemini-3-flash in the authors' automatic evaluations and human studies (arXiv:2604.11924).
Earlier work on LLM judges (Zheng et al., arXiv:2306.05685) and direct preference optimization (Rafailov et al., arXiv:2305.18290) provided the methodological foundations, but did not target scientific paper feedback loops or connect to the peer-review crisis; GoodPoint addresses this gap by emphasizing actionable outcomes as a way to accelerate scientific discovery.
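For reference, the DPO objective cited above can be written as a short PyTorch function over per-example sequence log-probabilities. This is a generic sketch of Rafailov et al.'s loss applied to acted-on vs. ignored feedback pairs, assuming a frozen reference model; it is not the GoodPoint training code.

```python
# Illustrative PyTorch sketch of the DPO objective (Rafailov et al.,
# arXiv:2305.18290); tensor names and beta value are assumptions.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss over a batch of preference pairs.

    Each tensor holds per-example sequence log-probabilities, e.g. for
    feedback the authors acted on (chosen) vs. feedback they ignored
    (rejected), under the policy and the frozen reference model.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Negative log-sigmoid of the reward margin; minimizing it pushes the
    # policy to assign relatively higher likelihood to acted-on feedback.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probs for a batch of 4 pairs.
if __name__ == "__main__":
    lp = lambda: torch.randn(4)
    print(dpo_loss(lp(), lp(), lp(), lp()).item())
```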
AXIOM: GoodPoint trains on whether authors actually act on reviewer comments, producing feedback that is both valid and useful in ways generic LLM critics miss.
Sources (3)
- [1] Primary Source (https://arxiv.org/abs/2604.11924)
- [2] Direct Preference Optimization (https://arxiv.org/abs/2305.18290)
- [3] Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena (https://arxiv.org/abs/2306.05685)