Paper Summary
Paperzilla title
LLMs Not Dumb, Just Too Wordy? Study Finds AI Can Solve Puzzles If We Don't Make Them Write Novels About It
The paper argues that a previous study's findings of "accuracy collapse" in Large Reasoning Models on complex planning puzzles are due to experimental design limitations, specifically output token limits and unsolvable problem instances. By using alternative representations that bypass these limitations, the authors suggest that models can solve tasks previously deemed too complex.
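The output-limit argument can be made concrete with a rough calculation. The sketch below assumes Tower of Hanoi as the planning puzzle (the summary itself only says "complex planning puzzles"), and the token budget and tokens-per-move figures are hypothetical values chosen for illustration, not numbers taken from the paper. The only firm fact used is that an n-disk Tower of Hanoi needs at least 2^n - 1 moves.

```python
def min_hanoi_moves(disks: int) -> int:
    """Minimum number of moves needed to solve Tower of Hanoi with `disks` disks."""
    return 2 ** disks - 1


def largest_fully_listable(token_budget: int, tokens_per_move: int) -> int:
    """Largest disk count whose complete move list still fits in the token budget."""
    disks = 0
    while min_hanoi_moves(disks + 1) * tokens_per_move <= token_budget:
        disks += 1
    return disks


# Hypothetical values for illustration only (not from the paper).
TOKEN_BUDGET = 64_000
TOKENS_PER_MOVE = 10

if __name__ == "__main__":
    for n in (5, 10, 15):
        needed = min_hanoi_moves(n) * TOKENS_PER_MOVE
        print(f"{n} disks: about {needed} output tokens for the full move list")
    limit = largest_fully_listable(TOKEN_BUDGET, TOKENS_PER_MOVE)
    print(f"Largest instance that fits in the budget: {limit} disks")
```

Under these assumptions, any instance above the computed limit cannot be written out in full regardless of whether the model knows the solution procedure, which is the kind of confound the comment attributes to the original evaluation.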
Possible Conflicts of Interest
The authors are affiliated with Anthropic and Open Philanthropy, which may have interests in promoting positive views of AI capabilities. However, the critique primarily addresses methodological concerns, reducing the likelihood of significant bias.
Identified Weaknesses
Lack of Novel Contribution
The paper primarily critiques a previous study's methodology, rather than presenting novel research. The focus is on demonstrating how experimental design flaws led to misinterpretations of LLM capabilities, specifically regarding output constraints and problem solvability.
The authors mention conducting "preliminary testing" with alternative representations, but the details are scarce. The sample size is acknowledged as insufficient for statistical significance, limiting the strength of their counter-arguments. More robust experimentation is needed to support their claims of restored performance.
Over-Reliance on Anecdotal Evidence
The paper relies heavily on anecdotal evidence, such as a tweet and model outputs in which the model states that it is aware of length limits. While illustrative, these examples do not constitute rigorous scientific proof.
Rating Explanation
The paper provides a valuable critique of experimental design in AI research, highlighting the importance of considering output constraints. However, its lack of novel findings and its reliance on preliminary tests and anecdotal evidence limit its impact.
File Information
Original Title:
The Illusion of the Illusion of Thinking: A Comment on Shojaee et al. (2025)
Uploaded:
July 08, 2025 at 12:15 PM