The Illusion of the Illusion of Thinking: A Comment on Shojaee et al. (2025)
Overview
Paper Summary
The paper argues that the "accuracy collapse" reported by Shojaee et al. (2025) for Large Reasoning Models on complex planning puzzles is due to experimental design limitations, specifically output token limits and unsolvable problem instances. The authors suggest that, given alternative representations that bypass these limitations, models can solve tasks previously deemed too complex.
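To make the output-limit argument concrete, here is a minimal sketch. It takes Tower of Hanoi as an illustrative planning puzzle, which is an assumption because this summary does not name the puzzles. The token budget and the tokens-per-move figure are hypothetical values chosen only for illustration. The optimal solution for n disks needs 2^n - 1 moves, so a full move list grows exponentially with n, while a short generator function describes the same solution in a few lines.

```python
def hanoi_moves(n, src="A", aux="B", dst="C"):
    """Yield the optimal move sequence for n disks as (disk, from_peg, to_peg)."""
    if n == 0:
        return
    # Move the n-1 smaller disks out of the way, onto the auxiliary peg.
    yield from hanoi_moves(n - 1, src, dst, aux)
    # Move the largest disk to its destination.
    yield (n, src, dst)
    # Move the n-1 smaller disks from the auxiliary peg onto the largest disk.
    yield from hanoi_moves(n - 1, aux, src, dst)


def min_moves(n):
    """Minimum number of moves for n disks: 2^n - 1."""
    return 2 ** n - 1


def max_disks_within_budget(token_budget, tokens_per_move):
    """Largest n whose full move list fits in the output budget.

    Both arguments are hypothetical inputs, not figures from the paper.
    """
    n = 0
    while min_moves(n + 1) * tokens_per_move <= token_budget:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical numbers: a 64,000-token output budget, 5 tokens per move.
    print(max_disks_within_budget(64_000, 5))
    print(len(list(hanoi_moves(10))), min_moves(10))
```

Under these assumed numbers, a model that must print every move runs out of output budget at a fixed puzzle size regardless of whether it knows the algorithm, whereas the generator above stays the same length for any n.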
Explain Like I'm Five
Scientists found that when AI seemed to fail hard puzzles, it was often because the test was unfair or the puzzles were impossible. When the tests were made fairer, the AI could solve them after all!
Possible Conflicts of Interest
The authors are affiliated with Anthropic and Open Philanthropy, which may have interests in promoting positive views of AI capabilities. However, the critique primarily addresses methodological concerns, reducing the likelihood of significant bias.
Identified Limitations
Rating Explanation
The paper provides a valuable critique of experimental design in AI research, highlighting the importance of considering output constraints. However, its lack of novel findings and its reliance on preliminary tests and anecdotal evidence limit its impact.