Paper Summary
Paperzilla title
LLMs Think in Shapes: Do Words Have Geometry?
This paper proposes a theory that large language models (LLMs) represent features as manifolds: geometric shapes embedded in the model's internal representation space. The authors suggest that cosine similarity between representations reflects the distance between features on these manifolds, and offer supporting evidence from analyses of text embeddings and activations of models such as GPT-2 small and OpenAI's text-embedding-3-large.
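The core quantity in this claim is cosine similarity between representation vectors. As a minimal sketch (not the paper's actual methodology), the toy vectors below stand in for model activations and show the basic intuition: a small perturbation of a vector stays more cosine-similar to it than an unrelated direction does.

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two representation vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy vectors standing in for a model's internal representations
# (hypothetical data, not activations from any real model).
rng = np.random.default_rng(0)
base = rng.normal(size=8)
near = base + 0.1 * rng.normal(size=8)  # small perturbation: a "nearby" feature
far = rng.normal(size=8)                # an unrelated random direction

print(cosine_similarity(base, near))  # close to 1
print(cosine_similarity(base, far))   # close to 0 in expectation
```

The paper's stronger claim is that such similarities track distances along a feature manifold, not merely that similar vectors point the same way.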
Possible Conflicts of Interest
None identified
Identified Weaknesses
Limited Generalizability
The study focuses on a few specific models and datasets, which limits how far the findings generalize to other LLMs and domains.
Difficulty Proving Isometry
Demonstrating a perfect correspondence (isometry) between cosine similarity and feature distance is difficult because of noise and the complexity of semantic similarity, which weakens the robustness of the proposed framework.
Manual Metric Selection
The approach relies on manually choosing a metric space for each feature, which makes it hard to scale to complex features. An automated metric-learning method remains unexplored.
Simplified Feature Representation
Reducing complex features like "years" or "colors" to simple metric spaces may be an oversimplification of how LLMs actually represent them. The true representation might be much richer.
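To make the weaknesses above concrete: testing the manifold hypothesis for a feature like "years" amounts to choosing a metric (e.g., absolute year difference) and checking how well cosine distances between representations track it. The sketch below fabricates embeddings lying on a smooth curve, standing in for activations one would actually extract from a model; the metric choice and the embeddings are both illustrative assumptions.

```python
import numpy as np

# Hypothetical feature: the years 2000-2009, each with an embedding.
# These points on a 1-D curve in R^3 are fabricated stand-ins for
# real model activations.
years = np.arange(2000, 2010)
t = (years - years.min()) / (years.max() - years.min())
embeddings = np.stack([np.cos(t), np.sin(t), t], axis=1)

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Compare pairwise cosine distances with the chosen metric |year_i - year_j|.
cos_d, year_d = [], []
for i in range(len(years)):
    for j in range(i + 1, len(years)):
        cos_d.append(cosine_distance(embeddings[i], embeddings[j]))
        year_d.append(abs(int(years[i]) - int(years[j])))

r = np.corrcoef(cos_d, year_d)[0, 1]
print(f"correlation between cosine distance and year distance: {r:.2f}")
```

A high correlation on this toy curve is easy to manufacture; the paper's reviewers' point is that for real features, both picking the right metric and showing the correspondence survives noise are open problems.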
Rating Explanation
This paper presents a novel and interesting theoretical framework for understanding feature representation in LLMs. While the empirical validation is preliminary and faces some methodological challenges, the proposed concepts and hypotheses offer a valuable starting point for future research in mechanistic interpretability. The limitations regarding generalizability, difficulty proving isometry, manual metric selection, and potential oversimplification are significant, but do not negate the value of the theoretical contribution, warranting a rating of 4.
Good to know
This is our free standard analysis. Paperzilla Pro fact-checks every citation, researches author backgrounds and funding sources, and uses advanced AI reasoning for more thorough insights.
File Information
Original Title:
The Origins of Representation Manifolds in Large Language Models
Uploaded:
September 16, 2025 at 06:11 PM
© 2025 Paperzilla. All rights reserved.