Evolving Generative Engines
GEO methods may require continuous adaptation as generative engines (GEs) evolve, much as traditional SEO had to adapt to search engine changes. Strategies that are optimal today may therefore become outdated.
GEO-BENCH, while diverse, will need continual updates as real-world queries evolve; without them, the long-term applicability and relevance of current findings may erode.
Search Ranking vs. Impression Evaluation
The study measures content visibility and impression within GE responses but does not evaluate how GEO methods affect traditional search rankings. Although text-level changes are less likely to influence metadata-driven rankings, this remains an unaddressed aspect of overall website performance.
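To make the impression side of this distinction concrete, below is a minimal sketch of one way an impression-style metric can be computed over a GE response, assuming sentence-level citations. It credits a source with the normalized word count of the sentences citing it, with an optional positional decay so earlier sentences count more. The function name, data layout, and exponential decay form are illustrative assumptions, not the paper's exact definition.

```python
import math

def impression(response_sentences, source_id, position_decay=False):
    """Share of a GE response attributable to `source_id`.

    `response_sentences` is a list of (sentence_text, cited_source_id)
    pairs. The metric is the word count of sentences citing the source,
    normalized by the total word count, optionally down-weighted by
    sentence position (earlier sentences count more).
    """
    n = len(response_sentences)
    total, credited = 0.0, 0.0
    for pos, (text, cited) in enumerate(response_sentences):
        words = len(text.split())
        # Exponential decay over normalized position: a hypothetical choice.
        weight = math.exp(-pos / n) if position_decay else 1.0
        total += words * weight
        if cited == source_id:
            credited += words * weight
    return credited / total if total else 0.0

# Example: source "A" is cited early, sources "B" later.
resp = [("GEO boosts cited visibility.", "A"),
        ("Engines synthesize multiple sources.", "B"),
        ("Rankings differ from impressions.", "B")]
print(impression(resp, "A"))                       # plain word-count share
print(impression(resp, "A", position_decay=True))  # position-adjusted share
```

Note that a metric of this kind says nothing about where the cited page ranks in a conventional results list, which is exactly the gap described above.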
Reliance on Future LLM Capabilities
The work assumes that future generative engines will ingest more sources as language-model context lengths grow. This future-oriented assumption may not reflect current limitations, and some findings could lose relevance if it fails to hold.
Subjectivity in Benchmarking
GEO-BENCH query tags are assigned with GPT-4 and then manually verified; the authors acknowledge that subjective interpretation and labeling errors can introduce discrepancies, which could affect the benchmark's ultimate reliability.
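One way such discrepancies could be audited is with a chance-corrected agreement statistic between the GPT-4 tags and the human verification pass. The sketch below computes Cohen's kappa; the category names and label data are hypothetical, and this audit is not something the paper reports.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Expected agreement if both raters labeled independently at random
    # according to their own marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical tags for five queries: GPT-4 vs. manual verification.
gpt4  = ["debate", "health", "history", "science", "debate"]
human = ["debate", "health", "opinion", "science", "debate"]
print(cohens_kappa(gpt4, human))  # 1.0 = perfect, 0.0 = chance-level
```

A kappa well below 1.0 on a held-out audit set would signal that tag-conditioned results should be read with caution.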
Subset Evaluation for Key Experiments
The evaluation of GEO methods on Perplexity.ai (a real-world GE) and the analysis of combined GEO strategies were each conducted on a 200-sample subset rather than the full GEO-BENCH, which may limit the generalizability of those specific findings.
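To illustrate why 200 samples leave nontrivial uncertainty, the sketch below computes a percentile-bootstrap confidence interval for a mean per-query improvement. The improvement values are simulated for illustration; they are not the paper's numbers.

```python
import random

def bootstrap_ci(values, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        sum(rng.choice(values) for _ in range(n)) / n
        for _ in range(n_boot)
    )
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Simulated per-query visibility improvements for a 200-sample subset
# (synthetic data; the real per-query numbers are not reported here).
rng = random.Random(42)
improvements = [rng.gauss(0.04, 0.15) for _ in range(200)]
mean = sum(improvements) / 200
print(f"mean improvement: {mean:.3f}, 95% CI: {bootstrap_ci(improvements)}")
```

With noisy per-query effects, a 200-sample interval can remain wide enough that conclusions drawn from the subset may not transfer cleanly to the full benchmark.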