Limited PDDL Feature Coverage
To simplify logical reasoning, the framework currently supports only a subset of PDDL features. This may limit its applicability to more complex, real-world planning problems, which often involve advanced features such as conditional effects, derived predicates, action costs, or temporal constraints.
Focus on Satisficing, Not Optimal Planning
The current work prioritizes finding any valid plan that achieves the goal (satisficing) rather than the shortest or most efficient plan (optimal). For many real-world applications, resource efficiency and optimality are critical considerations that are not addressed here.
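The distinction can be illustrated with a minimal sketch on a toy state graph. The graph, state names, and action labels below are hypothetical and are not taken from the evaluated domains. Depth-first search stands in for a satisficing planner, and breadth-first search stands in for an optimal planner under unit action costs.

```python
from collections import deque

# Hypothetical toy state graph: each state maps to (action, next_state) pairs.
GRAPH = {
    "s0": [("a", "s1"), ("b", "goal")],
    "s1": [("c", "s2")],
    "s2": [("d", "goal")],
    "goal": [],
}


def satisficing_plan(start, goal):
    """Depth-first search: returns the first valid plan it reaches."""
    stack = [(start, [])]
    visited = set()
    while stack:
        state, plan = stack.pop()
        if state == goal:
            return plan
        if state in visited:
            continue
        visited.add(state)
        # Push in reverse so the first listed action is explored first.
        for action, nxt in reversed(GRAPH[state]):
            stack.append((nxt, plan + [action]))
    return None


def optimal_plan(start, goal):
    """Breadth-first search: returns a shortest plan under unit action costs."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, plan = queue.popleft()
        if state == goal:
            return plan
        for action, nxt in GRAPH[state]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, plan + [action]))
    return None


print(satisficing_plan("s0", "goal"))  # ['a', 'c', 'd']
print(optimal_plan("s0", "goal"))      # ['b']
```

Both plans reach the goal and would pass validation, but the satisficing plan uses three actions where one suffices.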
Reliance on External Verifier
The approach currently relies on an external verifier (VAL) for ground-truth feedback and self-correction. While this is robust, the LLM does not verify its own plans, which may reduce autonomy and efficiency in deployment unless the verifier is seamlessly integrated.
Fixed Iteration Limits
The study uses a fixed iteration limit (η) for the feedback loops during CoT instruction tuning. A fixed limit may not suit all problem complexities; determining the number of iterations dynamically could improve both efficiency and performance.
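As a minimal sketch of what a dynamic limit could look like, the loop below keeps the fixed budget `eta` as an upper bound and adds an optional early stop when the verifier keeps returning the same feedback. The `generate` and `verify` callables, the `patience` parameter, and the stopping rule are all assumptions made for illustration; they are not the procedure used in this study.

```python
def refine_plan(problem, generate, verify, eta=3, patience=None):
    """Run a generate/verify loop for at most `eta` rounds.

    generate(problem, feedback) -> plan        (hypothetical interface)
    verify(plan) -> (is_valid, feedback)       (hypothetical interface)

    If `patience` is set, stop early once the verifier has repeated the
    same feedback `patience` times in a row, since further rounds are
    unlikely to help. Returns (plan or None, rounds used).
    """
    feedback = None
    repeats = 0
    for round_no in range(1, eta + 1):
        plan = generate(problem, feedback)
        is_valid, new_feedback = verify(plan)
        if is_valid:
            return plan, round_no
        # Count consecutive rounds in which the feedback did not change.
        repeats = repeats + 1 if new_feedback == feedback else 0
        feedback = new_feedback
        if patience is not None and repeats >= patience:
            return None, round_no
    return None, eta
```

With `patience=None` the loop reduces to the fixed-η behavior; with a small `patience`, problems on which the model is stuck consume fewer rounds than the full budget.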
Limited Domain Coverage in Evaluation
The empirical evaluation is conducted on three planning domains (Blocksworld, Mystery Blocksworld, Logistics). While these domains present varying challenges, a wider and more diverse set of planning domains would provide a more comprehensive assessment of the approach's generalizability.
High Resource Requirements
Fine-tuning large language models, especially with iterative feedback loops and detailed reasoning chains, requires significant financial, time, and computational resources, which can be a barrier to wider adoption and experimentation.