Limited external validation
Although the dataset used for training and validation is large, the external test set is smaller, so generalizability to more diverse datasets is limited.
The study relies heavily on deep learning, which can act as a black box and may not be easily interpretable for clinical decision-making.
Computational cost analysis
The paper does not provide any analysis of the computational resources required for training and running the AI system, which may impact its practical applicability in clinical settings.
The study claims full automation, but expert review and occasional corrections were still needed, which implies that the system is not fully automatic in practice.
The comparison with other deep-learning methods is not entirely fair, because those methods were retrained on datasets different from the ones used in their original publications. In addition, existing state-of-the-art methods for ROI generation and localization could have been included in the comparison but were excluded.
Reduced performance with metal implants
The AI system performs slightly worse on cases with metal implants, which are common in dental practice, raising concerns about its robustness in real-world scenarios.