Theoretical Simplifications
The theoretical analysis relies on simplifying assumptions (e.g., a single data point, and w(t)=1 when deriving the effective target). The authors acknowledge that these assumptions are unrealistic for real-world data, which limits the direct applicability of the error bounds.
The derived error bounds are orders of magnitude larger than the typical magnitude of the generated data, and they are not mathematically strict for the custom uEDM model. They are useful for intuition but are not precise quantitative predictors.
Reimplementation Discrepancies
The authors note that their reimplementations of some models (iCT, EDM on FFHQ-64) did not fully reproduce the originally reported results. This may slightly affect the quantitative comparisons, though the authors assert that the overall trends remain meaningful.
Limited Hyperparameter Tuning
Hyperparameters were tuned primarily for noise-conditional models and then applied unchanged to the noise-unconditional variants. Further tuning for the noise-unconditional models could yield better performance, which suggests the reported improvements may be conservative.
Although results were extended to ImageNet and FFHQ, the core experimental findings and the competitive uEDM results were demonstrated primarily on CIFAR-10, which is lower in resolution and less complex than the benchmarks typically used for state-of-the-art generative models.