arXiv:2311.09753 [cs.CV]

DIFFNAT: Improving Diffusion Image Quality Using Natural Image Statistics

Aniket Roy, Maitreya Suin, Anshul Shah, Ketul Shah, Jiang Liu, Rama Chellappa

Published 2023-11-16, Version 1

Diffusion models have advanced generative AI significantly in terms of editing and creating naturalistic images. However, efficiently improving generated image quality is still of paramount interest. In this context, we propose a generic "naturalness" preserving loss function, viz., kurtosis concentration (KC) loss, which can be readily applied to any standard diffusion model pipeline to elevate the image quality. Our motivation stems from the projected kurtosis concentration property of natural images, which states that natural images have nearly constant kurtosis values across different band-pass versions of the image. To retain the "naturalness" of the generated images, we minimize the gap between the highest and lowest kurtosis values across the band-pass versions (obtained, e.g., with the Discrete Wavelet Transform (DWT)) of images. Note that our approach does not require any additional guidance, such as classifier or classifier-free guidance, to improve the image quality. We validate the proposed approach on three diverse tasks, viz., (1) personalized few-shot finetuning using text guidance, (2) unconditional image generation, and (3) image super-resolution. Integrating the proposed KC loss improves the perceptual quality across all these tasks, as measured by FID, MUSIQ score, and user evaluation.
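The quantity described in the abstract, the gap between the highest and lowest kurtosis across band-pass versions of an image, can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: it uses a single-level Haar DWT and only its three detail sub-bands, and the function names (haar_dwt2, kurtosis, kc_loss) are hypothetical. The choice and number of filters, and the differentiable version needed for actual training, may well differ in the paper.

```python
import numpy as np


def haar_dwt2(image):
    """Single-level 2D Haar DWT (assumed filter choice).

    Returns the approximation band followed by three detail (band-pass) bands.
    """
    x = np.asarray(image, dtype=np.float64)
    # Crop to even height and width so pixels group into 2x2 blocks.
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    approx = (a + b + c + d) / 2.0
    detail_1 = (a - b + c - d) / 2.0  # differences between columns
    detail_2 = (a + b - c - d) / 2.0  # differences between rows
    detail_3 = (a - b - c + d) / 2.0  # diagonal differences
    return approx, detail_1, detail_2, detail_3


def kurtosis(values, eps=1e-12):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    v = np.ravel(values).astype(np.float64)
    centered = v - v.mean()
    variance = np.mean(centered ** 2)
    # eps guards against division by zero for constant bands.
    return float(np.mean(centered ** 4) / (variance ** 2 + eps))


def kc_loss(image):
    """Gap between the largest and smallest kurtosis over the detail bands."""
    _, *details = haar_dwt2(image)
    k = [kurtosis(band) for band in details]
    return max(k) - min(k)


# Usage: evaluate the gap on a synthetic grayscale image.
rng = np.random.default_rng(0)
print(kc_loss(rng.normal(size=(64, 64))))
```

A smaller value means the kurtosis values of the band-pass versions are more tightly concentrated, which is the property the abstract attributes to natural images. In a training pipeline this term would be computed on the generated image with a differentiable framework and added to the usual diffusion objective; the weighting between the two is not stated in the abstract.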

Related articles:
arXiv:1304.0023 [cs.CV] (Published 2013-03-29, updated 2014-09-23)
The two-dimensional Gabor function adapted to natural image statistics: An analytical model of simple-cell responses in the early visual system
arXiv:2208.03961 [cs.CV] (Published 2022-08-08)
Sampling Based On Natural Image Statistics Improves Local Surrogate Explainers
arXiv:1412.2697 [cs.CV] (Published 2014-12-08)
Image quality assessment measure based on natural image statistics in the Tetrolet domain