arXiv:2403.13378 [cs.CV]

IIDM: Image-to-Image Diffusion Model for Semantic Image Synthesis

Feng Liu, Xiaobin Chang

Published 2024-03-20 (Version 1)

Semantic image synthesis aims to generate high-quality images given semantic conditions, i.e., segmentation masks and style reference images. Existing methods widely adopt generative adversarial networks (GANs), which take all conditional inputs and synthesize an image directly in a single forward pass. In this paper, semantic image synthesis is instead treated as an image denoising task and handled with a novel image-to-image diffusion model (IIDM). Specifically, the style reference is first contaminated with random noise and then progressively denoised by IIDM, guided by the segmentation mask. Moreover, three techniques, namely refinement, color transfer, and model ensemble, are proposed to further boost generation quality. They are plug-in inference modules and require no additional training. Extensive experiments show that IIDM outperforms existing state-of-the-art methods by clear margins, and further analysis is provided via detailed demonstrations. We have implemented IIDM in the Jittor framework; code is available at https://github.com/ader47/jittor-jieke-semantic_images_synthesis.
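As a rough illustration of the denoise-from-a-noised-style-image idea described above, the sketch below runs a generic DDPM-style sampling loop that starts from a partially noised style reference instead of pure noise. The `denoiser` stub, the linear beta schedule, and the starting step `t_start` are illustrative assumptions, not the paper's actual configuration; the real Jittor implementation is in the linked repository.

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Generic linear DDPM noise schedule (an assumption, not IIDM's exact one)."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def denoiser(x_t, t, mask):
    """Stand-in for the trained IIDM network: predicts the noise in x_t,
    conditioned on the timestep and the segmentation mask. Returns zeros
    here only so the sketch runs end to end."""
    return np.zeros_like(x_t)

def iidm_sample(style_img, mask, t_start=250, T=1000, rng=None):
    rng = rng or np.random.default_rng(0)
    betas, alphas, alpha_bars = make_schedule(T)

    # 1) Contaminate the style reference with noise up to step t_start:
    #    x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps
    eps = rng.standard_normal(style_img.shape)
    x = (np.sqrt(alpha_bars[t_start]) * style_img
         + np.sqrt(1.0 - alpha_bars[t_start]) * eps)

    # 2) Progressively denoise, guided by the segmentation mask.
    for t in range(t_start, -1, -1):
        eps_hat = denoiser(x, t, mask)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean
    return x

# Usage: a random "style image" and a one-hot segmentation mask.
rng = np.random.default_rng(42)
style = rng.standard_normal((3, 256, 256))
mask = np.zeros((20, 256, 256)); mask[0] = 1.0
out = iidm_sample(style, mask, rng=rng)
```

Note that starting from a noised style image rather than pure Gaussian noise is what lets the style reference steer the result; the plug-in refinement, color-transfer, and model-ensemble steps the abstract mentions would wrap around this loop at inference time without any retraining.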

Related articles:
arXiv:2209.07547 [cs.CV] (Published 2022-09-15)
One-Shot Synthesis of Images and Segmentation Masks
arXiv:2408.03304 [cs.CV] (Published 2024-08-06)
Fusing Forces: Deep-Human-Guided Refinement of Segmentation Masks
arXiv:2208.03142 [cs.CV] (Published 2022-08-05)
BoxShrink: From Bounding Boxes to Segmentation Masks