arXiv:2308.10040 [cs.CV]

ControlCom: Controllable Image Composition using Diffusion Model

Bo Zhang, Yuxuan Duan, Jun Lan, Yan Hong, Huijia Zhu, Weiqiang Wang, Li Niu

Published 2023-08-19 (Version 1)

Image composition aims to synthesize a realistic composite image from a pair of foreground and background images. Recently, generative composition methods have been built on large pretrained diffusion models, given their great potential in image generation. However, they suffer from a lack of controllability over foreground attributes and poor preservation of foreground identity. To address these challenges, we propose a controllable image composition method that unifies four tasks in one diffusion model: image blending, image harmonization, view synthesis, and generative composition. In addition, we design a self-supervised training framework coupled with a tailored pipeline for training data preparation. Moreover, we propose a local enhancement module that enhances foreground details within the diffusion model, improving the foreground fidelity of composite images. The proposed method is evaluated on both a public benchmark and real-world data, demonstrating that it generates more faithful and controllable composite images than existing approaches. The code and model will be available at https://github.com/bcmi/ControlCom-Image-Composition.
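The abstract's unification of four tasks in a single diffusion model suggests some form of task-indicator conditioning. Below is a minimal, hypothetical sketch of one way such conditioning could look: a two-bit indicator (one flag for illumination adjustment, one for pose/view adjustment) is embedded and handed to the denoiser as a conditioning vector. The module name `TaskIndicatorEmbedding`, the flag semantics, and the injection point are all assumptions for illustration, not the authors' published API.

import torch
import torch.nn as nn

# Hypothetical sketch: a 2-bit indicator selects among the four unified tasks.
# Assumed mapping (illumination_flag, pose_flag):
#   (0,0) = image blending        (1,0) = image harmonization
#   (0,1) = view synthesis        (1,1) = generative composition
# All names below are illustrative, not the authors' actual implementation.

class TaskIndicatorEmbedding(nn.Module):
    """Embed a binary task indicator into a conditioning vector."""

    def __init__(self, dim: int = 256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(2, dim),
            nn.SiLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, indicator: torch.Tensor) -> torch.Tensor:
        # indicator: (batch, 2) float tensor of {0, 1} flags
        return self.proj(indicator)

# Usage: request "harmonization" (adjust illumination, keep pose).
embed = TaskIndicatorEmbedding(dim=256)
indicator = torch.tensor([[1.0, 0.0]])  # illumination on, pose off
cond = embed(indicator)                 # (1, 256) conditioning vector
# `cond` could be added to the diffusion timestep embedding or injected via
# cross-attention in the U-Net; the exact injection point is an assumption.

Under this reading, switching tasks at inference time amounts to flipping the indicator bits rather than swapping models, which is what makes the composition "controllable" in the sense the abstract describes.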

Related articles:
arXiv:1909.08269 [cs.CV] (Published 2019-09-18)
Exploring Reciprocal Attention for Salient Object Detection by Cooperative Learning
arXiv:2109.08809 [cs.CV] (Published 2021-09-18)
HYouTube: Video Harmonization Dataset
arXiv:2009.09169 [cs.CV] (Published 2020-09-19)
BargainNet: Background-Guided Domain Translation for Image Harmonization