{ "id": "2211.09788", "version": "v1", "published": "2022-11-17T18:56:19.000Z", "updated": "2022-11-17T18:56:19.000Z", "title": "DiffusionDet: Diffusion Model for Object Detection", "authors": [ "Shoufa Chen", "Peize Sun", "Yibing Song", "Ping Luo" ], "comment": "Tech report. Code is available at https://github.com/ShoufaChen/DiffusionDet", "categories": [ "cs.CV" ], "abstract": "We propose DiffusionDet, a new framework that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. During the training stage, object boxes diffuse from ground-truth boxes to a random distribution, and the model learns to reverse this noising process. At inference, the model progressively refines a set of randomly generated boxes into the output results. Extensive evaluations on standard benchmarks, including MS-COCO and LVIS, show that DiffusionDet achieves favorable performance compared to previous well-established detectors. Our work brings two important findings in object detection. First, random boxes, although drastically different from pre-defined anchors or learned queries, are also effective object candidates. Second, object detection, one of the representative perception tasks, can be solved in a generative way. Our code is available at https://github.com/ShoufaChen/DiffusionDet.", "revisions": [ { "version": "v1", "updated": "2022-11-17T18:56:19.000Z" } ], "analyses": { "keywords": [ "diffusion model", "formulates object detection", "object boxes diffuse", "diffusiondet achieves favorable performance", "effective object candidates" ], "tags": [ "github project" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }