TL;DR: We generate 3D shapes by iteratively splitting bounding boxes from coarse to fine, enabling intuitive control over part granularity and interactive shape editing.
Human creativity follows a perceptual process, moving from abstract ideas to finer details during creation. While 3D generative models have advanced dramatically, models specifically designed to assist human imagination in 3D creation, particularly for detailing abstractions from coarse to fine, have not been explored. We propose a framework that enables intuitive and interactive 3D shape generation by iteratively splitting bounding boxes, so that the set of part boxes is refined step by step. The main technical components of our framework are two generative models: the box-splitting generative model and the box-to-shape generative model. The first model, named BoxSplitGen, generates a collection of 3D part bounding boxes with varying granularity by iteratively splitting coarse bounding boxes. The second model, the box-to-shape generative model, is trained by leveraging the 3D shape priors learned by an existing 3D diffusion model and adapting that model to incorporate bounding box conditioning.
Instead of generating all parts at once, we iteratively split a single bounding box into finer parts. This creates a natural hierarchy from coarse to fine, enabling exploration of the design space at every level.
Stage 1 (BoxSplitGen): A classifier decides which box to split, then a diffusion model generates the two child boxes.
Stage 2 (Box-to-Shape): A conditioned 3D diffusion model generates detailed meshes that fit within the boxes.
Users can stop at any granularity level. Want a general chair shape? Use fewer boxes. Need precise control over armrests and legs? Keep splitting for more detail.
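The control flow of the splitting stage can be sketched as a short loop. The sketch below is only an illustration under stated assumptions, not the BoxSplitGen implementation: the learned classifier is replaced by a "pick the largest box" heuristic, the diffusion model is replaced by a midpoint cut along the longest axis, and the names `Box`, `choose_box_to_split`, `split_box`, and `split_to_granularity` are hypothetical. It also assumes axis-aligned boxes whose children exactly tile the parent, which the learned model need not do.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its min and max corners (assumed representation)."""
    lo: Vec3
    hi: Vec3

    def size(self) -> Vec3:
        return tuple(h - l for l, h in zip(self.lo, self.hi))

    def volume(self) -> float:
        sx, sy, sz = self.size()
        return sx * sy * sz


def choose_box_to_split(boxes: List[Box]) -> int:
    """Stand-in for the learned classifier: pick the box with the largest volume."""
    return max(range(len(boxes)), key=lambda i: boxes[i].volume())


def split_box(box: Box) -> Tuple[Box, Box]:
    """Stand-in for the diffusion model: halve the box along its longest axis."""
    size = box.size()
    axis = max(range(3), key=lambda a: size[a])
    mid = (box.lo[axis] + box.hi[axis]) / 2.0
    first_hi = list(box.hi)
    first_hi[axis] = mid
    second_lo = list(box.lo)
    second_lo[axis] = mid
    return Box(box.lo, tuple(first_hi)), Box(tuple(second_lo), box.hi)


def split_to_granularity(root: Box, num_boxes: int) -> List[List[Box]]:
    """Return the box set at every level, from 1 box up to num_boxes boxes."""
    boxes = [root]
    levels = [list(boxes)]
    while len(boxes) < num_boxes:
        i = choose_box_to_split(boxes)
        child_a, child_b = split_box(boxes[i])
        boxes[i:i + 1] = [child_a, child_b]  # replace the parent with its two children
        levels.append(list(boxes))
    return levels


# Each level is a valid stopping point: a user could hand levels[1] (2 boxes)
# or levels[3] (4 boxes) to a box-conditioned shape generator.
levels = split_to_granularity(Box((0.0, 0.0, 0.0), (2.0, 1.0, 1.0)), num_boxes=4)
```

Keeping every intermediate level, rather than only the final box set, mirrors the coarse-to-fine hierarchy described above: stopping early simply means using an earlier entry of `levels`.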
Since shapes are defined by bounding boxes, users can directly manipulate parts: rotate a chair's backrest, stretch table legs, or resize airplane wings.
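Edits of this kind amount to simple geometric transforms on a part box before the shape is regenerated. The following is a minimal sketch, assuming a hypothetical `PartBox` parameterization (center, full extents, and a single tilt angle about the x axis); the box representation actually used by the model may differ, and `stretch`, `tilt`, and `corners` are illustrative helpers rather than part of any released code.

```python
import math
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class PartBox:
    center: Vec3
    size: Vec3             # full extents along the box's local x, y, z axes
    tilt_deg: float = 0.0  # rotation about the x axis (assumed parameterization)


def _rotate_x(p: Vec3, angle_deg: float) -> Vec3:
    """Rotate point p about the x axis through the origin."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    x, y, z = p
    return (x, y * c - z * s, y * s + z * c)


def stretch(box: PartBox, axis: int, factor: float) -> PartBox:
    """Scale the box along one local axis, keeping its center fixed."""
    size = list(box.size)
    size[axis] *= factor
    return replace(box, size=tuple(size))


def tilt(box: PartBox, angle_deg: float, pivot: Vec3) -> PartBox:
    """Rotate the box about an axis parallel to x that passes through pivot."""
    rel = tuple(c - p for c, p in zip(box.center, pivot))
    rot = _rotate_x(rel, angle_deg)
    center = tuple(r + p for r, p in zip(rot, pivot))
    return replace(box, center=center, tilt_deg=box.tilt_deg + angle_deg)


def corners(box: PartBox) -> List[Vec3]:
    """World-space corners of the (possibly tilted) box."""
    hx, hy, hz = (s / 2.0 for s in box.size)
    out = []
    for sx in (-1, 1):
        for sy in (-1, 1):
            for sz in (-1, 1):
                local = _rotate_x((sx * hx, sy * hy, sz * hz), box.tilt_deg)
                out.append(tuple(l + c for l, c in zip(local, box.center)))
    return out


# Stretch a table leg to twice its height, and lean a chair back by 15 degrees
# about a pivot at the bottom of the backrest.
leg = stretch(PartBox((0.0, 0.0, 0.25), (0.1, 0.1, 0.5)), axis=2, factor=2.0)
back = tilt(PartBox((0.0, 0.0, 1.0), (1.0, 0.1, 1.0)), 15.0, pivot=(0.0, 0.0, 0.5))
```

Only the edited box changes; the remaining boxes in the layout stay as they are, and the updated set would then be used as the condition for regenerating the shape.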
Given part bounding boxes (left), our model generates diverse and high-quality 3D shapes (right) that faithfully follow the box layout.
Spice-E: Sella et al., ACM SIGGRAPH 2024. | 3DS2VS: Zhang et al., ACM ToG 2023.
Our BoxSplitGen model generates semantically meaningful part bounding boxes through iterative splitting.
Airplane
Chair
Couch
Lamp
Rifle
Table
Users can interactively edit 3D shapes by transforming bounding boxes.
Chair back rotation
Airplane wing adjustment
Sofa arm modification
@inproceedings{koo2026boxsplitgen,
title={BoxSplitGen: A Generative Model for 3D Part Bounding Boxes in Varying Granularity},
author={Juil Koo and Wei-Tung Lin and Chanho Park and Chanhyeok Park and Minhyuk Sung},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
year={2026}
}
This work was supported by the IITP grants (RS-2022-00156435, RS-2024-00399817, RS-2025-25441313, RS-2025-25443318, RS-2025-02653113); and the Industrial Technology Innovation Program (RS-2025-02317326), all funded by the Korean government (MSIT and MOTIE), as well as by the DRB-KAIST SketchTheFuture Research Center.