BoxSplitGen: A Generative Model for 3D Part Bounding Boxes in Varying Granularity

Abstract

Human creativity follows a perceptual process, moving from abstract ideas to finer details during creation. While 3D generative models have advanced dramatically, models specifically designed to assist human imagination in 3D creation, particularly for refining abstractions from coarse to fine, have not been explored. We propose a framework that enables intuitive and interactive 3D shape generation by iteratively splitting bounding boxes into progressively finer parts. The framework has two main technical components, both generative models. The first, BoxSplitGen, generates a collection of 3D part bounding boxes with varying granularity by iteratively splitting coarse bounding boxes. The second, the box-to-shape generative model, leverages the 3D shape priors learned by an existing 3D diffusion model and adapts that model to incorporate bounding box conditioning.

Overview of our box-splitting-based 3D shape generative framework. The left shows our iterative box splitting and box-to-shape generation, where diverse shapes at the top of the tree become increasingly specific deeper in the tree. The right showcases our user-interactive box and shape editing demo.

Method Overview

Hierarchical Box Splitting

Instead of generating all parts at once, we iteratively split a single bounding box into finer parts. This creates a natural hierarchy from coarse to fine, enabling exploration of the design space at every level.
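
The hierarchy described above can be pictured as a binary tree whose leaves are the current part boxes. The sketch below is illustrative only and is not the authors' implementation: `BoxNode` and `halve` are hypothetical names, boxes are assumed to be axis-aligned (center plus size), and the geometric halving merely stands in for the learned splitting model, whose child boxes need not tile the parent exactly.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class BoxNode:
    """Axis-aligned part box (assumed representation): center and size."""
    center: Vec3
    size: Vec3
    children: List["BoxNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

    def leaves(self) -> List["BoxNode"]:
        """Current part set: every box that has not been split yet."""
        if self.is_leaf():
            return [self]
        out: List["BoxNode"] = []
        for child in self.children:
            out.extend(child.leaves())
        return out


def halve(node: BoxNode, axis: int) -> None:
    """Stand-in split: cut a leaf into two equal halves along `axis`."""
    assert node.is_leaf(), "only leaves can be split"
    half = node.size[axis] / 2.0
    for sign in (-1.0, 1.0):
        center = list(node.center)
        size = list(node.size)
        center[axis] += sign * half / 2.0  # shift to the middle of each half
        size[axis] = half
        node.children.append(BoxNode(tuple(center), tuple(size)))


# One coarse box, split twice: each split adds exactly one leaf.
root = BoxNode(center=(0.0, 0.0, 0.0), size=(1.0, 2.0, 1.0))
halve(root, axis=1)
halve(root.children[0], axis=0)
print(len(root.leaves()))
```

Keeping the internal nodes around, rather than only the leaves, is what makes every intermediate granularity recoverable later.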

Two-Stage Framework

Stage 1 (BoxSplitGen): A classifier decides which box to split, then a diffusion model generates the two child boxes.
Stage 2 (Box-to-Shape): A conditioned 3D diffusion model generates detailed meshes that fit within the boxes.
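
The Stage 1 loop (select a box, then generate its two children) can be sketched as follows. This is a minimal sketch under stated assumptions, not the published model: `pick_largest` is a hypothetical stand-in for the split classifier, `halve_longest` is a hypothetical stand-in for the diffusion sampler, and boxes are assumed to be axis-aligned 6-tuples. Stage 2 is omitted because it requires the trained 3D diffusion model.

```python
from typing import Callable, List, Tuple

# Assumed box encoding: (cx, cy, cz, sx, sy, sz), axis-aligned.
Box = Tuple[float, float, float, float, float, float]


def pick_largest(boxes: List[Box]) -> int:
    """Stand-in for the split classifier: pick the largest-volume box."""
    return max(range(len(boxes)),
               key=lambda i: boxes[i][3] * boxes[i][4] * boxes[i][5])


def halve_longest(box: Box) -> Tuple[Box, Box]:
    """Stand-in for the diffusion sampler: halve along the longest axis."""
    center, size = list(box[:3]), list(box[3:])
    axis = max(range(3), key=lambda a: size[a])
    size[axis] /= 2.0
    lo, hi = center[:], center[:]
    lo[axis] -= size[axis] / 2.0
    hi[axis] += size[axis] / 2.0
    return tuple(lo + size), tuple(hi + size)


def split_until(
    root: Box,
    n_boxes: int,
    select: Callable[[List[Box]], int] = pick_largest,
    generate: Callable[[Box], Tuple[Box, Box]] = halve_longest,
) -> List[List[Box]]:
    """Return the box set after every split, from 1 box up to n_boxes."""
    boxes = [root]
    history = [list(boxes)]
    while len(boxes) < n_boxes:
        i = select(boxes)                  # which box to split
        left, right = generate(boxes[i])   # its two children
        boxes[i:i + 1] = [left, right]     # replace the parent in place
        history.append(list(boxes))
    return history


history = split_until((0.0, 0.0, 0.0, 1.0, 2.0, 1.0), n_boxes=4)
print([len(step) for step in history])
```

Because `history[k]` holds the layout with `k + 1` boxes, stopping at a coarser granularity amounts to using an earlier entry of the list.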

Coarse-to-Fine Control

Users can stop at any granularity level. Want a general chair shape? Use fewer boxes. Need precise control over armrests and legs? Keep splitting for more detail.

Interactive Editing

Since shapes are defined by bounding boxes, users can directly manipulate parts: rotate a chair's backrest, stretch table legs, or resize airplane wings.
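
Two of the edits mentioned above, stretching a leg and rotating a backrest, reduce to simple transforms on box geometry. The sketch below assumes a center-plus-size box with y as the up axis; `box_corners`, `stretch_up`, and `rotate_about_x` are hypothetical helpers, and the coordinates are made-up example values. How the edited boxes are fed back to the box-to-shape model is not shown here.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def box_corners(center: Vec3, size: Vec3) -> List[Vec3]:
    """Eight corners of an axis-aligned box."""
    cx, cy, cz = center
    hx, hy, hz = (s / 2.0 for s in size)
    return [(cx + dx * hx, cy + dy * hy, cz + dz * hz)
            for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)]


def stretch_up(center: Vec3, size: Vec3, factor: float) -> Tuple[Vec3, Vec3]:
    """Scale the height (y) by `factor` while keeping the bottom face fixed."""
    bottom = center[1] - size[1] / 2.0
    new_height = size[1] * factor
    new_center = (center[0], bottom + new_height / 2.0, center[2])
    return new_center, (size[0], new_height, size[2])


def rotate_about_x(points: Sequence[Vec3], pivot: Vec3,
                   degrees: float) -> List[Vec3]:
    """Rotate points about a line parallel to the x axis through `pivot`."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    out = []
    for x, y, z in points:
        y0, z0 = y - pivot[1], z - pivot[2]
        out.append((x, pivot[1] + c * y0 - s * z0, pivot[2] + s * y0 + c * z0))
    return out


# Stretch a leg to 1.5x its height; its foot stays on the floor (y = 0).
leg_center, leg_size = stretch_up((0.4, 0.2, 0.4), (0.1, 0.4, 0.1), 1.5)

# Tilt a backrest by 15 degrees about a hinge at its bottom edge.
back = box_corners((0.0, 0.75, -0.45), (1.0, 1.0, 0.1))
tilted = rotate_about_x(back, pivot=(0.0, 0.25, -0.45), degrees=15.0)
```

Rotating the corner points rather than the center and size means the result is no longer axis-aligned, so a real system would need an oriented-box representation to store it.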

Results

Box-to-Shape Generation

Given part bounding boxes (left), our model generates diverse and high-quality 3D shapes (right) that faithfully follow the box layout.

[Figure: qualitative comparison. Columns: Input, Spice-E, 3DS2VS, Ours.]

Spice-E: Sella et al., ACM SIGGRAPH 2024. | 3DS2VS: Zhang et al., ACM ToG 2023.

Box Generation (Stage 1)

Our BoxSplitGen model generates semantically meaningful part bounding boxes through iterative splitting.

[Figure: box generation results for six categories: Airplane, Chair, Couch, Lamp, Rifle, and Table. Each panel has three columns: Token, Uncond, Cond.]

Interactive Shape Editing

Users can interactively edit 3D shapes by transforming bounding boxes.

Chair back rotation

Airplane wing adjustment

Sofa arm modification

BibTeX

@inproceedings{koo2026boxsplitgen,
    title={BoxSplitGen: A Generative Model for 3D Part Bounding Boxes in Varying Granularity},
    author={Juil Koo and Wei-Tung Lin and Chanho Park and Chanhyeok Park and Minhyuk Sung},
    booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    year={2026}
}

This work was supported by the IITP grants (RS-2022-00156435, RS-2024-00399817, RS-2025-25441313, RS-2025-25443318, RS-2025-02653113); and the Industrial Technology Innovation Program (RS-2025-02317326), all funded by the Korean government (MSIT and MOTIE), as well as by the DRB-KAIST SketchTheFuture Research Center.