Researchers have long struggled to effectively represent and generate Boundary Representation (B-rep) models, typically relying on complex, graph-based methods that separate geometry and topology. Now, Jiahao Li, Yunpeng Bai (National University of Singapore), and Yongkang Dai (Northwestern Polytechnical University), alongside Guo et al., present a novel approach, BrepARG, which uniquely encodes B-rep geometry and topology into a single, holistic token sequence. This breakthrough enables the application of powerful sequence-based generative frameworks, previously inaccessible to B-rep modelling, and achieves state-of-the-art performance. By representing the entire B-rep as a hierarchical sequence of tokens, BrepARG not only demonstrates the feasibility of sequence-based B-rep generation but also opens promising avenues for future research in the field.
Specifically, BrepARG employs a hierarchical tokenisation process that produces three distinct token types: Geometry Tokens capturing geometric features, Position Tokens encoding 3D positions, and Face Index Tokens representing topological connections. The team achieved holistic sequence construction by first assembling these tokens into geometry blocks, each representing a face or an edge, and then ordering the blocks with a topology-aware scheme, as sketched below. This process culminates in a complete, holistic sequence representation of the entire B-rep model.
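To make this structure concrete, the following is a minimal sketch of how geometry blocks might be flattened into one holistic token stream. The GeometryBlock layout, separator marker IDs, and token values are hypothetical placeholders chosen for illustration, not the authors' actual vocabulary or block format.

```python
# Illustrative sketch of holistic sequence construction from geometry blocks.
from dataclasses import dataclass
from typing import List

@dataclass
class GeometryBlock:
    geometry_tokens: List[int]    # codebook indices describing the face/edge shape
    position_tokens: List[int]    # quantised 3D position information
    face_index_tokens: List[int]  # indices of the faces this primitive connects to

# Hypothetical special markers separating blocks and sections in the flat sequence.
BLOCK_SEP, FACE_EDGE_SEP, END = 1000, 1001, 1002

def build_holistic_sequence(face_blocks: List[GeometryBlock],
                            edge_blocks: List[GeometryBlock]) -> List[int]:
    """Flatten ordered face and edge blocks into a single token stream."""
    seq: List[int] = []
    for block in face_blocks + [None] + edge_blocks:
        if block is None:
            seq.append(FACE_EDGE_SEP)  # boundary between face and edge blocks
            continue
        seq += block.geometry_tokens + block.position_tokens + block.face_index_tokens
        seq.append(BLOCK_SEP)
    seq.append(END)
    return seq
```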
This marks a significant departure from existing methods that rely on stage-wise learning or multi-component architectures, which often lead to fragmented representations and increased model complexity. By representing B-rep data holistically, BrepARG eliminates the need for these complex pipelines, enabling the model to capture the full heterogeneity and interdependence inherent in B-rep structures. The work opens possibilities for automating and enhancing CAD modelling processes, offering a more efficient and streamlined approach to solid model generation. Training BrepARG required approximately 1.2 days using 4 NVIDIA H20 GPUs, while inference on a single RTX 4090 takes around 1.5 seconds per B-rep, demonstrating its practical efficiency.
Furthermore, the study demonstrates the effectiveness of the autoregressive framework in co-generating geometric shapes and topological connections in a single, unified stream. The researchers detail a novel uniform scalar quantization algorithm for encoding 3D positions into Position Tokens, and a vector-quantized variational autoencoder (VQ-VAE) for producing Geometry Tokens from UV-sampled geometric primitives. These technical innovations contribute to the overall robustness and performance of the BrepARG framework, solidifying its position as a leading solution for B-rep generation.
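As an illustration of the position-encoding step, here is a minimal sketch of uniform scalar quantisation of 3D coordinates into Position Tokens. It assumes coordinates normalised to [-1, 1] and a placeholder bin count of 256; the paper's exact resolution and normalisation scheme are not reproduced.

```python
# Illustrative sketch of uniform scalar quantisation for Position Tokens.
import numpy as np

def quantise_positions(xyz: np.ndarray, num_bins: int = 256) -> np.ndarray:
    """Map continuous 3D coordinates in [-1, 1] to integer token IDs, one per axis."""
    xyz = np.clip(xyz, -1.0, 1.0)
    return np.round((xyz + 1.0) / 2.0 * (num_bins - 1)).astype(np.int64)

def dequantise_positions(tokens: np.ndarray, num_bins: int = 256) -> np.ndarray:
    """Recover approximate coordinates from Position Tokens."""
    return tokens.astype(np.float64) / (num_bins - 1) * 2.0 - 1.0
```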
The Scientists' Method
The core innovation lies in representing the complete B-rep geometry and topology as a single token sequence, facilitating direct autoregressive modelling and eliminating the fragmented representations common in prior approaches. The researchers developed a three-token system for encoding B-rep data: Geometry Tokens representing geometric features, Position Tokens encoding 3D positions via a newly designed uniform scalar quantization algorithm, and Face Index Tokens capturing topological information. The geometry and topology of faces and edges are discretised separately: geometric primitives are UV-sampled and tokenised by mapping vector-quantized variational autoencoder (VQ-VAE) latents to codebook indices using nearest-neighbour search, as sketched below. Subsequently, the team constructed geometry blocks, each comprising all three token types and representing a single face or edge, forming the basis for sequence construction.
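The codebook lookup can be sketched as follows, assuming a generic PyTorch tensor of encoder latents and a learned codebook; the tensor shapes and the encoder itself are placeholders, not the authors' architecture.

```python
# Illustrative sketch of mapping VQ-VAE latents to Geometry Tokens.
import torch

def latents_to_geometry_tokens(latents: torch.Tensor,
                               codebook: torch.Tensor) -> torch.Tensor:
    """
    latents:  (N, D) encoder outputs for UV-sampled faces/edges.
    codebook: (K, D) learned VQ-VAE embeddings.
    Returns:  (N,) codebook indices used as Geometry Tokens.
    """
    # Euclidean distance between each latent and every codebook entry.
    distances = torch.cdist(latents, codebook, p=2)  # (N, K)
    return distances.argmin(dim=1)
```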
Experiments employed a topology-aware sequentialization scheme to arrange face and edge blocks, enforcing causal ordering while preserving local structural relationships, a critical step in maintaining B-rep integrity. The face and edge block sequences were then assembled with the necessary markers to create the final holistic sequence representation, ready for autoregressive modelling. To leverage this representation, the scientists implemented a multi-layer decoder-only transformer with causal masking, training it to predict the next token in the sequence and thereby learn the joint distribution of geometric and topological elements. This autoregressive framework enables co-generation of shapes and connections in a single stream, achieving end-to-end B-rep sequence generation.
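To make the modelling step concrete, here is a minimal sketch of next-token training for a decoder-only transformer with causal masking on the holistic token sequence. The vocabulary size, model width, depth, and the omission of positional embeddings are simplifications, not the paper's configuration.

```python
# Illustrative sketch of causal next-token training on a B-rep token sequence.
import torch
import torch.nn as nn

class BrepDecoder(nn.Module):
    def __init__(self, vocab_size: int = 1024, d_model: int = 256, n_layers: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)  # positional embeddings omitted for brevity
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # Causal mask so each position attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        hidden = self.blocks(self.embed(tokens), mask=mask)
        return self.head(hidden)

# Next-token prediction loss on a (batch, seq_len) tensor of token IDs.
model = BrepDecoder()
tokens = torch.randint(0, 1024, (2, 64))
logits = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, 1024), tokens[:, 1:].reshape(-1))
```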
BrepARG Achieves 87.6% Validity on the DeepCAD Dataset
The team measured performance using several key metrics, including Coverage (COV), Maximum Mean Discrepancy (MMD), Jensen-Shannon Divergence (JSD), Novelty, Uniqueness, and Validity, to rigorously evaluate the generated B-rep models. These measurements confirm the model’s superior performance compared to baseline methods such as DeepCAD, BrepGen, and DTGBrepGen. The research team also investigated training and inference efficiency, revealing that BrepARG requires only 1.2 days of training on four H20 GPUs, compared to 7.5 days for BrepGen and 3.0 days for DTGBrepGen. Inference time on a single RTX 4090 GPU was measured at 1.5 seconds for BrepARG, significantly faster than the 8.4 seconds for BrepGen and 3.6 seconds for DTGBrepGen.
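As a reference for one of these metrics, the following is a minimal sketch of the Jensen-Shannon Divergence between two discrete distributions (for example, histograms of sampled surface points from generated versus reference shapes); the binning and evaluation protocol here are assumptions, not the paper's exact setup.

```python
# Illustrative sketch of the Jensen-Shannon Divergence between two histograms.
import numpy as np

def jsd(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """JSD between two discrete distributions; lower means closer distributions."""
    p, q = p + eps, q + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```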
Further analysis using nucleus sampling with varying values of p showed that adjusting this parameter allows flexible control over the diversity and validity of generated models, with p = 0.9 yielding the best overall results. Class-conditioned generation was successfully demonstrated by prefixing the input sequence with a class-specific token during training and inference, enabling the model to generate B-reps tailored to specific categories, such as furniture. These findings highlight the importance of considering topological relationships when constructing the holistic token sequence representation.
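For readers unfamiliar with the sampling control mentioned above, here is a minimal sketch of nucleus (top-p) sampling over a vector of next-token logits; it illustrates the general technique rather than BrepARG's exact decoding loop.

```python
# Illustrative sketch of nucleus (top-p) sampling from next-token logits.
import torch

def nucleus_sample(logits: torch.Tensor, p: float = 0.9) -> int:
    """Sample a token ID from the smallest set of tokens whose probability mass exceeds p."""
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_ids = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Keep tokens whose preceding cumulative mass is still below p (always keeps the top token).
    keep = (cumulative - sorted_probs) < p
    kept_probs = sorted_probs * keep
    kept_probs = kept_probs / kept_probs.sum()
    idx = torch.multinomial(kept_probs, num_samples=1)
    return int(sorted_ids[idx])
```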
BrepARG unlocks transformer-based B-rep generation with unprecedented control
This approach reformulates B-rep generation as a sequence modelling task, enabling the application of sequence-based generative architectures such as transformers, previously unavailable for B-rep data, to jointly learn geometric details and topological constraints in a unified process. Experiments reveal that a depth-first search-based traversal combined with a maximum index assignment strategy effectively captures local connectivity, leading to more coherent geometry and stable generation (see the sketch below). The authors acknowledge limitations related to the complexity of modelling highly intricate B-rep models and the computational demands of autoregressive models. Future work could explore methods to improve the efficiency of the autoregressive process and extend the framework to handle more complex B-rep structures. This advancement paves the way for new directions in generative B-rep modelling, potentially reducing multi-stage errors and computational overhead in design and manufacturing applications.
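To illustrate the traversal idea, here is a minimal sketch of a depth-first ordering over a face adjacency graph. It shows generic DFS only; the authors' maximum index assignment strategy and exact tie-breaking rules are not reproduced.

```python
# Illustrative sketch of depth-first face ordering over an adjacency graph.
from typing import Dict, List

def dfs_face_order(adjacency: Dict[int, List[int]], start: int = 0) -> List[int]:
    """Return faces in DFS visitation order so neighbouring faces stay close in the sequence."""
    order, visited, stack = [], set(), [start]
    while stack:
        face = stack.pop()
        if face in visited:
            continue
        visited.add(face)
        order.append(face)
        # Push neighbours in reverse so lower-indexed neighbours are visited first.
        stack.extend(sorted(adjacency.get(face, []), reverse=True))
    return order

# Example: a small adjacency graph of four faces.
print(dfs_face_order({0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}))  # [0, 1, 3, 2]
```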