DecomPose: Disentangling Cross-Category Optimization Contention for Category-Level 6D Object Pose Estimation

Gao, Yifan; Zou, Lu; Huang, Zhangjin; Wang, Guoping

Yifan Gao¹,², Lu Zou¹, Zhangjin Huang¹, Guoping Wang³

¹Hubei Key Laboratory of Intelligent Robot, Wuhan Institute of Technology, Wuhan, Hubei, China
²University of Science and Technology of China, Hefei, Anhui, China
³Peking University, Beijing, China

ICML 2026

**Motivation of DecomPose.** (a) Category-wise complexity ranking derived from AG-Pose (Lin et al., 2024) evaluation scores, ordering categories from complex to simple using the 5°2cm metric. (b) Cross-category optimization contention: mismatched modeling demands lead to gradient conflicts, while asynchronous convergence induces negative transfer, as gradients from hard categories continually perturb parameters preferred by easy ones during training.
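As a rough illustration of what a gradient conflict between categories means, the sketch below computes pairwise cosine similarity between per-category gradients of one shared module and reports the fraction of category pairs with negative similarity. This is a common generic diagnostic, not necessarily the one used in the paper; the function names and the toy gradient values are assumptions made up for this example.

```python
import numpy as np


def pairwise_gradient_cosine(grads):
    """Cosine similarity between every pair of per-category gradients.

    grads: dict mapping category name -> gradient array of one shared module.
    Returns (names, matrix) with matrix[i, j] = cos(grad_i, grad_j).
    """
    names = sorted(grads)
    # Flatten each gradient so modules of any shape can be compared.
    g = np.stack([np.asarray(grads[n], dtype=np.float64).ravel() for n in names])
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    unit = g / np.clip(norms, 1e-12, None)  # guard against zero gradients
    return names, unit @ unit.T


def conflict_rate(cos):
    """Fraction of distinct category pairs whose gradients oppose each other."""
    upper = np.triu_indices(cos.shape[0], k=1)
    return float(np.mean(cos[upper] < 0.0))


# Toy gradients (hypothetical values, not taken from the paper).
toy_grads = {
    "bowl": np.array([1.0, 0.0]),
    "camera": np.array([-1.0, 0.5]),
    "can": np.array([1.0, 0.2]),
}
names, cos = pairwise_gradient_cosine(toy_grads)
rate = conflict_rate(cos)
print(names, round(rate, 3))
```

In this toy setting the camera gradient opposes both the bowl and the can gradients, so two of the three pairs count as conflicting.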

Abstract

Category-level 6D object pose estimation is typically formulated as a multi-category joint learning problem with fully shared model parameters. However, pronounced geometric heterogeneity across categories entangles incompatible optimization signals in shared modules, resulting in gradient conflicts and negative transfer during training. To address this challenge, we first introduce gradient-based diagnostics to quantify module-level cross-category contention. Building on these diagnostics, we propose DecomPose, a difficulty-aware decomposition framework that mitigates optimization contention via: (1) difficulty-aware gradient decoupling, which groups categories using a data-driven difficulty proxy and routes each instance to a group-specific correspondence branch to isolate incompatible updates; and (2) stability-driven asymmetric branching, which assigns higher-capacity branches to structurally simple categories as stable optimization anchors while constraining complex categories with lightweight branches to suppress noisy updates. Extensive experiments on REAL275, CAMERA25, and HouseCat6D demonstrate that DecomPose effectively reduces cross-category optimization contention and delivers superior pose estimation performance across multiple benchmarks.
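The grouping and routing idea in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes the difficulty proxy is an accuracy-like score per category (higher means easier), splits categories into equally sized groups, and buckets the instances of a batch by group so that each bucket could be sent to its own correspondence branch. The scores, the number of groups, the helper names, and the branch widths are all hypothetical.

```python
def group_by_difficulty(scores, num_groups=2):
    """Assign each category to a difficulty group.

    scores: dict category -> accuracy-like proxy (higher = easier). Assumed form.
    Returns dict category -> group index, where group 0 is the hardest.
    """
    ordered = sorted(scores, key=scores.get)  # lowest score (hardest) first
    size = -(-len(ordered) // num_groups)     # ceiling division
    return {cat: i // size for i, cat in enumerate(ordered)}


def route(batch_categories, groups):
    """Bucket instance indices of a batch by the group of their category."""
    buckets = {}
    for idx, cat in enumerate(batch_categories):
        buckets.setdefault(groups[cat], []).append(idx)
    return buckets


# Hypothetical proxy scores for the six REAL275 categories (made-up numbers).
proxy = {"bottle": 0.60, "bowl": 0.90, "camera": 0.10,
         "can": 0.85, "laptop": 0.70, "mug": 0.30}
groups = group_by_difficulty(proxy, num_groups=2)

# Asymmetric capacity as described in the abstract: the easy group gets the
# wider branch, the hard group the lightweight one. Widths are illustrative.
branch_width = {0: 64, 1: 256}

buckets = route(["bowl", "camera", "mug", "can"], groups)
print(groups)
print(buckets)
```

Because each bucket is processed by a separate branch, gradients from one group do not update the branch parameters of another, which is the isolation effect the decoupling aims for.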

Method

Experiment

Table 1. Comparison with state-of-the-art methods on CAMERA25, REAL275, and HouseCat6D. Best results are highlighted in bold (red), and second-best results are underlined (blue). ‘-’ indicates that results are not reported.

BibTeX

@misc{gao2026decomposedisentanglingcrosscategoryoptimization,
  title={DecomPose: Disentangling Cross-Category Optimization Contention for Category-Level 6D Object Pose Estimation},
  author={Yifan Gao and Lu Zou and Zhangjin Huang and Guoping Wang},
  year={2026},
  eprint={2605.15728},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2605.15728}
}