Scalable Unseen Object 6-DoF Absolute Pose Estimation with Robotic Integration

1Hunan University, 2Nanyang Technological University, 3Central South University, 4National University of Singapore, 5Lancaster University, 6The University of Western Australia
TRO 2026

SinRef-6D Task Setup and Robotic Integration

Given a single RGB-D reference view of an unseen object captured from a default robot manipulation viewpoint, we aim to predict its 6-DoF absolute pose from any query view. (a) and (b) compare two types of manual reference view-based methods. (a) Dense reference view-based methods typically rely on (1) 3D object reconstruction or (2) template matching, both of which are time- and memory-consuming and thus ill-suited to robotic applications. (b) The proposed method estimates the pose of an unseen object from only a single reference view, offering greater efficiency and scalability, which makes it well-suited to robotic applications. (c) and (d) show the detailed hardware and software architectures of our integrated robotic system. We also develop an efficient semi-automatic annotator based on the proposed task setup, enabling single-reference annotation of an unseen object within one minute.



Framework Overview

SinRef-6D comprises four modules: (A) The reference view is labeled via a semi-automatic annotator, then the RGB-D images of the reference and query views are segmented, and the segmented depth maps are back-projected into point clouds. (B) The corresponding point clouds of the reference and query views are focalized from the camera coordinate system to the object coordinate system. (C) Leveraging the proposed Points and RGB SSMs, features are extracted from the focalized point clouds and RGB images, forming point-wise reference and query features. (D) These features are then used to establish point-wise alignment for pose solving. Finally, the computed pose is fed back into module (B) to iteratively improve the accuracy of point-wise alignment, yielding a more precise object pose.
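Two of the steps above can be sketched in a few lines of NumPy: back-projecting a segmented depth map into a camera-frame point cloud (module A), and solving a rigid pose from point-wise correspondences via a least-squares (Kabsch/Umeyama-style) fit (module D). This is a minimal illustrative sketch under standard pinhole-camera and correspondence assumptions, not the paper's actual implementation; the function names and the intrinsic matrix `K` are assumptions for illustration.

```python
import numpy as np

def backproject_depth(depth, mask, K):
    """Back-project a segmented depth map (meters) into a camera-frame point cloud.
    depth: (H, W) float array; mask: (H, W) bool segmentation; K: 3x3 pinhole intrinsics."""
    v, u = np.nonzero(mask & (depth > 0))
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)  # (N, 3) points

def solve_pose(ref_pts, qry_pts):
    """Least-squares rigid transform from matched reference/query points,
    so that qry ~= R @ ref + t (Kabsch algorithm with SVD)."""
    mu_r, mu_q = ref_pts.mean(0), qry_pts.mean(0)
    H = (ref_pts - mu_r).T @ (qry_pts - mu_q)     # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Reflection correction keeps R a proper rotation (det = +1)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = mu_q - R @ mu_r
    return R, t
```

In the full pipeline, the pose returned by `solve_pose` would be fed back to re-transform the point clouds and refine the correspondences over a few iterations, as described for modules (B) and (D).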



Real-world Qualitative Experiments




Real-world Robotic Applications

The objects selected for the experiments (those on the right are the objects used for grasping).


Additional Qualitative Experiments