[Paper] Fuchang Liu. Recovering 6D object pose from RGB indoor image based on two-stage detection network with multi-task loss
Fuchang Liu, Pengfei Fang, Zhengwei Yao, Ran Fan, Zhigeng Pan, Weiguo Sheng, Huansong Yang,
Recovering 6D object pose from RGB indoor image based on two-stage detection network with multi-task loss,
Neurocomputing,
Volume 337,
2019,
Pages 15-23,
ISSN 0925-2312,
https://doi.org/10.1016/j.neucom.2018.12.061.
(https://www.sciencedirect.com/science/article/pii/S0925231218315236)
Abstract: Object pose estimation from an RGB image has recently become a common problem owing to its widespread application. The advent of convolutional neural networks has impelled significant progress in object detection. However, most available methods do not involve a category-level pose estimation approach. This study presents an end-to-end 6D category-level pose estimation based on a two-stage bounding-box recognition backbone architecture. Our network directly outputs the 6D pose without requiring multiple stages or additional post-processing such as a Perspective-n-Point (PnP). The two-stage CNN architecture and our loss function render multi-task joint training effective and efficient. We improve the pose estimation accuracy by replacing fully connected layers with fully convolutional layers. Fully convolutional networks require fewer parameters and are less susceptible to overfitting. Moreover, we transform the pose estimation problem into classification and regression tasks using our network; these are called Pose-cls and Pose-reg, respectively. We also present qualitative and quantitative results on real data from the SUN RGB-D dataset. The experiments demonstrate the effectiveness of our algorithms compared to other state-of-the-art methods.
Keywords: Pose estimation; Two-stage detection; Convolutional neural networks; Multi-task loss
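
The abstract describes splitting pose estimation into a classification task (Pose-cls) and a regression task (Pose-reg) trained jointly with detection under a multi-task loss, but it does not give the formulation. The following is only a minimal sketch of how such a split might look, not the authors' method. It assumes a single rotation angle, a hypothetical count of 12 angle bins, cross-entropy for the classification terms, smooth L1 for the regression terms, and equal loss weights. All function names and parameters are illustrative.

```python
import math

NUM_BINS = 12  # hypothetical number of angle bins; not stated in the abstract


def angle_to_bin(theta, num_bins=NUM_BINS):
    """Split an angle (radians) into a bin index and a residual from the bin centre.

    The bin index would be a Pose-cls style target and the residual a
    Pose-reg style target (an assumed encoding, for illustration only).
    """
    width = 2 * math.pi / num_bins
    theta = theta % (2 * math.pi)          # wrap into [0, 2*pi)
    idx = int(theta // width) % num_bins
    residual = theta - (idx + 0.5) * width
    return idx, residual


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]


def smooth_l1(pred, target):
    """Smooth L1 (Huber-like) loss with the transition point at 1."""
    d = abs(pred - target)
    return 0.5 * d * d if d < 1.0 else d - 0.5


def multi_task_loss(obj_logits, obj_label, box_pred, box_gt,
                    pose_logits, pose_res_pred, theta_gt,
                    weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of detection and pose terms (weights are assumptions)."""
    pose_bin, pose_res = angle_to_bin(theta_gt, len(pose_logits))
    l_cls = cross_entropy(obj_logits, obj_label)                     # object class
    l_box = sum(smooth_l1(p, g) for p, g in zip(box_pred, box_gt))   # bounding box
    l_pose_cls = cross_entropy(pose_logits, pose_bin)                # pose bin
    l_pose_reg = smooth_l1(pose_res_pred, pose_res)                  # pose residual
    w = weights
    return w[0] * l_cls + w[1] * l_box + w[2] * l_pose_cls + w[3] * l_pose_reg
```

With this encoding, a coarse pose comes from the predicted bin and the regression term only has to refine it within one bin width, which is one plausible reason to combine the two tasks.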