Non-Overlap-Aware Egocentric Pose Estimation for Collaborative Perception in Connected Autonomy

IROS 2025, Hangzhou

Abstract

Egocentric pose estimation is a fundamental capability for multi-robot collaborative perception in connected autonomy, such as connected autonomous vehicles. During multi-robot operations, a robot needs to know the relative pose between itself and its teammates with respect to its own coordinate frame. However, different robots often observe completely different views that contain similar objects, which leads to incorrect pose estimation. In addition, it is unrealistic for robots to share their raw observations to detect overlap, due to the limited communication bandwidth. In this paper, we introduce a novel method for Non-Overlap-Aware Egocentric Pose Estimation (NOPE), which performs egocentric pose estimation in a multi-robot team while identifying non-overlapping views and satisfying the communication bandwidth constraint. NOPE is built upon a unified hierarchical learning framework that integrates two levels of robot learning: (1) high-level deep graph matching for correspondence identification, which determines whether two views overlap, and (2) low-level position-aware cross-attention graph learning for egocentric pose estimation. To evaluate NOPE, we conduct extensive experiments in both high-fidelity simulation and real-world scenarios. Experimental results demonstrate that NOPE enables the novel capability of non-overlap-aware egocentric pose estimation and achieves state-of-the-art performance compared with existing methods.

Figure 1: A motivating scenario for egocentric pose estimation in connected autonomous driving. When two connected vehicles meet at an intersection, the ego vehicle must first estimate the pose of its teammate before merging its perception to enhance situational awareness. Meanwhile, it needs to address the challenges of limited communication bandwidth and non-overlapping views, where each vehicle observes a completely different perspective.

Framework

Overview of our NOPE framework. NOPE represents the observation of each robot as a graph. The high level of NOPE performs correspondence identification (CoID) using LVM-based deep graph matching, and the identified correspondences are used to detect whether two views overlap. The low level of NOPE utilizes a position-aware cross-attention graph learning network to estimate the pose between the ego robot and its teammate robot.

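The two-level design above can be sketched with classical stand-ins: cosine-similarity mutual-nearest-neighbor matching in place of NOPE's LVM-based deep graph matching, and a closed-form Kabsch alignment in place of the learned cross-attention pose network. All function names, thresholds, and the `min_matches` overlap test below are illustrative assumptions, not NOPE's actual implementation.

```python
import numpy as np

def identify_correspondences(feat_ego, feat_peer, tau=0.8):
    """High level (stand-in for deep graph matching): match object nodes
    across the two views by mutual-nearest-neighbor cosine similarity."""
    a = feat_ego / np.linalg.norm(feat_ego, axis=1, keepdims=True)
    b = feat_peer / np.linalg.norm(feat_peer, axis=1, keepdims=True)
    sim = a @ b.T
    matches = []
    for i in range(sim.shape[0]):
        j = int(np.argmax(sim[i]))
        # keep only mutual nearest neighbors above the similarity threshold
        if int(np.argmax(sim[:, j])) == i and sim[i, j] >= tau:
            matches.append((i, j))
    return matches

def estimate_relative_pose(pts_ego, pts_peer, matches):
    """Low level (stand-in for the learned pose network): closed-form 2D
    rigid alignment (Kabsch) of the matched node positions."""
    P = pts_ego[[i for i, _ in matches]]
    Q = pts_peer[[j for _, j in matches]]
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    H = (Q - mu_q).T @ (P - mu_p)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))  # guard against reflections
    R = U @ np.diag([1.0, d]) @ Vt
    t = mu_q - R @ mu_p
    return R, t  # maps ego-frame points into the teammate's frame

def nope_pipeline(feat_ego, pts_ego, feat_peer, pts_peer, min_matches=3):
    """Hierarchical pipeline: run CoID first; if too few correspondences
    survive, declare the views non-overlapping and skip pose estimation."""
    matches = identify_correspondences(feat_ego, feat_peer)
    if len(matches) < min_matches:
        return None  # views judged non-overlapping
    return estimate_relative_pose(pts_ego, pts_peer, matches)
```

Note that only compact node features and positions cross the wire in this sketch, mirroring how NOPE's graph representation avoids sharing raw observations under the bandwidth constraint.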

CAD Results

Qualitative results on CoID and egocentric pose estimation from CAD.


Real-world CAD Results

Qualitative results on CoID and egocentric pose estimation from Real-world CAD.


Non-overlap Results

Comparisons of CoID for non-overlap detection.
