Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception

Kun Yang*, Dingkang Yang*, Jingyu Zhang, Mingcheng Li, Yang Liu, Jing Liu, Hanqi Wang, Peng Sun, Liang Song
Academy for Engineering and Technology, Fudan University

Duke Kunshan University

Contact us: {kunyang20,dkyang20}[at]fudan.edu.cn

The overall architecture of the proposed SCOPE framework. It consists of five parts: metadata conversion and feature extraction, context-aware information aggregation, confidence-aware cross-agent collaboration, importance-aware adaptive fusion, and detection decoders.

Abstract

Multi-agent collaborative perception, a potential application of vehicle-to-everything communication, could significantly improve the perception performance of autonomous vehicles over single-agent perception. However, several challenges remain in achieving pragmatic information sharing in this emerging research area. In this paper, we propose SCOPE, a novel collaborative perception framework that aggregates spatio-temporal awareness characteristics across on-road agents in an end-to-end manner. Specifically, SCOPE has three distinct strengths: i) it considers effective semantic cues from the temporal context to enhance the current representations of the target agent; ii) it aggregates perceptually critical spatial information from heterogeneous agents and overcomes localization errors via multi-scale feature interactions; iii) it integrates multi-source representations of the target agent, weighted by their complementary contributions, through an adaptive fusion paradigm. To thoroughly evaluate SCOPE, we consider both real-world and simulated scenarios of collaborative 3D object detection tasks on three datasets. Extensive experiments demonstrate the superiority of our approach and the necessity of the proposed components.
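To make the idea in point iii) concrete, the sketch below shows one generic way to combine several representations of the same agent according to per-source importance scores. It is only an illustration under stated assumptions: the softmax weighting, the function names `softmax` and `adaptive_fuse`, and the flat feature vectors are all hypothetical choices made here, not the fusion module used in SCOPE, which operates on learned feature maps.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over scalar importance scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(
    features: Sequence[Sequence[float]],
    scores: Sequence[float],
) -> List[float]:
    """Fuse equal-length feature vectors with softmax-normalized weights.

    Assumption: each source (e.g. ego, temporal, collaborative) supplies one
    feature vector and one scalar importance score. In a learned model the
    scores would come from a network; here they are plain inputs.
    """
    if not features or len(features) != len(scores):
        raise ValueError("need one score per feature vector")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")

    weights = softmax(scores)
    # Weighted sum across sources, computed element by element.
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Three hypothetical sources for the same target agent.
    ego = [1.0, 0.0]
    temporal = [0.0, 1.0]
    collaborative = [1.0, 1.0]
    fused = adaptive_fuse([ego, temporal, collaborative], [0.2, 0.1, 1.5])
    print(fused)
```

With equal scores the fusion reduces to a plain average; raising one source's score shifts the fused representation toward that source.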

Qualitative Results

Qualitative comparison results in real-world scenarios from the DAIR-V2X dataset. Green and red boxes denote ground truths and detection results, respectively. Compared to the previous SOTA models, our method achieves more accurate detection results.

BibTeX

@article{yang2023spatio,
  title={Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception},
  author={Yang, Kun and Yang, Dingkang and Zhang, Jingyu and Li, Mingcheng and Liu, Yang and Liu, Jing and Wang, Hanqi and Sun, Peng and Song, Liang},
  journal={arXiv preprint arXiv:2307.13929},
  year={2023}
}