Abstract
In this paper, we introduce a novel method for creating appearance embeddings to identify individual persons using an object re-identification (ReID) framework. We present CLFormer (Camera LiDAR Transformer), a transformer-based architecture that incorporates multi-modal data from both camera and LiDAR sensors. We introduce the 3D Cuboid-Inclusive Point Embedding (3D-CIPE), which leverages rich data from LiDAR point clouds and 3D cuboids to add a learnable embedding into the transformer structure. Additionally, through ablation studies, we explore and analyze various strategies for the early and late fusion of multi-modal input data. To evaluate our proposed CLFormer, we reinterpret the nuScenes dataset [1] for ReID purposes and use it for our experiments. Our method demonstrates a significant improvement in performance, outperforming the image-only baseline with an increase of 2.3 in mean Average Precision (mAP).
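The mAP figure quoted above is a retrieval metric: each query embedding ranks a gallery of embeddings, and average precision is computed over the positions of the correct identities. The following is a minimal sketch of that computation, assuming cosine similarity for ranking and no camera-based filtering of the gallery; the paper's exact evaluation protocol on nuScenes is not stated in the abstract, and the function names here are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_precision(ranked_ids, query_id):
    """Mean of precision@k over every rank k at which a true match appears."""
    hits = 0
    precisions = []
    for k, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precisions.append(hits / k)
    # A query with no match in the gallery contributes 0 (an assumption of this sketch).
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(query_embs, query_ids, gallery_embs, gallery_ids):
    """Rank the gallery for each query by cosine similarity and average the APs."""
    aps = []
    for q_emb, q_id in zip(query_embs, query_ids):
        order = sorted(
            range(len(gallery_embs)),
            key=lambda i: cosine(q_emb, gallery_embs[i]),
            reverse=True,
        )
        aps.append(average_precision([gallery_ids[i] for i in order], q_id))
    return sum(aps) / len(aps) if aps else 0.0


# Toy example with 2-D embeddings (illustrative values only, not from the paper).
gallery_embs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
gallery_ids = ["a", "b", "a"]
query_embs = [[1.0, 0.0], [0.0, 1.0]]
query_ids = ["a", "a"]
toy_map = mean_average_precision(query_embs, query_ids, gallery_embs, gallery_ids)
```

In this toy setup the first query retrieves both of its matches at the top of the ranking, while the second ranks a wrong identity first, so its average precision is lower and pulls the mean down.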
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 36th IEEE Intelligent Vehicles Symposium, IV 2025 |
| Publisher | IEEE |
| Pages | 1408-1414 |
| Number of pages | 7 |
| ISBN (Electronic) | 979-8-3315-3803-3 |
| DOIs | |
| Publication status | Published - 2025 |
| Event | 36th IEEE Intelligent Vehicles Symposium, IV 2025 - Cluj-Napoca, Romania. Duration: 22 Jun 2025 → 25 Jun 2025 |
Publication series
| Name | IEEE Intelligent Vehicles Symposium, Proceedings |
|---|---|
| ISSN (Print) | 1931-0587 |
| ISSN (Electronic) | 2642-7214 |
Conference
| Conference | 36th IEEE Intelligent Vehicles Symposium, IV 2025 |
|---|---|
| Country/Territory | Romania |
| City | Cluj-Napoca |
| Period | 22/06/25 → 25/06/25 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/publishing/publisher-deals Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.