Cross-View Matching for Vehicle Localization by Learning Geographically Local Representations

Zimin Xia, Olaf Booij, Marco Manfredi, Julian Kooij

Research output: Contribution to journal › Article › Scientific › peer-review

2 Citations (Scopus)
22 Downloads (Pure)


Cross-view matching aims to learn a shared image representation between ground-level images and satellite or aerial images at the same locations. In robotic vehicles, matching a camera image to a database of geo-referenced aerial imagery can serve as a method for self-localization. However, existing work on cross-view matching only aims at global localization, and overlooks the easily accessible rough location estimates from GNSS or temporal filtering. We argue that the availability of coarse location estimates at test time should already be considered during training. We adopt a simple but effective adaptation to the common triplet loss, resulting in an image representation that is more discriminative within the geographically local neighborhood, without any modifications to a baseline deep neural network. Experiments on the CVACT dataset confirm that the improvements generalize across spatial regions. On a new benchmark constructed from the Oxford RobotCar dataset, we also show generalization across recording days within the same region. Finally, we validate that improvements on these image-retrieval benchmarks also translate to a real-world localization task. Using a particle filter to fuse the cross-view matching scores of a vehicle's camera stream with real GPS measurements, our learned geographically local representation reduces the mean localization error by 17% compared to the standard global representation learned by the current state-of-the-art.

Original language: English
Pages (from-to): 5921-5928
Journal: IEEE Robotics and Automation Letters
Issue number: 3
Publication status: Published - 2021

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.


Keywords

  • Benchmark testing
  • Global navigation satellite system
  • Intelligent Transportation Systems
  • Localization
  • Location awareness
  • Representation Learning
  • Satellites
  • Sensors
  • Task analysis
  • Training

