ST-Sem: A Multimodal Method for Points-of-Interest Classification Using Street-Level Imagery

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

4 Citations (Scopus)
195 Downloads (Pure)


Street-level imagery contains a variety of visual information about the facades of Points of Interest (POIs). In addition to general morphological features, signs on the facades of, primarily, business-related POIs could be a valuable source of information about the type and identity of a POI. Recent advancements in computer vision could leverage visual information from street-level imagery, and contribute to the classification of POIs. However, there is currently a gap in existing literature regarding the use of visual labels contained in street-level imagery, where their value as indicators of POI categories is assessed. This paper presents Scene-Text Semantics (ST-Sem), a novel method that leverages visual labels (e.g., texts, logos) from street-level imagery as complementary information for the categorization of business-related POIs. Contrary to existing methods that fuse visual and textual information at a feature-level, we propose a late fusion approach that combines visual and textual cues after resolving issues of incorrect digitization and semantic ambiguity of the retrieved textual components. Experiments on two existing and a newly-created datasets show that ST-Sem can outperform visual-only approaches by 80% and related multimodal approaches by 4%.
Original language: English
Title of host publication: Web Engineering - 19th International Conference, ICWE 2019, Proceedings
Editors: Maxim Bakaev, In-Young Ko, Flavius Frasincar
Place of Publication: Cham
Number of pages: 15
ISBN (Electronic): 978-3-030-19274-7
ISBN (Print): 978-3-030-19273-0
Publication status: Published - 26 Apr 2019
Event: 19th International Conference on Web Engineering, ICWE 2019 - Daejeon Convention Center (DCC), Daejeon, Korea, Republic of
Duration: 11 Jun 2019 → 14 Jun 2019
Conference number: 19

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 19th International Conference on Web Engineering, ICWE 2019
Abbreviated title: ICWE 2019
Country/Territory: Korea, Republic of


  • Convolutional neural networks
  • Points of Interest
  • Semantic similarity
  • Street-level imagery
  • Word embeddings

