Visual-Saliency Guided Multi-modal Learning for No Reference Point Cloud Quality Assessment

Xuemei Zhou, Irene Viola, Ruihong Yin, Pablo Cesar

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

As 3D immersive media continues to gain prominence, Point Cloud Quality Assessment (PCQA) is essential for ensuring high-quality user experiences. This paper introduces ViSam-PCQA, a no-reference PCQA metric guided by visual saliency information across three modalities, which improves quality prediction performance. First, we project the 3D point cloud to acquire 2D texture, depth, and normal maps. Second, we extract a saliency map from the texture map and refine it with the corresponding depth map; this refined saliency map is used to weight low-level feature maps, highlighting perceptually important areas in the texture channel. Third, high-level features from the texture, normal, and depth maps are processed by a Transformer to capture global and local point cloud representations across the three modalities. Finally, the saliency, global, and local embeddings are concatenated and passed through a multi-task decoder to derive the final quality scores. Our experiments on the SJTU, WPC, and BASICS datasets yield high Spearman rank order correlation coefficients/Pearson linear correlation coefficients of 0.953/0.962, 0.920/0.920, and 0.887/0.936, respectively, demonstrating superior performance compared to current state-of-the-art methods.
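
The abstract outlines a four-stage pipeline: projection, depth-refined saliency weighting of low-level texture features, cross-modal Transformer fusion, and multi-task decoding. Below is a minimal PyTorch-style sketch of how those stages could fit together. All module choices, feature dimensions, the auxiliary head, and the name ViSamPCQASketch are illustrative assumptions, since the paper's actual backbones and head designs are not specified here; the projection and the depth-refined saliency extraction are assumed to happen upstream.

```python
import torch
import torch.nn as nn

class ViSamPCQASketch(nn.Module):
    """Sketch of the ViSam-PCQA pipeline described in the abstract.

    All module shapes and names are assumptions for illustration; the
    projection and depth-refined saliency extraction happen upstream.
    """

    def __init__(self, feat_dim=256):
        super().__init__()
        # Low-level feature extractor for the texture map (assumed CNN).
        self.low_level = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # High-level encoders, one per projected modality (assumption).
        self.encoders = nn.ModuleDict({
            m: nn.Sequential(
                nn.Conv2d(c, feat_dim, 3, stride=4, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(8),
            )
            for m, c in {"texture": 3, "normal": 3, "depth": 1}.items()
        })
        # Transformer over concatenated per-modality tokens, capturing
        # global and local cross-modal interactions.
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=8,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        # Multi-task decoder: quality score plus an auxiliary head
        # (the second task is an assumption).
        self.quality_head = nn.Linear(feat_dim * 2, 1)
        self.aux_head = nn.Linear(feat_dim * 2, 1)

    def forward(self, texture, normal, depth, saliency):
        # 1) The (depth-refined) saliency map reweights low-level texture
        #    features to stress perceptually important areas.
        low = self.low_level(texture)
        sal = nn.functional.interpolate(saliency, size=low.shape[-2:])
        sal_feat = (low * sal).mean(dim=(2, 3))          # (B, feat_dim)
        # 2) High-level tokens from the three projected modalities.
        tokens = torch.cat([
            self.encoders[m](x).flatten(2).transpose(1, 2)
            for m, x in [("texture", texture), ("normal", normal),
                         ("depth", depth)]
        ], dim=1)                                        # (B, 3*64, feat_dim)
        fused = self.transformer(tokens).mean(dim=1)     # pooled embedding
        # 3) Concatenate saliency-weighted and fused embeddings, then
        #    decode both tasks (local/global split simplified here).
        z = torch.cat([sal_feat, fused], dim=1)
        return self.quality_head(z), self.aux_head(z)

if __name__ == "__main__":
    model = ViSamPCQASketch()
    b = 2
    score, aux = model(torch.rand(b, 3, 256, 256), torch.rand(b, 3, 256, 256),
                       torch.rand(b, 1, 256, 256), torch.rand(b, 1, 256, 256))
    print(score.shape, aux.shape)  # torch.Size([2, 1]) torch.Size([2, 1])
```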
Original language: English
Title of host publication: QoEVMA'24
Subtitle of host publication: Proceedings of the 3rd Workshop on Quality of Experience in Visual Multimedia Applications
Place of publication: New York, NY
Publisher: Association for Computing Machinery (ACM)
Pages: 39-47
Number of pages: 9
ISBN (Electronic): 979-8-4007-1204-3
DOIs
Publication status: Published - 2024
Event: 3rd Workshop on Quality of Experience in Visual Multimedia Applications - Melbourne, Australia
Duration: 28 Oct 2024 - 1 Nov 2024

Conference

Conference: 3rd Workshop on Quality of Experience in Visual Multimedia Applications
Abbreviated title: QoEVMA 2024
Country/Territory: Australia
City: Melbourne
Period: 28/10/24 - 1/11/24

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work, and the author uses Dutch legislation to make this work public.

Keywords

  • multi-modal
  • no reference
  • point cloud quality assessment
  • projection
  • visual saliency

