The power of deep without going deep? A study of HDPGMM music representation learning

Jaehun Kim, C.C.S. Liem*

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Over the past decade, Deep Learning (DL) has proven to be one of the most effective machine learning methods for tackling a wide range of Music Information Retrieval (MIR) tasks. It offers highly expressive learning capacity that can fit any music representation needed for MIR-relevant downstream tasks. However, it has been criticized for sacrificing interpretability. On the other hand, the Bayesian nonparametric (BN) approach promises positive properties similar to those of DL, such as high flexibility, while being robust to overfitting and preserving interpretability. Therefore, the primary motivation of this work is to explore the potential of Bayesian nonparametric models in comparison to DL models for music representation learning. More specifically, we assess the music representation learned from the Hierarchical Dirichlet Process Gaussian Mixture Model (HDPGMM), an infinite mixture model based on the Bayesian nonparametric approach, on MIR tasks, including classification, auto-tagging, and recommendation. The experimental results suggest that the HDPGMM music representation can outperform DL representations in certain scenarios and is comparable overall.
Original language: English
Title of host publication: Proceedings of the 23rd International Society for Music Information Retrieval Conference
Pages: 116-124
Number of pages: 9
Publication status: Published - 2022
Event: 23rd International Society for Music Information Retrieval Conference - Bengaluru, India
Duration: 4 Dec 2022 → 8 Dec 2022
Conference number: 23

Conference

Conference: 23rd International Society for Music Information Retrieval Conference
Abbreviated title: ISMIR 2022
Country/Territory: India
City: Bengaluru
Period: 4/12/22 → 8/12/22
