Music in Use: Novel perspectives on content-based music retrieval

Karthik Yadati

Research output: Thesis › Dissertation (TU Delft)

Music consumption has skyrocketed in the past few years with advances in internet and streaming technologies. This has driven the rapid development of the interdisciplinary field of Music Information Retrieval (MIR), which develops automatic methods for accessing the wealth of musical content efficiently and effectively. In general, research in MIR has focused on tasks such as semantic filtering, annotation, classification and search. Over the years, research in this field has concentrated on “what music is”; in this thesis, we move towards building tools that can analyse “what music does” to the listener. There is little research on building systems that analyse how music affects the listener or how people use music to suit their needs. In this thesis, we propose methods that push the boundaries of this perspective.
The first major part of the thesis focuses on detecting high-level events in music tracks. Research on event detection in music has been restricted to detecting low-level events, namely onsets. There is also an abundance of literature on music auto-tagging, where researchers have focused on adding semantic tags to short music snippets. We look at the problem of event detection from a different perspective and turn to the social music sharing platform SoundCloud to understand which events matter to actual listeners. Using a case study in Electronic Dance Music (EDM), we design an approach to detect high-level events in music. The high-level events in our case study affect listeners strongly enough that they comment on these events on SoundCloud. Through experiments, we demonstrate that these high-level events can be detected efficiently using freely available but noisy user comments. The results of this approach inspired us to investigate other tasks that can give more insight into how music affects the listener.
The second major part of the thesis concerns identifying music that can support common activities such as working, studying, relaxing and working out. Certain types of music are suited to helping listeners perform certain tasks. We first investigate, through a data-driven experiment on YouTube, which activities listeners consider important when seeking music. After illustrating that existing music metadata such as genre and instrument is insufficient, we propose a method that can successfully classify music by activity category. An important insight from our experiments is that dividing a music track into short frames is not an effective method of feature extraction for activity-based music classification; the task requires a longer time window. Additionally, the presence of high-level events such as a drop can affect classification performance. Having validated our idea of activity-based music classification, we went on to investigate what can potentially distract a listener during a task. For this, we gathered input from users of Amazon Mechanical Turk (AMT) on which musical characteristics distract them while doing their tasks. Based on this input, we built a system that can automatically detect a derail moment in a given music track, that is, a moment where the listener could get distracted (derailed). Though this task is likely to have a subjective component, we demonstrated that there are universal aspects to it as well. Through a literature survey and computational experiments, we demonstrate that a derail moment can be detected automatically. Throughout the thesis, we also stress the importance of crowdsourcing platforms such as AMT and social media sharing platforms such as SoundCloud and YouTube for understanding users’ requirements and gathering data.
We believe that our proposed methods and their outcomes will encourage future researchers to focus on this kind of MIR task, where the focus is on how music affects the listener. We also hope that the insights gained through this thesis will inspire designers and developers to build novel user interfaces that enable effective access to music.
Original language: English
Awarding Institution
  • Delft University of Technology
Supervisors/Advisors
  • Hanjalic, A., Supervisor
  • Liem, C.C.S., Advisor
Award date: 15 May 2019
Print ISBNs: 978-94-6375-416-3
Publication status: Published - 2019


Keywords
  • music as technology
  • music for activities
  • music event detection
