Musical information retrieval. Signal Analysis and Feature Extraction using Python

Research Paper (postgraduate), 2021

40 Pages, Grade: 8.0

M. Sai Chaitanya (Author)






Chapter 1- Review of Music Concepts
1.1 Literature Overview
1.2 Basic music elements
1.3 Music terminology

Chapter 2- Musical Information Retrieval
2.1 What is MIR?
2.2 Feature Extraction
2.2.1 Low-level similarity
2.2.2 Top-level similarity
2.2.3 Mid-level similarity
2.2.4 Process of Feature extraction

Chapter 3- Signal Analysis and Feature Extraction using Python
3.1 Why Python?
3.2 Basic Feature Extraction
3.2.1 Zero Crossing Rate
3.2.2 Fourier Transform using Python
3.2.3 Short-Time Fourier Transform using Python
3.2.4 Spectrogram
3.2.5 Mel-spectrogram




Music information retrieval (MIR) is the interdisciplinary science of retrieving information from music. MIR is a small but growing field of research with many real-world applications. Those involved in MIR may have a background in musicology, psychoacoustics, psychology, academic music study, signal processing, informatics, machine learning, optical music recognition, computational intelligence, or some combination of these.

MIR is being used by businesses and academics to categorize, manipulate and even create music. One of the classical MIR research topics is genre classification, which is categorizing music items into one of several pre-defined genres such as classical, jazz, rock, etc. Mood classification, artist classification, and music tagging are also popular topics.

This paper gives a comprehensive overview of research on the multidisciplinary field of Music Information Retrieval (MIR). MIR uses knowledge from areas as diverse as signal processing, machine learning, information theory and music theory. The main aim of this paper is to explore how this knowledge can be used to develop novel methodologies for browsing and retrieving large music collections, a hot topic given recent advances in online music distribution and searching. Emphasis is given to audio signal processing techniques.

M. Sai Chaitanya, Soubhik Chakraborty


I take immense pleasure in thanking Dr. S. Chakraborty, Professor and ex-Head, Department of Mathematics, Birla Institute of Technology, Mesra, Ranchi, for enabling me to carry out this project and for his constant guidance and support. I would also like to thank the entire Department of Mathematics, BIT Mesra, for their immense support during my project tenure.

I would like to express my heartfelt gratitude to Mr Nikhil Ken and Mr Rishij Roy Choudary for helping me gain insight into music by sharing their practical knowledge, and for giving me the opportunity to learn and work alongside them, which helped me gain immensely enriching professional experience.

Finally, yet importantly, I would like to express my heartfelt thanks to my beloved parents for their blessings, my friends and all those who supported me directly or indirectly for their help.


List of Figures

Figure 1.1: Example of a musical score

Figure 2.1: Audio Feature

Figure 2.2: Time Series plot using scipy python

Figure 2.3: Frequency domain of input audio

Figure 2.4: Window function shape of an audio signal

Figure 2.5: Zero crossing rate

Figure 2.6: Spectral centroid

Figure 2.7: Block diagram of MFCC

Figure 3.1: Display the kick drum signals

Figure 3.2: Display the snare drum signals

Figure 3.3: Plotting the signal

Figure 3.4: Plotting the zero crossing rate

Figure 3.5: Plot of an audio signal

Figure 3.6: Plotting the spectrum

Figure 3.7: Zoom in the spectrum

Figure 3.8: Spectrogram of the audio file using librosa

Figure 3.9: Displaying mel spectrogram using librosa


MIR (Music Information Retrieval) is a new research field dedicated to meeting users' music information needs. Despite its name's focus on retrieval, MIR incorporates a variety of approaches aimed at music management, easy access, and enjoyment, as will be seen. The majority of MIR studies, proposed methods, and built structures are all content-based.

The basic premise of content-based approaches is that a document can be represented by a collection of features computed directly from its content. Typically, content-based access to multimedia data necessitates the development of new methodologies that are customised to each medium. However, since the underlying models are likely to represent fundamental characteristics shared by various media, languages, and application domains, the core information retrieval (IR) techniques, which are focused on statistics and probability theory, may be more widely used outside the textual case. For this reason, the research results achieved in the area of IR, in particular in the case of text documents, are a continuous reference for MIR approaches.

Businesses and academics use MIR to categorise, manipulate, and even create music. Genre classification, the categorising of music objects into one of several pre-defined genres such as classical, jazz, rock, and so on, is a classic MIR research subject. Music tagging, mood classification, and artist classification are also common topics. This paper focuses on content-based MIR tools, strategies, and methods rather than the programmes that incorporate them. Systems can be compared based on the number of retrieval tasks they can perform, the size of their collections, and the techniques they use.

The paper is organised as follows: this segment concludes with a brief description of certain musical principles. Chapter 1 introduces the dimensions that define musical documents and can be used to classify their substance, as well as the peculiarities of the music language. Chapter 2 introduces and specifies a variety of information needs that have been taken into account by various MIR methods, highlighting the key typologies of MIR users. Chapter 3 looks at how to process musical documents in order to extract features relevant to their dimensions of interest, and presents the efforts carried out towards a shared evaluation framework together with their initial results. Finally, some concluding considerations are drawn in the concluding part.

Chapter 1: Review of Music Concepts

1.1 Literature overview

The majority of music retrieval methods and strategies are focused on a variety of music principles that may be unfamiliar to those without musical experience. As a result, this section provides a brief overview of certain fundamental concepts and terminology.

1.2 Basic music elements

With the exception of certain percussion instruments, any musical instrument produces almost periodic vibrations. The sounds made by musical instruments, in particular, are the product of a combination of different frequencies, all of which are integer multiples of a fundamental frequency, commonly referred to as F0.

The three basic features of a musical sound are

- Pitch, which is related to the perception of the fundamental frequency of a sound; pitch is said to range from low or deep to high or acute sounds.
- Intensity, which is related to the amplitude, and thus to the energy, of the vibration; textual labels for intensity range from soft to loud; the intensity is also defined as loudness.
- Timbre, which is defined as the set of sound characteristics that allows listeners to perceive as different two sounds with the same pitch and the same intensity.
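These three attributes can be illustrated with a short synthesis sketch (using NumPy; the function name, harmonic weights, and all parameter values below are arbitrary choices for illustration only): a tone is built as a sum of partials at integer multiples of F0, where F0 sets the pitch, the overall amplitude sets the intensity, and the relative weights of the partials shape the timbre.

```python
import numpy as np

def harmonic_tone(f0, duration, sr=22050, amplitude=0.5,
                  harmonic_weights=(1.0, 0.5, 0.25)):
    """Synthesize a tone as a sum of partials at integer multiples of f0.

    f0 controls the pitch, amplitude controls the intensity, and the
    relative harmonic_weights of the partials shape the timbre.
    """
    t = np.arange(int(sr * duration)) / sr
    tone = sum(w * np.sin(2 * np.pi * f0 * (k + 1) * t)
               for k, w in enumerate(harmonic_weights))
    tone /= np.max(np.abs(tone))      # normalize, then scale by amplitude
    return amplitude * tone

# A 440 Hz tone (pitch A4) with partials at 440, 880, and 1320 Hz.
a4 = harmonic_tone(440.0, duration=0.5)
```

Changing `harmonic_weights` while keeping `f0` and `amplitude` fixed produces tones of the same pitch and intensity but different timbre, which is exactly the distinction drawn in the list above.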

Pitch and intensity perception are more complicated than the above definitions suggest. The human ear does not behave in a linear manner when it comes to pitch or intensity perception. However, the fundamental frequency and the energy of a sound may be used to estimate these two perceptually important qualities of sound.

Timbre, on the other hand, is a multidimensional sound quality that defies straightforward categorization. It encompasses the recognition of the sound source (differentiating a saxophone from a violin), of the playing technique (whether a string has been plucked or played with the bow), of the nuances of that technique (the velocity of the bow and its pressure on the string), and of the surrounding acoustics (whether the violinist played in a small room or in a concert hall). Given all these characteristics, it is not surprising that timbre is often defined by what it is not.

Many percussive musical instruments have no fundamental frequency, and their vibration is therefore referred to as noise. Noises are nonetheless perceived as lying in one of three registers: low, medium, or high. Intensity and timbre are also useful noise descriptors.

A chord is formed when two or more sounds are played together. Depending on the tone of the various sounds and, in particular, the distances between them, a chord may have various qualities. Many music genres rely on chords, especially pop, rock, and jazz, where polyphonic musical instruments—such as the piano, keyboard, and guitar—are often devoted to accompaniment and essentially play chords.

1.3 Music terminology

Apart from the fundamental principles discussed in the preceding section, there are numerous words currently used to characterise music that may be unfamiliar to those without a musical background. Music theory and experience also influenced some of the language used by the MIR group.

The musical concepts that are relevant for this overview are the following:

- The tempo is the speed at which a musical work is played, or expected to be played, by performers. The tempo is usually measured in beats per minute.
- The tonality of a song is related to the role played by the different chords of a musical work; tonality is defined by the name of the chord that plays a central role in a musical work. The concept of tonality may not be applicable to some music genres.
- The time signature, usually in the form of a fractional number, gives information on the organization of strong and soft beats along the time axis.
- The key signature, usually in the form of a number of alterations (the sharp ♯ and flat ♭ symbols), is an incomplete representation of the tonality; it is useful for performers because it expresses which notes have to be consistently played altered.

Figure 1.1 depicts four measures of a polyphonic musical score for piano, taken from Claude Debussy's Première Arabesque. The time signature (the C sign, denoting common time) indicates that measures must be divided into four equal beats, and the three sharps (the ♯ signs) indicate that, unless otherwise indicated, all occurrences of the notes F, C, and G must be raised by a semitone. The presence of three sharps could indicate that the excerpt's tonality is either A major or F♯ minor, both of which have the same number of sharps, with the former being the more probable tonality.

Other principles are more concerned with sound production and the criteria that define single notes or groups of notes. A sound, for example, begins to be perceived at its onset time, lasts for a certain amount of time, and stops being perceived at its offset time. Finally, sounds are created by musical instruments and the voice, both of which, due to their conformation, have a restricted range of pitches that they can produce; this range is referred to as the instrument or voice register.

[Figure not included in this excerpt]

Fig. 1.1 Example of a musical score (excerpt from Première Arabesque by Claude Debussy)

Chapter 2: Music Information Retrieval

2.1 What is MIR?

Music Information Retrieval (MIR) involves searching and organising large collections of music, or music information, according to their relevance to specific queries. This is particularly relevant given the vast quantities of musical information available in digital format, and the popularity of music-related digital services. In general, research in Music Information Retrieval (MIR) focuses on the extraction and inference of meaningful features from music (from the audio signal), indexing of music based on these features, and the creation of various search and retrieval schemes (for instance, content-based search, music recommendation systems, or user interfaces for browsing large music collections).

Furthermore, given its obvious commercial appeal, most media content owners and distributors (e.g. Philips, Sony, Apple) are actively involved in research in the field, while numerous libraries are seeking to incorporate some form of support for MIR in their on-line digital services. Simple MIR systems retrieve data according to a textual query introduced by the user, e.g. ‘David Bowie Heroes’. In those cases the text is compared with the text data that is associated with albums and tracks, making the system essentially no different from any text-based search engine (e.g. Google, Yahoo). However, given the characteristics of the content being retrieved, there is a need for systems that are able to accept “musical” queries such as scores, sung melodies (query by humming) or recorded audio segments (query by example). This proposal is concerned with the latter case.

The goal of querying by example is to retrieve pieces of music from a large collection of digital music content, by their similarity to an example audio document. The ability to perform queries by example is an important requirement for MIR systems. It poses numerous challenges including computational and complexity issues, the design of an appropriate testbed and the choice of an adequate representation for the audio in the query and music collection. The choice of audio representation dictates the similarities that can be identified by the system.
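As a rough sketch of similarity-ranked retrieval, assume each piece in the collection has already been reduced to a feature vector (the track names, vectors, and function names below are invented for illustration); query by example then amounts to ranking the collection against the query's vector under some similarity measure, here the cosine similarity:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_by_example(query_vec, collection):
    """Rank (name, feature_vector) pairs by decreasing similarity to the query."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in collection]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy collection of 3-dimensional feature vectors (entirely made up).
collection = [
    ("track_a", np.array([0.9, 0.1, 0.0])),
    ("track_b", np.array([0.1, 0.9, 0.2])),
    ("track_c", np.array([0.8, 0.2, 0.1])),
]
query = np.array([1.0, 0.0, 0.0])
ranking = rank_by_example(query, collection)
```

As the text notes, the choice of the audio representation behind these vectors dictates which similarities such a ranking can actually capture.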

As a consequence, MIR aims at making the world's vast store of music available to individuals. To this end, different representations of music-related subjects (e.g., songwriters, composers, performers, consumers) and items (music pieces, albums, video clips, etc.) are considered. A key problem in MIR is classification, which assigns labels to each song based on genre, mood, artist, etc. Music classification is an interesting topic with many potential applications. It provides important functionalities for music retrieval, because most end users may only be interested in certain types of music; a classification system would enable them to search for the music they are interested in. On the other hand, different music types have different properties, and we can manage them more effectively and efficiently once they are categorized into different groups.

2.2 Feature Extraction

The key components of a classification system are feature extraction and classifier learning. Feature extraction addresses the problem of how to represent the examples to be classified in terms of feature vectors or pairwise similarities. The purpose of classifier learning is to find a mapping from the feature space to the output labels so as to minimize the prediction error. We focus on music classification based on audio signals.

Existing approaches to choosing the representation can be broadly divided into those attempting to identify low-level (acoustic) similarity and those aiming to quantify high-level (e.g. note, melody, etc.) similarity. Low-level features can be further divided into two classes, timbre and temporal features, as shown in Fig. 2.1. Timbre features capture the tonal quality of sound that is related to different instrumentation, whereas temporal features capture the variation and evolution of timbre over time.

Low-level features are obtained directly from various signal processing techniques like the Fourier transform, spectral/cepstral analysis, autoregressive modelling, etc. Low-level features have been used predominantly in music classification, due to the simple procedures to obtain them and their good performance. However, they are not closely related to the intrinsic properties of music as perceived by human listeners. Mid-level features provide a closer relationship and include mainly three classes of features, namely rhythm, pitch, and harmony. These features are usually extracted on top of low-level ones. At the top level, semantic labels provide information on how humans understand and interpret music, like genre, mood, style, etc. This is an abstract level, as the labels cannot be readily obtained from lower-level features; the distance between mid-level features and labels is the semantic gap. The purpose of content-based music classification is to bridge the semantic gap by inferring the labels from low-/mid-level features.

From a different perspective, audio features can also be categorized into short-term features and long-term features, as illustrated by Fig. 2.1. Short-term features like timbre features usually capture the characteristics of the audio signal in frames with 10-100 ms duration, whereas long-term features like temporal and rhythm features capture the long-term effect and interaction of the signal and are normally extracted from local windows with longer durations. Hence, the main difference here is the length of the local windows used for feature extraction.
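The short-term framing described above can be sketched as follows (a minimal NumPy implementation; the function name and the frame and hop lengths are illustrative defaults, not prescribed values):

```python
import numpy as np

def frame_signal(x, sr, frame_ms=25, hop_ms=10):
    """Slice a signal into overlapping short-term analysis frames.

    A frame_ms in the 10-100 ms range yields the short-term frames described
    above; long-term features would aggregate over many such frames.
    """
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])

sr = 22050
x = np.random.default_rng(0).standard_normal(sr)   # 1 second of noise
frames = frame_signal(x, sr)                        # 25 ms frames, 10 ms hop
```

Each row of `frames` is then the input to a per-frame feature (timbre descriptors, short-time spectra, etc.), while long-term features would be computed over longer windows or over sequences of these rows.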

[Figure not included in this excerpt]

Fig.2.1 Audio feature

2.2.1 Low-level similarity

Systems based on low-level acoustic similarity measures are usually intended to recognise a given recording under noisy conditions and despite high levels of signal degradation. These audio fingerprints represent audio as a collection of low-level feature sets mapped to a more compact representation by a classification algorithm: Haitsma and Kalker use Fourier coefficients and quantization on logarithmically spaced sub-bands, Allamanche et al. use MPEG-7 low-level spectral features, while Batlle and Cano use Mel Frequency Cepstrum Coefficients (MFCC) followed by decoded Hidden Markov Models (HMMs) to produce the required labelling. These features can be extracted from the signal on a frame-by-frame basis and require little (if any) use of musical knowledge and theory. This technology has shown great success in detecting an exact one-to-one correspondence between the audio query and a recording in the database, even when the query has been distorted by compression and background noise. It has seen commercial application in music recognition for end-users and in radio broadcast monitoring. However, acoustic similarity measures disregard any correlation to the musical characteristics of the sound. As a result, two different recordings of the same song (even sharing performer and instrumentation) may not necessarily be near matches in a similarity-ranked list. Low-level similarity measures perform poorly in the retrieval of musically relevant near matches.
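A much-simplified sketch in the spirit of such fingerprinting schemes (this is not the actual Haitsma-Kalker algorithm; the band count, frequency limits, and function names are arbitrary) computes per-frame energies in logarithmically spaced sub-bands and quantizes energy differences across bands and frames into bits:

```python
import numpy as np

def band_energies(frame, sr, n_bands=8, fmin=300.0, fmax=3000.0):
    """Energy of one frame in logarithmically spaced frequency sub-bands."""
    spec = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1 / sr)
    edges = np.geomspace(fmin, fmax, n_bands + 1)
    return np.array([spec[(freqs >= lo) & (freqs < hi)].sum()
                     for lo, hi in zip(edges[:-1], edges[1:])])

def fingerprint_bits(prev_frame, frame, sr):
    """One sub-fingerprint: the sign of the energy difference between
    adjacent bands and consecutive frames (a toy version of the scheme
    cited above)."""
    e_prev, e_cur = band_energies(prev_frame, sr), band_energies(frame, sr)
    diff = (e_cur[1:] - e_cur[:-1]) - (e_prev[1:] - e_prev[:-1])
    return (diff > 0).astype(int)

sr = 8000
t = np.arange(2048) / sr
tone = np.sin(2 * np.pi * 1000 * t)
bits = fingerprint_bits(tone[:1024], tone[1024:], sr)
```

Matching then reduces to comparing such bit strings (e.g. by Hamming distance), which is what makes fingerprints robust to compression and background noise while remaining blind to musical near matches.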

2.2.2 Top-level similarity

As an alternative to low-level similarity for music retrieval, we could attempt to find high-level representations from audio that emphasise the musical similarities between recordings.

Automatic transcription and harmonic modelling algorithms are constrained by the type of instrumentation and music that can be analysed. This is largely due to the use of the rules of formal musical notation, which are not necessarily related to the sonic contents of recorded music. For musically relevant retrieval, the use of an unnecessarily high level of representation introduces errors in the symbolic data and reduces the scope of plausible musical queries. This is even more important for music where the performance is a better representation of the musical work than the score (e.g. pop music as opposed to classical music). Therefore, in order to successfully identify musical similarities, we need an alternative representation to the non-musically-relevant low-level features and the error-prone and constrained high-level musical notation. We propose mid-level representations as this alternative.

2.2.3 Mid-level similarity

Mid-level representations of music are measures obtained by the process of transforming the audio signal into a highly sub-sampled function that characterizes the attributes of musical constructs in the original signal. This is the key process in a wide class of musical signal processing algorithms (e.g. onset and pitch detection, tempo and chord estimation). These processes are designed taking into account musical knowledge and a deep understanding of human perception.

Mid-level representations attain higher levels of semantic complexity than low-level features (e.g. they successfully characterise the rhythmic structure of a piece), but without being bounded by the constraints imposed by the rules of music notation. We divide mid-level representations into two categories: event-based mid-level representations, concerned with characterising attributes of individual musical events such as note onsets and pitch; and segment-based mid-level representations, concerned with characterising longer musical sections such as melody, harmony, chorus, etc. Furthermore, we propose that the former may be seen as a grammar which can be used to generate the latter.
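An event-based mid-level representation can be illustrated with a deliberately naive energy-based onset picker (real onset-detection functions are considerably more refined; the threshold, frame length, and function name here are arbitrary). It flags frames whose energy jumps sharply relative to the previous frame and reports their times:

```python
import numpy as np

def onset_candidates(x, sr, frame_ms=20, threshold=4.0):
    """Flag frames whose energy exceeds `threshold` times the previous
    frame's energy; return candidate onset times in seconds."""
    frame_len = int(sr * frame_ms / 1000)
    n = len(x) // frame_len
    energy = np.array([np.sum(x[i * frame_len:(i + 1) * frame_len] ** 2)
                       for i in range(n)])
    return [i * frame_len / sr                 # frame start time in seconds
            for i in range(1, n)
            if energy[i] > threshold * (energy[i - 1] + 1e-12)]

sr = 8000
silence = np.zeros(sr // 2)                                  # 0.5 s of silence
burst = 0.8 * np.sin(2 * np.pi * 440 * np.arange(sr // 2) / sr)
onsets = onset_candidates(np.concatenate([silence, burst]), sr)
```

The highly sub-sampled `onsets` list (a handful of times instead of thousands of samples) is exactly the kind of function over musical events that the text describes.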

2.2.4 Process of Feature Extraction

Time- and frequency-domain representation techniques for the automatic description of music recordings are based on the computation of time and frequency representations of audio signals. We summarize here the main concepts and procedures to obtain such representations. The frequency of a simple sinusoid is defined as the number of times that a cycle is repeated per second, and it is usually measured in cycles per second, or Hertz (Hz). As an example, a sinusoidal wave with a frequency f = 440 Hz performs 440 cycles per second. The inverse of the frequency f is called the period T (T = 1/f), which is measured in seconds and indicates the temporal duration of one oscillation of the sinusoidal signal. In the time domain, analog signals x(t) are sampled every Ts seconds to obtain digital signal representations x[n] = x(nTs), where n = 0, 1, 2, ... and fs = 1/Ts is the sampling rate in samples per second (Hz). According to the Nyquist-Shannon sampling theorem, a given audio signal should be sampled at least at twice its maximum frequency to avoid so-called aliasing, i.e. the introduction of artifacts during the sampling process. Time-domain representations, illustrated in Fig. 2.2, are suitable to extract descriptors related to the temporal evolution of the waveform x[n], such as the location of major changes in signal properties.
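These definitions can be made concrete with a few lines of Python (NumPy assumed; the numeric values are just examples):

```python
import numpy as np

f = 440.0            # frequency in Hz (cycles per second)
T = 1.0 / f          # period in seconds, T = 1/f
sr = 22050           # sampling rate fs in samples per second
Ts = 1.0 / sr        # sampling interval in seconds, Ts = 1/fs

# Sample x(t) = sin(2*pi*f*t) at t = n*Ts, i.e. x[n] = x(n*Ts).
n = np.arange(int(0.01 * sr))          # 10 ms worth of sample indices
x = np.sin(2 * np.pi * f * n * Ts)

# Nyquist-Shannon: fs must exceed twice the maximum signal frequency.
nyquist_ok = sr > 2 * f
```

Here `x` is a time-domain representation of the kind shown in Fig. 2.2; at fs = 22050 Hz a 440 Hz sinusoid is sampled roughly 50 times per cycle, comfortably above the Nyquist limit.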

The frequency spectrum of a time-domain signal is a representation of that signal in the frequency domain. It can be generated via the Fourier Transform (FT) of the signal, and the resulting values are usually presented as amplitude and phase, both plotted versus frequency, as illustrated in Fig. 2.3.

For sampled signals x[n] we use the Discrete version of the Fourier Transform (DFT). Spectrum analysis is usually carried out in short segments of the sound signal (called frames), in order to capture the variations in frequency content along time (Short Time Fourier Transform-STFT). This is mathematically expressed by multiplying the discrete signal x[n] by a window function w[n], which typically has a bell-shaped form and is zero-valued outside of the considered interval as illustrated in Fig. 2.4.
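A minimal STFT along these lines, assuming NumPy and using a Hann window as the bell-shaped window function w[n] (the frame and hop lengths are illustrative defaults), might look like:

```python
import numpy as np

def stft(x, frame_len=1024, hop=512):
    """Minimal short-time Fourier transform: multiply each frame by a Hann
    (bell-shaped) window, then take the DFT of each windowed frame."""
    window = np.hanning(frame_len)                  # w[n], zero at the edges
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([np.fft.rfft(window * x[i * hop:i * hop + frame_len])
                     for i in range(n_frames)])

sr = 8000
t = np.arange(sr) / sr                      # 1 second of samples
x = np.sin(2 * np.pi * 1000 * t)            # 1 kHz sinusoid
X = stft(x)                                 # shape: (frames, frame_len//2 + 1)
```

Each row of `X` is the spectrum of one short frame; stacking the magnitudes of these rows over time is precisely what a spectrogram (Section 3.2.4) displays.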

Various numerical values derived from the audio signal are used to characterise it. These are referred to as signal features. Feature extraction is a crucial step in the audio analysis process and, more generally, an essential processing step in machine learning and classification tasks. The aim is to extract a set of features from the dataset of interest that is more informative with respect to the desired properties of the original data, i.e. the audio signal. Feature extraction can also be viewed as a data-rate reduction procedure, because analysis algorithms are based on a relatively small number of features. The original data, i.e. the audio signal, is voluminous and, as such, hard to process directly in any analysis task. It therefore needs to be transformed from its initial representation into a more suitable one, by extracting audio features that represent the properties of the original signals while reducing the volume of data. In order to achieve this goal, it is important to have a good knowledge of the application domain, so that we can decide on the best features.
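As a sketch of feature extraction as data-rate reduction, the toy feature vector below (mean, RMS energy, and zero-crossing rate; this particular choice of features is purely illustrative) collapses thousands of samples into three numbers:

```python
import numpy as np

def feature_vector(x):
    """Reduce a whole signal to a tiny descriptive feature vector:
    [mean, RMS energy, zero-crossing rate]. Which features to pick is
    the domain-knowledge decision discussed above."""
    rms = np.sqrt(np.mean(x ** 2))
    zcr = np.mean(np.abs(np.diff(np.sign(x))) > 0)   # fraction of sign changes
    return np.array([np.mean(x), rms, zcr])

sr = 8000
x = np.sin(2 * np.pi * 100 * np.arange(sr) / sr)   # 1 s of signal: 8000 samples
fv = feature_vector(x)                              # 8000 numbers -> 3 numbers
```

Downstream analysis then operates on `fv` rather than on the raw waveform, which is what makes classification over large collections tractable.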

[Figure not included in this excerpt]

Fig. 2.2 Time-series plot of an audio file using scipy in Python


M. Sai Chaitanya and Soubhik Chakraborty, 2021, Musical information retrieval. Signal Analysis and Feature Extraction using Python, Munich, GRIN Verlag.