MICHAEL BLAß AND ROLF BADER

CONTENT-BASED MUSIC RETRIEVAL SYSTEM FOR ETHNOMUSICOLOGICAL SOUND ARCHIVES

Preservation

Exploration

All photos by Christian Koehn, [email protected]

1. EXAMPLE APPROACHES TO ETHNOGRAPHIC SOUND ARCHIVES

1. The classical approach:
• Archiv für die Musik Afrikas (AMA) in Mainz, Germany
  • catalogue based
  • about 10,000 sound carriers of modern African music
  • digitization in progress
  • + exploration
  • - not feasible for big archives

We argue that, even though digital musicology already claims to bridge the gap between “fieldwork ethnomusicologists” and “big data experts”, a combination of the best of both worlds may be a profitable way to go. You cannot explore a music archive without using your ears.

2. The digital musicology approach:
• Digital Music Lab (DML), London, U.K.
  • purely feature based
  • analyze and compare songs or even whole archives on a variety of sound features
  • read only, no sound

➡ We need the best of both worlds

1. Catalogue / text based search and retrieval engine
• online access
• access to audio

2. Content-based, automatic archive organization
• sound features
• music similarity

2. ETHNOGRAPHIC SOUND RECORDINGS ARCHIVE (ESRA), UNIVERSITY OF HAMBURG

• The Wilhelm Heinz Collection of African Music
  • 350 pieces
  • mostly gramophone records
  • 1916 – 1948
  • 38 regions
  • 49 ethnic groups

Yildiz Quantasi Thessaloniki, ca. 1920

• The Cairo Congress of Arab Music
  • 103 pieces
  • March–April 1932
  • Algeria, Tunisia, Iraq, Turkey, Syria

Title: ya Muqabil, Performer: al-Gawq al-Gaza'ir, 1932

• Rolf Bader Field Recordings
  • 118 pieces
  • 2010 until now
  • Myanmar, Sri Lanka, Cambodia

3. CHALLENGES IN ETHNOGRAPHIC SOUND ARCHIVES

• Metadata
• Plenty of classes
• Unbalanced data
• Heterogeneous data
• Noise

4. COMSAR: COMPUTATIONAL MUSIC AND SOUND ARCHIVING PROJECT

• Provide rich metadata structure
• Visualization by music similarity

• Extract low-level timbral features
• Aggregate to a song-level rhythm feature
• Visualization of mutual similarity

COMSAR RHYTHM FEATURE

• Identical in inter-onset intervals (IOI)
• Different in timbre
• Can be generalized to polyphonic timbre

(Blaß, 2013a; Blaß, 2013b)

TIME DOMAIN SIGNAL AND SPECTROGRAM

Section: 0 Hz – 10 kHz

STFT parameters: n_fft = 1024, hop = 512, Hamming window
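
A minimal sketch of a magnitude spectrogram with the STFT parameters listed above, restricted to the 0 Hz – 10 kHz section. The sample rate of 44.1 kHz, the 1 kHz test tone and the function name `stft_magnitude` are assumptions for illustration only; they are not taken from the slides.

```python
import numpy as np

def stft_magnitude(signal, n_fft=1024, hop=512):
    """Magnitude spectrogram |X(n, k)| with a Hamming window.

    Returns an array of shape (frames, n_fft // 2 + 1).
    """
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

# Assumed test input: one second of a 1 kHz sine at 44.1 kHz.
sr = 44100
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000.0 * t)

spec = stft_magnitude(tone)                 # n_fft = 1024, hop = 512
freqs = np.fft.rfftfreq(1024, d=1.0 / sr)   # centre frequency of each bin
band = freqs <= 10_000.0                    # keep the 0 Hz - 10 kHz section
spec_band = spec[:, band]
```

Each row of `spec_band` is one STFT window and each column one frequency bin, which matches the indices n and k used in the feature definitions.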

TIMBRE FEATURE: SPECTRAL CENTROID OR MFCC?

• Mel frequency cepstral coefficients (MFCC)
  • Primary feature for timbre related computations
• Spectral Centroid
  • Multidimensional scaling (MDS) of dissimilarity ratings, for example, Grey (1977); Wessel (1979); Iverson & Krumhansl (1993)
  • MDS of physical signal parameters, for example, Lakatos (2000)
  • Adjective rating studies, for example, von Bismarck (1977)

FEATURE EXTRACTION: TIMBRE SPACE FROM GREY (1977)

FEATURE EXTRACTION: SPECTRAL CENTROID AS MODEL FOR TIMBRE

• Previous studies confirmed Spectral Centroid as the main perceptual dimension.
• Spectral Centroid correlates well with perception of brightness (Schubert et al., 2004).
• Mel-frequency cepstral coefficients do not seem to correlate with any perceptual dimension (Alluri & Toiviainen, 2009; Siedenburg et al., 2017).


SYSTEM OVERVIEW

Source → Preprocessing → Feature extraction → Clustering / Visualization

ONSET DETECTION BASED ON SPECTRAL FLUX

$$SF(n) = \frac{\sum_{k=0}^{K-1} H\big(|X(n,k)| - |X(n-1,k)|\big)}{\sum_{k=0}^{K-1} |X(n,k)|}$$

$$H(x) = \frac{x + |x|}{2}$$

X = spectrum, n = STFT window number, k = frequency bin number
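
The spectral flux definition above can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the authors' implementation: the peak picker `pick_onsets` and its fixed threshold are hypothetical, since the slides do not say how onsets are selected from SF(n), and SF(0) is set to 0 because no previous window exists.

```python
import numpy as np

def half_wave_rectify(x):
    """H(x) = (x + |x|) / 2: keeps increases, zeroes decreases."""
    return (x + np.abs(x)) / 2.0

def spectral_flux(mag):
    """Normalized spectral flux SF(n) from |X(n, k)| of shape (frames, bins)."""
    rectified = half_wave_rectify(mag[1:] - mag[:-1])
    norm = mag[1:].sum(axis=1)
    sf = np.zeros(mag.shape[0])           # SF(0) = 0 by convention (assumption)
    sf[1:] = rectified.sum(axis=1) / np.maximum(norm, 1e-12)
    return sf

def pick_onsets(sf, threshold=0.5):
    """Hypothetical peak picker: local maxima of SF above a fixed threshold."""
    return [
        n for n in range(1, len(sf) - 1)
        if sf[n] > threshold and sf[n] >= sf[n - 1] and sf[n] > sf[n + 1]
    ]

# Toy magnitude spectrogram: a quiet floor, then a burst starting at window 3.
mag = np.full((6, 4), 0.01)
mag[3:] = 1.0

sf = spectral_flux(mag)
onsets = pick_onsets(sf)
```

In the toy input the energy rises only between windows 2 and 3, so the flux is non-zero only at n = 3.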

TIMBRE FEATURE: SPECTRAL CENTROID

$$SC(n) = \sum_{k=0}^{K-1} f(k)\, p(n,k)$$

$$p(n,k) = \frac{X(n,k)}{\sum_{k=0}^{K-1} X(n,k)}$$

X = spectrum, n = STFT window number, k = frequency bin number
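
A small worked sketch of the spectral centroid as defined above. It assumes that X(n, k) is a non-negative magnitude spectrum and that f(k) is the centre frequency of bin k in Hz; the function name and the toy values are illustrative only.

```python
import numpy as np

def spectral_centroid(mag, sample_rate, n_fft):
    """SC(n) = sum_k f(k) * p(n, k), with p(n, k) = X(n, k) / sum_k X(n, k).

    mag: non-negative spectrum of shape (frames, n_fft // 2 + 1).
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)   # f(k), assumed in Hz
    total = mag.sum(axis=1, keepdims=True)
    p = mag / np.maximum(total, 1e-12)                     # guard against silence
    return (p * freqs).sum(axis=1)

# Toy example: n_fft = 8 at 8 kHz gives bins at 0, 1000, 2000, 3000, 4000 Hz.
mag = np.array([
    [0.0, 1.0, 0.0, 0.0, 0.0],   # all energy at 1000 Hz
    [0.0, 1.0, 0.0, 1.0, 0.0],   # equal energy at 1000 Hz and 3000 Hz
])
centroids = spectral_centroid(mag, sample_rate=8000, n_fft=8)
# centroids -> [1000.0, 2000.0]
```

The second window shows the "centre of mass" reading of the feature: equal energy at 1000 Hz and 3000 Hz yields a centroid of 2000 Hz.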

CONCLUSION

• Ethnographic sound archives have a strong potential that goes far beyond mere conservation.
• Currently there is a lot of effort to utilize this potential.
• Classical methods support the exploration approach but are not adequate for big archives.
• Big data methods are abstract.

We propose a data-driven (content-based) system that can support ethnomusicological research by …
• ordering existing music archives
• finding new hypotheses

The system consists of
1. Hidden Markov Model (HMM)-based feature extraction, which models rhythm in terms of occurring polyphonic timbres.
2. Self-Organizing Map (SOM) projection, which orders the HMMs into rhythmically similar clusters.
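
To illustrate the second step, here is a minimal SOM sketch in NumPy. It is not the COMSAR implementation. The slides do not say how an HMM is represented as SOM input, so the sketch assumes each song is summarized by a fixed-length vector (for example flattened HMM parameters) and uses random toy vectors instead. Grid size, learning rate and neighbourhood schedule are arbitrary choices.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Grid coordinates (row, col) of the unit whose weight is closest to x."""
    dist = ((weights - x) ** 2).sum(axis=-1)
    return tuple(int(i) for i in np.unravel_index(np.argmin(dist), dist.shape))

def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols SOM with linearly decaying rate and neighbourhood."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 0.5
            bmu = np.array(best_matching_unit(weights, x))
            grid_dist2 = ((grid - bmu) ** 2).sum(axis=-1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))   # neighbourhood kernel
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

# Toy stand-in for per-song feature vectors: two well separated groups.
rng = np.random.default_rng(1)
cluster_a = rng.uniform(0.0, 0.1, size=(20, 3))
cluster_b = rng.uniform(0.9, 1.0, size=(20, 3))
data = np.vstack([cluster_a, cluster_b])

weights = train_som(data)
bmu_a = best_matching_unit(weights, cluster_a[0])
bmu_b = best_matching_unit(weights, cluster_b[0])
```

Songs from the two toy groups land on different map units, which is the property the projection relies on: similar inputs end up close together on the grid, dissimilar ones apart.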

• Evaluation is very difficult:
  • Archives would have to be annotated in order to measure onset detection performance.
  • There is no target data about rhythm.
• SOM is a tool for explorative data analysis.
• Evaluation by expert users.

VISIT AND TRY ESRA

IT’S FREE

http://esra.fbkultur.uni-hamburg.de

THANK YOU for your attention.

“Content-based music retrieval system for ethnomusicological sound archives”

We appreciate your feedback, especially regarding ESRA.

Michael Blaß, M.A.
[email protected]
http://www.uni-hamburg.de/ifsm

Prof. Dr. Rolf Bader
[email protected]
http://www.uni-hamburg.de/ifsm

Institute for Systematic Musicology
University of Hamburg
Neue Rabenstraße 13
20354 Hamburg
Phone: +49 40 42838-5786
