SATIN: A Persistent Musical Database for Music Information Retrieval

15th International Workshop on Content-Based Multimedia Indexing (CBMI), June 19-21, 2017, Florence, Italy

Yann Bayle, 2nd-year PhD student ([email protected])
Pierre Hanna, assistant professor ([email protected])
Matthias Robine, assistant professor ([email protected])

Introduction
Context
− Reproduce research in Music Information Retrieval (MIR)
− A bigger musical database
− Label tracks in a musical database
− Reduce computation time
− Manage copyright restrictions

Targeted applications
− Classification tasks in MIR, specifically Instrumentals and Songs
− Playlist creation
− Music recommendation

Description of SATIN
Set of Audio Tags and Identifiers Normalized
− 407,911 tracks from 194 countries
− Referenced by: ISRC (International Standard Recording Code), MusicBrainz id, Musixmatch id, SoundCloud id, Spotify id
− Ground truths for: 106,466 Songs, 13,944 Instrumentals, 295 Genres, 194,290 tracks with at least one genre, 29,755 tracks with at least two genres
− Python API: figures, lyrics, metadata (artist, track, …)
− Reproducible research code, experiments, database and ground truths: https://github.com/ybayle/research

Description of SOFT1
First Set Of FeaTures extracted using SATIN
Software
− Essentia [1]
− Marsyas [2]
− YAAFE [3]

89% Songs, 11% Instrumentals (see Fig. 2)

Features
− Essentia: more than 700 features at track scale
− Marsyas: features at track scale, computed on frames of 23.2 ms [4]: zero crossing rate, spectral centroid, roll-off and flux, 13 MFCC
− YAAFE: features at frame scale of 93.0 ms [4]: 13 MFCC
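The track-scale versus frame-scale distinction above can be made concrete. The quoted durations match power-of-two frame lengths at common sample rates, 1024 samples at 44.1 kHz ≈ 23.2 ms and 2048 samples at 22.05 kHz ≈ 92.9 ms, but those sample rates are assumptions, since the poster does not state them. A minimal framing sketch in NumPy:

```python
import numpy as np

def frame_signal(x, frame_size, hop_size):
    """Split a mono signal into overlapping frames (no zero-padding)."""
    n_frames = 1 + (len(x) - frame_size) // hop_size
    return np.stack([x[i * hop_size : i * hop_size + frame_size]
                     for i in range(n_frames)])

# One second of audio at an assumed 44.1 kHz, framed the way
# frame-scale extractors such as YAAFE operate.
x = np.zeros(44100)
frames = frame_signal(x, frame_size=1024, hop_size=512)
print(frames.shape)          # (85, 1024)
print(1024 / 44100 * 1000)   # ≈ 23.2 ms per frame
```

A track-scale feature is then typically an aggregate (e.g. the mean) of such per-frame values over the whole track.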
Fig. 1. Geographical breakdown of the registration country according to the ISRC in SATIN.
Fig. 2. Songs and Instrumentals proportion in SATIN.
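Figs. 1 and 3 are obtained by reading fields straight out of the ISRC. A minimal parsing sketch, assuming the standard 12-character layout (2-letter country code, 3-character registrant, 2-digit year of reference, 5-digit designation); `parse_isrc` and the `ISRC` tuple are hypothetical helpers, not part of the SATIN API, and the code below uses a commonly cited example ISRC:

```python
from typing import NamedTuple

class ISRC(NamedTuple):
    country: str      # 2-letter registration country code (Fig. 1)
    registrant: str   # 3-character registrant code
    year: str         # 2-digit year of reference (Fig. 3)
    designation: str  # 5-digit designation code

def parse_isrc(code: str) -> ISRC:
    """Split a 12-character ISRC into its four standard fields."""
    code = code.replace("-", "").upper()
    if len(code) != 12:
        raise ValueError(f"ISRC must have 12 characters, got {code!r}")
    return ISRC(code[:2], code[2:5], code[5:7], code[7:12])

print(parse_isrc("USRC17607839"))
# ISRC(country='US', registrant='RC1', year='76', designation='07839')
```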
Use-case experiment
Classification of Instrumentals and Songs

Parameters
− 10-fold cross-validation
− 13 MFCC per track
− five nearest neighbours
− scikit-learn Python package [5]

Results
− Precision: 0.889 ± 0.015
− Recall: 0.906 ± 0.009
− F-score: 0.891 ± 0.014

Fig. 3. Number of tracks by registration year according to the ISRC in SATIN.
Fig. 4. Word cloud for the genres in SATIN.
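The classifier itself is a few lines of scikit-learn. A sketch of the setup (10-fold cross-validation, five nearest neighbours); the features below are random stand-ins for SOFT1's 13 MFCC per track, so the scores will not reproduce the poster's numbers:

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Stand-in data: in the real experiment each row holds the 13 MFCC
# of one track from SOFT1; random values keep the sketch runnable.
X = rng.normal(size=(500, 13))
y = rng.integers(0, 2, size=500)   # 0 = Instrumental, 1 = Song

clf = KNeighborsClassifier(n_neighbors=5)   # five nearest neighbours
scores = cross_validate(clf, X, y, cv=10,
                        scoring=("precision", "recall", "f1", "accuracy"))
for m in ("precision", "recall", "f1", "accuracy"):
    vals = scores["test_" + m]
    print(f"{m}: {vals.mean():.3f} ± {vals.std():.3f}")
```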
− Accuracy: 0.906 ± 0.009

(Diagram: metadata, ISRC and tracks exchanged with industrial partners.)

Conclusion
SATIN
− A big database for the classification of Instrumentals and Songs
− Large variety of tracks
− Reproducible experiments

SOFT1
− Large variety of features
− Reproducible feature extraction
− Reusable features

Perspectives
− Gather more ground truths
− Find new sources for ground truths
− Compare the results of MIR algorithms
− Develop industrial partnerships
References
[1] Dmitry Bogdanov, Nicolas Wack, Emilia Gómez, Sankalp Gulati, Perfecto Herrera, Oscar Mayor, Gerard Roma, Justin Salamon, José R. Zapata, and Xavier Serra. 2013. Essentia: an audio analysis library for music information retrieval. In Proc. 14th Int. Soc. Music Inform. Retrieval Conf. Curitiba, Brazil, 493–498.
[2] George Tzanetakis and Perry Cook. 2000. Marsyas: a framework for audio analysis. Organised Sound 4, 3 (Nov. 2000), 169–175.
[3] Benoît Mathieu, Slim Essid, Thomas Fillon, Jacques Prado, and Gaël Richard. 2010. YAAFE, an easy to use and efficient audio feature extraction software. In Proc. 11th Int. Soc. Music Inform. Retrieval Conf. Utrecht, Netherlands, 441–446.
[4] Fabien Gouyon, Bob L. Sturm, João L. Oliveira, Nuno Hespanhol, and Thibault Langlois. 2014. On evaluation validity in music autotagging. arXiv (Sep. 2014). arXiv:1410.0001
[5] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and Édouard Duchesnay. 2011. Scikit-learn: machine learning in Python. J. Mach. Learning Res. 12 (Nov. 2011), 2825–2830.