SATIN

SATIN: A Persistent Musical Database for Music Information Retrieval
15th International Workshop on Content-Based Multimedia Indexing (CBMI), June 19-21, 2017, Florence, Italy

Yann Bayle, 2nd-year PhD student, [email protected]
Pierre Hanna, Assistant Professor, [email protected]
Matthias Robine, Assistant Professor, [email protected]

Introduction

Context
− Reproduce research in Music Information Retrieval (MIR)
− Bigger musical database
− Label tracks in a musical database
− Reduce computation time
− Manage copyright restrictions

Targeted applications
− Classification tasks in MIR, specifically Instrumentals vs. Songs
− Playlist creation
− Music recommendation

Description of SATIN

SATIN: Set of Audio Tags and Identifiers Normalized
407,911 tracks from 194 countries

Referenced by:
− ISRC (International Standard Recording Code; see the parsing sketch below)
− MusicBrainz id
− Musixmatch id
− SoundCloud id
− Spotify id

Ground truths for:
− 106,466 Songs
− 13,944 Instrumentals
− 295 Genres
− 194,290 tracks with at least one genre
− 29,755 tracks with at least two genres

Python API:
− Figures
− Lyrics
− Metadata (artist, track, …)

Reproducible research code, experiment, database and ground truths:
https://github.com/ybayle/research
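The registration country in Fig. 1 and the registration year in Fig. 3 are both derived from the ISRC itself. As a minimal sketch of how these fields can be recovered (independent of the SATIN Python API, whose exact calls are not listed on this poster), the snippet below parses the standard ISO 3901 layout; the century heuristic is an assumption.

import re

# ISRC layout (ISO 3901): CC-XXX-YY-NNNNN
#   CC    two-letter country code      -> registration country (Fig. 1)
#   XXX   three-character registrant code
#   YY    two-digit year of reference  -> registration year (Fig. 3)
#   NNNNN five-digit designation code
ISRC_PATTERN = re.compile(r"^([A-Z]{2})([A-Z0-9]{3})(\d{2})(\d{5})$")

def parse_isrc(isrc):
    """Split an ISRC into (country, registrant, year, designation)."""
    match = ISRC_PATTERN.match(isrc.replace("-", "").upper())
    if match is None:
        raise ValueError("not a valid ISRC: %r" % isrc)
    country, registrant, year, designation = match.groups()
    # Assumption: two-digit years below 40 mean 20xx, the rest 19xx.
    full_year = 2000 + int(year) if int(year) < 40 else 1900 + int(year)
    return country, registrant, full_year, designation

print(parse_isrc("US-RC1-76-07839"))  # -> ('US', 'RC1', 1976, '07839')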

Description of SOFT1
SOFT1: First Set Of FeaTures extracted using SATIN

Feature extraction software:
− Essentia [1]
− Marsyas [2]
− YAAFE [3]

Features
− Essentia: more than 700 features at track scale
− Marsyas: features at track scale, computed on 23.2 ms frames [4]:
  − zero crossing rate
  − spectral centroid
  − roll-off and flux
  − 13 MFCC
− YAAFE: features at frame scale of 93.0 ms [4] (see the sketch below):
  − 13 MFCC
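The poster only gives the frame sizes above, not the full extractor configurations, so the snippet below is an illustrative sketch using librosa as a stand-in (an assumption, not one of the three tools listed): it computes 13 MFCCs on roughly 93 ms frames and averages them into one track-scale vector.

import librosa

# Illustrative stand-in for the MFCC part of SOFT1 (assumption: librosa
# instead of Marsyas/YAAFE, whose exact settings are not on the poster).
def track_mfcc(path, sr=22050, n_mfcc=13):
    """Return frame-scale and track-scale 13-MFCC descriptors."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    # 2048 samples at 22050 Hz ~= 92.9 ms, close to YAAFE's 93.0 ms frames.
    frames = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                  n_fft=2048, hop_length=2048)
    track = frames.mean(axis=1)  # one 13-dimensional vector per track
    return frames, track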

Fig. 1. Geographical breakdown of the registration country according to the ISRC in SATIN.

Fig. 2. Songs (89%) and Instrumentals (11%) proportion in SATIN.

Fig. 3. Number of tracks by registration year according to the ISRC in SATIN.

Fig. 4. Word cloud for the genres in SATIN.

Use-case experiment
Classification of Instrumentals and Songs

Parameters (sketched in the code below):
− 10-fold cross-validation
− 13 MFCC per track
− five nearest neighbours
− scikit-learn Python package [5]

Results:
− Precision: 0.889 ± 0.015
− Recall: 0.906 ± 0.009
− F-score: 0.891 ± 0.014
− Accuracy: 0.906 ± 0.009
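A minimal sketch of this use case with the parameters above, using scikit-learn [5]. The feature matrix is a random placeholder standing in for the 13 track-scale MFCCs from SOFT1, so the printed scores will not match the poster's figures.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_validate

# Placeholder data: one 13-MFCC vector per track, label 1 = Song, 0 = Instrumental
# (random values here; the real vectors would come from SOFT1).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 13))
y = rng.integers(0, 2, size=1000)

clf = KNeighborsClassifier(n_neighbors=5)          # five nearest neighbours
scores = cross_validate(clf, X, y, cv=10,          # 10-fold cross-validation
                        scoring=("precision", "recall", "f1", "accuracy"))
for name in ("test_precision", "test_recall", "test_f1", "test_accuracy"):
    print(name, round(scores[name].mean(), 3), "+/-", round(scores[name].std(), 3))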

Conclusion

SATIN
− Big database for the classification of Instrumentals and Songs
− Large variety of tracks
− Reproducible experiments

SOFT1
− Large variety of features
− Reproducible feature extraction
− Reusable features

Perspectives
− Gather more ground truths
− Find new sources for ground truths
− Compare the results of MIR algorithms
− Develop industrial partnerships

References
[1] Dmitry Bogdanov, Nicolas Wack, Emilia Gómez, Sankalp Gulati, Perfecto Herrera, Oscar Mayor, Gerard Roma, Justin Salamon, José R. Zapata, and Xavier Serra. 2013. Essentia: an audio analysis library for music information retrieval. In Proc. 14th Int. Soc. Music Inform. Retrieval Conf. Curitiba, Brazil, 493–498.
[2] George Tzanetakis and Perry Cook. 2000. Marsyas: a framework for audio analysis. Organised Sound 4, 3 (Nov. 2000), 169–175.
[3] Benoît Mathieu, Slim Essid, Thomas Fillon, Jacques Prado, and Gaël Richard. 2010. YAAFE, an easy to use and efficient audio feature extraction software. In Proc. 11th Int. Soc. Music Inform. Retrieval Conf. Utrecht, Netherlands, 441–446.
[4] Fabien Gouyon, Bob L. Sturm, Joao L. Oliveira, Nuno Hespanhol, and Thibault Langlois. 2014. On evaluation validity in music autotagging. arXiv (Sep. 2014). arXiv:1410.0001.
[5] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and Édouard Duchesnay. 2011. Scikit-learn: machine learning in Python. J. Mach. Learning Res. 12 (Nov. 2011), 2825–2830.