Neural Information Processing Scaled for Bioacoustics

Neural Information Processing Scaled for Bioacoustics -from Neurons to Big Data-

Proceedings of NIPS4B, international workshop joint to NIPS, USA, 2013
Glotin H., LeCun Y., Artières T., Mallat S., Tchernichovski O., Halkias X.
Toulon, New-York, Paris



ISBN 979-10-90821-04-0 http://sabiod.univ-tln.fr

We acknowledge CNRS MASTODONS MI for its support of the SABIOD.ORG project, which co-organized this workshop. We thank BIOTOPE, Dr. Yves Bas, and O. Dufour, who co-organized the Bird challenge of this workshop. We thank ADEME, which supports O. Dufour's grant. These proceedings were compiled with the help of R. Balestriero, Toulon University. We thank O. Dufour for editing the flyers.
Toulon University - LSIS UMR CNRS - DYNI SABIOD, Dec. 2013

Cover legend: Left image: spectrogram of the challenge 2 humpback whale song [from V. Lostanlen, ENSDI, Paris]. Right image: Minke whale Fourier time-frequency representation on a 5-minute scale (left) versus a two-year scale (right), showing seasonal effects and a global frequency shift [from L. Kindermann 2013, section 9.2 of this book].

This book is available online at http://sabiod.org/NIPS4B2013_book.pdf

Please cite its content with: "Proc. of Neural Information Processing Scaled for Bioacoustics: from Neurons to Big Data, 2013, Glotin H., LeCun Y., Artieres T., Mallat S., Tchernichovski O., Halkias X., joint to NIPS Conf., http://sabiod.org/nips4b, ISBN 979-10-90821-04-0"

Bibtex =
@proceedings{procnips4B2013,
  title={Proc. Neural Information Processing Scaled for Bioacoustics, from Neurons to Big Data},
  year={2013},
  author={Glotin H., LeCun Y., Arti\`eres T., Mallat S., Tchernichovski O., Halkias X.},
  organization={NIPS Int. Conf.},
  address={USA},
  note={http://sabiod.org/nips4b},
  key={ISBN 979-10-90821-04-0},
}

In: proc. of int. symp. Neural Information Scaled for Bioacoustics, sabiod.org/nips4b, joint to NIPS, Nevada, dec. 2013, Ed. Glotin H. et al.

Contents

Video supports: Most of the talks are available in video at http://sabiod.univ-tln.fr/nips4b/

Authors list

Chapter 1 Introduction
1.1 Objectives
1.2 Overview of the Bird challenge
1.3 Whale song clustering challenge
1.4 Neurosonar analysis
1.5 Acknowledgements

Chapter 2 Natural Neural Bioacoustic Learning
2.1 Physiological brain processes that underlie song learning
Tchernichovski O.

2.2 Neuroethology of hearing in crickets: embedded neural process to avoid bats
Pollack G.

Chapter 3 Representation for Bioacoustics
3.1 Dynamic time warping and Gaussian process multinomial probit regression for bat call identification
Stathopoulos V., Zamora-Gutierrez V., Jones K., Girolami M.

3.2 Whale songs classification using sparse coding
Glotin H., Razik J., Paris S., Adam O., Doh Y.

3.3 Classification of mysticete sounds using machine learning techniques
Halkias X., Paris S., Glotin H.

Chapter 4 Advanced Artificial Neural Net
4.1 Convnets & DNN for bioacoustics...
LeCun Y.

4.2 Mapping functional equations to the topology of networks yields a natural interpolation method for time series data
Kindermann L., Lewandowski A.

Chapter 5 Learning to Track by Passive Acoustics
5.1 Mono-channel spectral attenuation modeled by hierarchical neural net estimates hydrophone-whale distance
Doh Y., Glotin H., Razik J., Paris S.

5.2 Physeter localization: sparse coding & Fisher vectors
Paris S., Glotin H., Doh Y., Razik J.

5.3 Range-depth tracking of multiple sperm whales over large distances using a two element vertical array and rhythmic properties of click-trains
Mathias D., Thode A., Straley J., Andrews R., Le Bot O., Gervaise C., Mars J.

5.4 Optimization of Levenberg-Marquardt 3D biosonar tracking
Mishchenko A., Giraudet P., Glotin H.

5.5 Data driven approaches for identifying information bearing features in communication calls
Elie J., Theunissen F.

Chapter 6 Non Human Speech Processing
6.1 Gabor Scalogram Reveals Formants in High-Frequency Dolphin Clicks
Trone M., Balestriero R., Glotin H.

6.2 Supervised classification of baboon vocalizations
Janvier M., Horaud R., Girin L., Berthommier F., Boe L., Kemp C., Rey A., Legou T.

6.3 Software tools for analyzing mice vocalizations with applications to pre-clinical models of human disease
Shokoohi-Yekta M., Zakaria J., Rotschafer S., Mirebrahim H., Razak K., Keogh E.

Chapter 7 Bird Song Classification Challenge
7.1 Multi-instance multi-label acoustic classification of plurality of animals: birds, insects & amphibian
Dufour O., Glotin H., Bas Y., Artieres T., Giraudet P.

7.2 Bird song classification in field recordings: winning solution for NIPS4B 2013 competition
Lasseck M.

7.3 Feature design for multilabel bird song classification in noise (NIPS4B challenge)
Stowell D., Plumbley M.

7.4 Learning multi-labeled bioacoustic samples with an unsupervised feature learning approach
Mencia E., Nam J., Lee D.

7.5 Ensemble logistic regression and gradient boosting classifiers for multilabel bird song classification in noise (NIPS4B challenge)
Massaron L.

7.6 A novel approach based on ensemble learning to NIPS4B challenge
Chen W., Zhao G., Li X.

Chapter 8 Whale Song Clustering
8.1 Analyzing the temporal structure of sound production modes within humpback whale sound sequences
Mercado III E.

8.2 Unsupervised whale song decomposition with Bayesian non-parametric Gaussian mixture
Bartcus M., Chamroukhi F., Razik J., Glotin H.

8.3 Classifying humpback whale sound units by their vocal physiology, including chaotic features
Cazau D., Adam O.

8.4 Gabor scalogram for robust whale song representation
Balestriero R., Glotin H.

8.5 Automatic analysis of a whale song
Potamitis L., Ntalampiras S.

Chapter 9 Big Bioacoustic DATA
9.1 Cabled observatory acoustic data: challenges and opportunities
Hoeberechts M.

9.2 A challenge for computational bioacoustics
Kindermann L.

Annex: Schedule

Authors list (not including all participants)

5.5 Data driven approaches for identifying information bearing features in communication calls.

Julie E. Elie and Frédéric E. Theunissen. UC Berkeley. Dept of Psychology and Helen Wills Neuroscience Institute.

Bioacousticians have traditionally investigated the acoustical nature of information bearing features in communication calls by describing sounds using a small number of acoustical parameters that appear particularly salient (e.g. the mean frequency, duration, spectral balance). These measures are then used as parameters for linear discriminant analyses (LDA) or other supervised learning approaches to investigate what acoustic parameters drive sound categorization. This classical approach is computationally efficient and yields results that are easily interpretable. However this approach can also be limited by the a priori choice of the putative information bearing features: as long as the sound representation is not complete, one will not be able to determine whether the correct information bearing features are identified and, thus, whether the actual amount of information present in the calls (measured for example as the quality of a discrimination test) is correctly estimated.

To address this issue, we have adopted a data driven approach. In the traditional approach, specific acoustical parameters are chosen for two reasons: for dimensionality reduction and for the implementation of a non-linear transformation that could be required for linear discriminant approaches to effectively discriminate among sound categories. These two steps, however, can be implemented without a priori assumptions or loss of information. In our approach, our non-linear transformation is an invertible spectrographic representation of the sound.
Then before using a classifier, the dimension of this high-dimensional representation is reduced using principal component analysis (PCA). For this approach to work, the spectrograms of the sounds must be aligned and cross-validation techniques that evaluate both the effect of the number of PCs in the PCA and the number of parameters in the classifier must be implemented to prevent over-fitting. In our approach alignment of spectrograms was based on the cross-correlation between amplitude envelopes. We then used proven cross-validation techniques to prevent over-fitting: bootstrap for LDA and the Random Forest algorithm for non-linear tree classifiers. Finally, we compared the classification performance of the two classifiers on this sparse spectrographic representation of the sound (PCA on spectrogram) with those obtained for two other feature spaces: the Mel frequency cepstral coefficients (MFCC) and the modulation power spectrum (MPS).

These approaches were tested for the analysis of calls in a unique database of 1275 zebra finch communication calls obtained in our laboratory. This database includes many exemplars of all the call types in the zebra finch repertoire for a large number of male and female birds. Our algorithms were used to determine the discriminability of vocal types. The PCA on spectrogram yielded the best feature space: these features are both easy to interpret and yield higher performance of classifiers.
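
The alignment step mentioned above (cross-correlation between amplitude envelopes) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each call is already available as a NumPy spectrogram array of identical shape (frequency bins by time frames), takes the amplitude envelope to be the sum over frequency, and uses hypothetical helper names (`amplitude_envelope`, `best_lag`, `shift_in_time`, `align_to_reference`). Mean subtraction and zero padding are illustrative choices only.

```python
import numpy as np


def amplitude_envelope(spec):
    """Collapse a (n_freq, n_time) spectrogram to one amplitude value per frame."""
    return spec.sum(axis=0)


def best_lag(ref_env, env):
    """Return the shift (in frames) to apply to `env` so it best matches `ref_env`."""
    ref = ref_env - ref_env.mean()
    cur = env - env.mean()
    xcorr = np.correlate(ref, cur, mode="full")
    # Output index i of the full cross-correlation corresponds to lag i - (len(cur) - 1).
    return int(np.argmax(xcorr)) - (len(cur) - 1)


def shift_in_time(spec, lag):
    """Shift a spectrogram by `lag` frames (positive = later), zero-padding the gap."""
    out = np.zeros_like(spec)
    n_time = spec.shape[1]
    if lag >= 0:
        out[:, lag:] = spec[:, :n_time - lag]
    else:
        out[:, :n_time + lag] = spec[:, -lag:]
    return out


def align_to_reference(specs, ref_index=0):
    """Align every spectrogram in `specs` to the one at `ref_index` (assumed choice)."""
    ref_env = amplitude_envelope(specs[ref_index])
    return [shift_in_time(s, best_lag(ref_env, amplitude_envelope(s))) for s in specs]
```

Once aligned, the spectrograms can be flattened into one row per call, for example with `np.stack([s.ravel() for s in aligned])`, before dimensionality reduction.
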
The best results were obtained using the Random Forest algorithm and a PCA spectrographic representation using 50 PCs; we show that the 11 distinct call types in the zebra finch repertoire, irrespective of the identity of the vocalizing bird, could be classified with 83.1% accuracy. This classification performance is significantly higher than what one could obtain from either of the two other sound representations tested with the Random Forest algorithm (MFCC, 53.1% accuracy; MPS, 63.5% accuracy). Besides, the Random Forest yielded better classification performance than LDA, irrespective of the feature space used. In conclusion, the data driven algorithm using the DFA on the spectrogram showed superior results and we propose that it could be used both for investigating behavioral and neural mechanisms of sound discrimination and for the vocalization based identification of species in ecological or environmental studies.
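
To make the role of cross-validation in choosing the number of PCs concrete, here is a minimal sketch under stated assumptions. It is not the authors' pipeline: it uses plain k-fold cross-validation and a nearest-centroid classifier as a simple stand-in for the LDA and Random Forest classifiers named in the abstract, and all function names (`fit_pca`, `project`, `nearest_centroid_predict`, `cv_accuracy`, `choose_n_pcs`) are hypothetical. The PCA is refitted on each training fold so that held-out calls never influence the components.

```python
import numpy as np


def fit_pca(X, n_pcs):
    """Return the training mean and the first `n_pcs` principal axes (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_pcs]


def project(X, mean, components):
    """Project rows of X onto the principal axes."""
    return (X - mean) @ components.T


def nearest_centroid_predict(train_z, train_y, test_z):
    """Stand-in classifier: assign each test point to the closest class centroid."""
    classes = np.unique(train_y)
    centroids = np.stack([train_z[train_y == c].mean(axis=0) for c in classes])
    dists = ((test_z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(dists, axis=1)]


def cv_accuracy(X, y, n_pcs, n_folds=5, seed=0):
    """k-fold cross-validated accuracy for a given number of PCs."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correct = 0
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Fit PCA on the training fold only to avoid leaking test information.
        mean, comps = fit_pca(X[train], n_pcs)
        pred = nearest_centroid_predict(
            project(X[train], mean, comps), y[train], project(X[test], mean, comps)
        )
        correct += int((pred == y[test]).sum())
    return correct / len(y)


def choose_n_pcs(X, y, candidates):
    """Return the candidate number of PCs with the best cross-validated accuracy."""
    scores = {n: cv_accuracy(X, y, n) for n in candidates}
    return max(scores, key=scores.get), scores
```

Here `X` is assumed to hold one flattened, aligned spectrogram per row and `y` the call-type labels; a call such as `choose_n_pcs(X, y, [10, 25, 50, 100])` would compare candidate dimensionalities on held-out data rather than on the training fit.
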

Motivation
•  Classical approaches in bio-acoustics and sound analyses ~ Using simple (ad hoc) features.
   –  Hyena Vocalizations
•  Perception and Neural Representation.

Individual Signature in the Hyena giggle sounds.

33% Correct

Mathevon et al, BMC Ecology 2010

Zebra Finch Complete Repertoire is Complex.

[Figure: spectrograms of zebra finch call types. Recoverable panel labels: Distance, Nest, Long Tonal, Begging, Tet, ...uck. Axis label: Frequency (kHz).]

Distance Nest Long Tonal call 0!"#%12%34-&.566(783&9%#%:5#"6"5*(%;?%@AAA Begging 0!"#%123%45-&.677(89:&9%#%,-;6#"7"6*(%?%@=A>AAA 0!"#%123%45-&.677(89:&9%#%,-;6#"7"6*(%?%@=A>AAA Tet 0!"#%123%45-&.677(89$&:%;%;6#"7"6*(%?@%A=B?BBB 0!"#%12%34-&.566(78$&3%#%,-95#"6"5*(%:;%?;@=@@@ 0!"#%123%45-&.677(89"&:%#%,-;6#"7"6*(%?@%A=3?333 0!"#%11%23-&.455(678&2%#%,-94#"5"4*(%:;AAA uck 0!"#%123%45-&.677(89:&;%2@222 Call E222 C??? 1::: 8 D@@@