Designing efficient architectures for modeling temporal ... - Jordi Pons

Recommend Documents

Large data for training large models. ... 76.66/85.24 % 4x O-net + 4x P-net 250/80 ... Limited amount of data. Architect

DESIGNING EFFICIENT ARCHITECTURES FOR MODELING ... - MTG

validate the foundations of our novel design strategy, that promotes an efficient use of the .... using 10-fold cross

Randomly weighted CNNs for audio classification: a ... - Jordi Pons

MLP: multi layer perceptron â¡ feed-forward neural network. RNN: recurrent neural network. LSTM: long-short term memory

Daniel A. Tirado Jordi Pons Elisenda Paluzie CSGR ... - Warwick WRAP

Jan 9, 2009 - Abstract: In the spirit of Hanson (1997), we analyze the existence of regional nominal wage gradients in Spain during the interwar period ...

Designing helical architectures

Dec 1, 2009 - building blocks that self-assemble into helical structures? In seeking a ... interaction sites within a rigid-body framework (18). Each dumbbell ...

Designing For Cisco Network Service Architectures (ARCH ...

Learning Guide Book, Download PDF Designing For Cisco Network Service .... modular enterprise architecture Â· Create hig

Dataset Designing of Software Architectures Styles for

National University of Science and Technology Rawalpindi,. Pakistan .... A data set is a group of information, often presented in a ..... Technologies (CISTI),2016.

Designing for Cisco Network Service Architectures (ARCH ...

free ebook online download Designing for Cisco Network Service ... Learning Guide: CCDP ARCH 300-320 (Foundation Learnin

Designing Lightweight Software Architectures for Smart ... - CiteSeerX

However, these applications rely on a software ... to interoperate with one or more pieces of customized software to achieve the goals of the overall system.

Software Architectures for Designing Virtual Reality Applications

tual reality applications in order to reduce the effort needed in the ..... customize the software architecture depends on the application and on the specific VR.

Temporal Resources for Global Quantum Computing Architectures

Dec 4, 2008 - sociated with two level quantum systems interacting with their nearest neighbours are .... Proposed by Benjamin [11], this is one of the simplest ...

Designing Software Architectures as a

In particular, we wanted to find answers ... 6.1. Section 6.2 gives information about the implementations of the ... element in a matrix represents a concept in the corresponding knowledge domain. ... For each knowledge domain, we organize.

Energy efficient DBA algorithms for TWDM-PONs - UGent Biblio

and widely adopted FTTH technology is a passive optical network (PON). The PON ... deposited by WDM with the inherent resource granularity of a TDM PON.

An energy efficient cyclic sleep control framework for ITU PONs

Jun 17, 2017 - An energy efficient cyclic sleep control framework for ITU PONs. Rizwan Aslam Butta,b, Sevia Mahdaliza Idrusa,â. , Kashif Naseer Qureshid, ...

Modeling Information Systems Architectures

s Designing IS architectures s Architectural platforms ... Architecture of application software,. e.g. using data flow .

Architectures for Energy-Efficient IPTV Networks - UniTN

network providers are beginning to build private video delivery networks to deliver ... storage, content servers and a local area network (LAN) [8]. ... Cisco 10008.

Efficient Fast Convolution Architectures for Convolutional Neural

paper, we first present fast convolution algorithm and its matrix form. Then based on the fast ... hardware reconfigurability, convolutional neural network (CNN).

Efficient Space Leaping for Ray casting Architectures

for CSG [1], volume rendering [13, 16], haptics in volume rendering [4] ... calculating a small occupancy map containing a single bit per sub- cube of the volume ..... The percentage of empty voxels is given in Table 1, but does not reveal any ...

Preference of Efficient Architectures for GF(p

intruders more easily than hardware, jeopardizing the whole application security [21]. .... For elliptic curve defined over GF(p), two different forms of formulae are.

An Efficient Multiway Mergesort for GPU Architectures

Mar 30, 2017 - â [email protected]. â¡[email protected]. Â§[email protected]. Â¶[email protected]. 1. arXiv:1702.07961v3 [cs.DS] 30 Mar 2017 ...

EFFICIENT HARDWARE ARCHITECTURES FOR MPEG-4 CORE ...

ABSTRACT. Efficient hardware acceleration architectures are proposed for the most demanding MPEG-4 core profile algorithms, namely; tex- ture motion ...

Series Expansion based Efficient Architectures for

Apr 25, 2014 - M. Balakrishnan Â· K. Paul. Department of Computer Science Engineering, Indian Institute of Technology, Delhi, India e-mail: [email protected].

Efficient Algorithms and Architectures for High

Dec 6, 2004 - of ASIC design, including potential for parallelism, very efficient ..... This problem is a standard graph theory problem in physical design automation known as âmin- ..... However, in practice, buffers and synchronization will.

Efficient Integer DCT Architectures for HEVC

of Section II-B. The SAU provides the result of multiplication of input sample with DCT coefficient by STAGE-2 of the algorithm. Finally, the OAU generates the ...

Designing efficient architectures for modeling temporal ... - Jordi Pons

Download PDF

0 downloads 136 Views 530KB Size Report

Comment

Designing efficient architectures for ... Large data for training large models. ... Efficiency. â» Over-fitting. Design

Designing efficient architectures for modeling temporal features with CNNs https://github.com/jordipons/ICASSP2017 Jordi Pons and Xavier Serra Music Technology Group, Universitat Pompeu Fabra, Barcelona [email protected] - www.jordipons.me - @jordiponsdotme

Proposal: design strategy

Introductory discussion Use case: music classification

Architecture: 1-by-n filters

Motivation MP(4,N’)

I Approach: single-layer CNNs.

I Domain knowledge available.

I Input: log-mel spectrograms.

I Intuition → interpretability.

I Dataset: Ballroom. Interesting because: I

Small dataset size.

I Semantically relevant contexts are expressed in different scales.

I

Tempo and rhythm features.

I Minimize CNN number of parameters:

Basic observations

I

Efficiency.

I

Over-fitting.

I

Promotes an efficient use of the representational capacity of the first layer by using different musically motivated filter shapes that model several contexts.

I Influence from the computer vision field: I

seeing spectrograms.

I

3x3 filters – limit the representational power of the first layer!

I

Seeing a semantically irrelevant context.

I

Need to combine. Hebbian principle:

M

I O(nsets)-net (5x) → where n ∈ [6, 11, 16, 21, 26, 31, 36, 41].

Design strategy

Large data for training large models.

...

P-net

N

I Filters dimensions interpretable. I Unreasonable assumption:

O-net

I P(atterns)-net (1x) → where 46 ≤ n ≤ 216 (n ∈ 46 + 5 · f ). I Max-pool layer interpretation: → sub-bands analysis.

STFT-window: 2048 (50% hop) at 44.1 kHz: Underlying hypothesis CNNs can benefit from a design oriented towards learning musical features rather than seeing spectrograms.

nO ≡ ∆Fr |bpm=60

44100 × 60(sec) = = 43 60(bpm) × 2048 × 0.5

nP ≡ 1 + 5 · ∆Fr |bpm=60 = 216 Tempo ∈ [60, 224]

M=40

N=250

Results Model: O-net P-net 2x O-net O-net + P-net 2x P-net 2x O-net + 2x P-net Fair comparison:

hop

# params

accuracy

250/80 250/80 250/80 250/80 250/80 250/80

4,188 7,428 8,368 11,608 14,848 23,208

76.66/85.24 % 83.95/89.26 % 81.53/86.54 % 87.25/89.68 % 85.67/89.11 % 87.25/91.27 %

Model: 4x O-net + 4x P-net 8x O-net + 8x P-net Time-freq [2] Time [2] Black-box [2] Marchand et al. [1]

hop

# params

accuracy

250/80 250/80 80 80 80 -

46,408 92,808 196,816 7,336 3,275,312 -

88.82/91.55 % 88.68/92.27 % 87.68 % 81.79 % 87.25 % 96 %

→ When setting hop = N = 250, no input spectrograms overlap is used as in [2]. → When setting hop = 80 as many training examples as in [2], although overlapping data is used.

Conclusions I O-net + P-net: best (presented) models. → having the most diverse combination of filter shapes. → validates design strategy! → efficient adaptation of CNNs for music!

Filters example

Baselines I Time-frequency and Time:

I ⇓ data – ⇑ useful the design strategy. I Adding different filter shapes: → increasing the representational capacity of the first layer at a very low cost. → cheaper than 2x CNN’s capacity.

I Black-box:

12×8 filters and MP(4,1).

I P-net most succesful ones! → as was intuitively designed. → step towards interpretability of CNNs! Experimental constraints → Only 1-by-n filters are used → Only a single layer is used. → Limited amount of data.

References

I Marchand et al.: based on a scale and shift invariant time/frequency representation that uses auditory statistics, not deep learning.

[1] Pons et al. Experimenting with musically motivated CNNs. CBMI, 2016. [2] Marchand et al. Scale and shift invariant time/frequency representation using auditory statistics: application to rhythm description. MLSP, 2016.