Nonlinear Real-Time Emulation of a Tube Amplifier with a Long Short Term Memory Neural-Network

Thomas Schmitz, Jean-Jacques Embrechts
[email protected], [email protected]
INTELSIG Laboratory, Montefiore Institute, University of Liège, Belgium

Abstract
Numerous audio systems for musicians are expensive and bulky. It could therefore be advantageous to model them and to replace them with computer emulations. Their nonlinear behavior requires the use of complex models. We propose to take advantage of the progress made in the field of machine learning to build a new model for such nonlinear audio devices (such as the tube amplifier). This paper focuses in particular on the real-time constraints of the model. Modifying the structure of the Long Short Term Memory neural network has led to a model that is ten times faster while keeping very good accuracy: the root mean square error between the signal coming from the tube amplifier and the output of the neural network is around 2%.

1. Goal
Hardware audio effects such as those used by musicians (distortion effects, compressors, ...) could be replaced by software emulations. The advantages:
• Easier to transport.
• Cheaper to buy.
• More versatile.

The difficulties:
• Complexity of the nonlinear models.
• Processing in real-time.

2. Method
The input signal x[n] sent to the tube amplifier and the corresponding output signal y[n] are recorded and used to adapt the biases and weights of a neural network, such that the signal x[n] passing through the neural-network model matches the signal y[n].

3. LSTM method
Long Short Term Memory (LSTM) cells are used in our network. An input buffer of size N is used to compute the predicted output sample pred[n], which is compared with the real output sample y[n] to compute a cost function. The Back Propagation Through Time (BPTT) algorithm then updates the weights and biases of the network to minimize this cost.
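The exact cost function is not detailed here; as a sketch, assuming a mean-squared error over a batch of B samples (consistent with the RMSE figure quoted in the abstract):

$$E = \frac{1}{B} \sum_{n=1}^{B} \bigl(\mathrm{pred}[n] - y[n]\bigr)^2$$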

[Figure: the input buffer x[0] ... x[N-1] is processed by a chain of LSTM cells sharing a propagated cell_state; a fully connected (FC) layer produces pred[n], which is compared with y[n] to compute the cost.]
The buffer size N is chosen so that the model can capture the frequency- and amplitude-dependent nonlinearities of the amplifier.
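As an illustration, here is a minimal sketch of such a model in PyTorch (not the authors' code; the buffer size N, hidden size, batch size, and optimizer are illustrative assumptions). It maps an input buffer of N samples to a single predicted sample and trains it with a mean-squared cost; calling backward() performs BPTT through the N time steps:

```python
# Minimal sketch: an LSTM mapping an input buffer x[n-N+1..n]
# to one predicted sample pred[n], trained with a squared-error cost.
import torch
import torch.nn as nn

N = 150          # input buffer size (assumed value)
HIDDEN = 24      # LSTM hidden size (assumed value)

class LSTMEmulator(nn.Module):
    def __init__(self, n_hidden=HIDDEN):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=n_hidden, batch_first=True)
        self.fc = nn.Linear(n_hidden, 1)   # FC layer producing pred[n]

    def forward(self, x):                  # x: (batch, N, 1)
        out, _ = self.lstm(x)              # cell_state propagated over the N steps
        return self.fc(out[:, -1, :])      # last time step -> pred[n]

model = LSTMEmulator()
cost = nn.MSELoss()                        # squared-error cost
opt = torch.optim.Adam(model.parameters())

x = torch.randn(32, N, 1)                  # batch of input buffers
y = torch.randn(32, 1)                     # target samples y[n]
opt.zero_grad()
loss = cost(model(x), y)
loss.backward()                            # BPTT through the N steps
opt.step()
```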

4. Real-time constraint
The LSTM layer architecture does not take full advantage of the Graphics Processing Unit (GPU) since the N cell_states cannot be computed in parallel, which slows down the emulation process. A Convolutional Reduction (CR) layer is thus introduced to reduce the number of LSTM cells used while attempting to preserve the global accuracy. C kernels of length L are convolved with the input signal buffer x (of size N) to give C signals d of size M < N. These signals are then sent to an LSTM layer of size M.

[Figure: the CR layer reduces the input buffer x (size N) to C signals of size M < N, which feed an LSTM layer of M cells producing pred[n].]
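A minimal sketch of this reduction, again in PyTorch and not the authors' code: a strided 1-D convolution is one way to obtain M < N outputs from N inputs (whether the original CR layer uses a stride is an assumption here, as are the values of C, L, the stride, and the hidden size):

```python
# Minimal sketch of the Convolutional Reduction (CR) idea: C kernels of
# length L are convolved with the input buffer (size N) so that the LSTM
# runs over only M < N time steps instead of N.
import torch
import torch.nn as nn

N, C, L, STRIDE, HIDDEN = 150, 4, 12, 6, 24   # assumed values

class CRLSTMEmulator(nn.Module):
    def __init__(self):
        super().__init__()
        self.cr = nn.Conv1d(in_channels=1, out_channels=C,
                            kernel_size=L, stride=STRIDE)
        self.lstm = nn.LSTM(input_size=C, hidden_size=HIDDEN, batch_first=True)
        self.fc = nn.Linear(HIDDEN, 1)

    def forward(self, x):                 # x: (batch, N, 1)
        d = self.cr(x.transpose(1, 2))    # (batch, C, M), M = (N - L)//STRIDE + 1
        d = d.transpose(1, 2)             # (batch, M, C): M steps instead of N
        out, _ = self.lstm(d)
        return self.fc(out[:, -1, :])     # pred[n]

model = CRLSTMEmulator()
pred = model(torch.randn(32, N, 1))       # -> (32, 1)
```

With these assumed values, M = (150 - 12)//6 + 1 = 24, so the sequential LSTM recursion runs over 24 steps instead of 150, while the convolutions over the C channels parallelize well on a GPU.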