Scalable Hardware Architecture for Memristor Based Artificial Neural Network Systems

A thesis submitted to the Graduate School of the University of Cincinnati in partial fulfillment of the requirements for the degree of Master of Science in the Dept. of Electrical Engineering and Computing Systems of the College of Engineering and Applied Sciences, May 2016

by Ananthakrishnan Ponnileth Rajendran
B.Tech, Amrita Vishwa Vidyapeetham University, Kerala, India, May 2013

Thesis Advisor and Committee Chair: Dr. Ranga Vemuri

Abstract

Since the physical realization of the Memristor by HP Labs in 2008, research on Memristors and Memristive devices has gained momentum, with focus primarily on modelling and fabricating Memristors and on developing applications for Memristive devices. The Memristor's potential can be exploited in applications such as neuromorphic engineering, memory technology, and analog and digital logic circuit implementations. Research on Memristor based neural networks has thus far focused on developing algorithms and methodologies for implementation. The Memristor Bridge Synapse, a Wheatstone bridge-like circuit composed of four Memristors, is a very effective way to implement weights in hardware neural networks. Research on Memristor Bridge Synapse implementations coupled with the Random Weight Change algorithm proved effective in learning complex functions, with potential for implementation on hardware with simple and efficient circuitry. However, the simulations and experiments conducted were purely in software and served only as proof of concept. Realizing neural networks using the Memristor Bridge Synapse that are capable of on-chip training requires an effective hardware architecture with numerous components and complex timing. This thesis presents a scalable hardware architecture for implementing artificial neural networks using the Memristor Bridge Synapse that can be trained on-chip using the Random Weight Change algorithm. Individual components required for implementing training logic, timing and evaluation are described and simulated using SPICE. A complete training simulation for a small neural network based on the proposed architecture was performed using HSPICE. A prototypical placement and routing tool for the architecture is also presented.


To my parents and my sister. Thank you for being my inspiration.

In memory of my friends Govind and Srinivas. You’ll forever be in my heart.


Acknowledgements

I would like to start by thanking the most important people in my life, my family. My parents Rajendran and Rajam have made a lot of sacrifices to help my sister Malavika and me realize our dreams. Thank you very much for believing in me and motivating me towards realizing my goals. I will forever be indebted to you. I consider myself very lucky to have received the opportunity to work under Dr. Ranga Vemuri. The knowledge you imparted will forever stay with me. Thank you very much for letting me be a part of DDEL and guiding me through my Master's journey. Thank you Dr. Wen-Ben Jone and Dr. Carla Purdy for being part of my defense committee. Thanks to Rob Montjoy for providing continuous support with the DDEL machines. Special thanks to my friend Prabanjan for our innumerable discussions and the ideas you gave me to put my work together. I would like to thank my friends Diwakar, Ashwini and Meera for providing a helping hand on numerous occasions. Thank you Renuka for reviewing my thesis. I would like to thank all my teachers from primary school through college for moulding me into the person I am today. Special thanks to Dr. Rajesh Kannan Megalingam for inducing interest in the field of VLSI in me and motivating me to pursue a Master's degree. Last but not least, thanks to all my friends and relatives for being a part of my journey of life. I will forever be grateful for your help and support.


Contents

1 Introduction
  1.1 The Memristor
  1.2 Artificial Neural Networks
  1.3 Artificial Neural Networks on Hardware
    1.3.1 Analog Neural Network Implementations
    1.3.2 Memristor Based Neural Networks
  1.4 Random Weight Change Algorithm
  1.5 Memristor Bridge Synapse
  1.6 Thesis Statement
  1.7 Thesis Overview

2 The Memristor Neural Network Architecture
  2.1 Architecture Overview
  2.2 Architecture Components
    2.2.1 Neuron Block
    2.2.2 Microcontroller
    2.2.3 Shift Register
    2.2.4 Connection Buses
  2.3 Memristor Bridge Synapse Bit-Slice
  2.4 Architecture in a nutshell
  2.5 Summary

3 Placement and Routing Tool for Memristor Neural Network Architecture
  3.1 Tool Overview
  3.2 Tool Flow
  3.3 Output and Performance Analysis
    3.3.1 Area Analysis
    3.3.2 Runtime Performance
  3.4 Scalability
  3.5 Summary

4 Experimental Results and Analysis
  4.1 Memristor Simulation
  4.2 Memristor Bridge Synapse Simulation
  4.3 Memristor Bridge Synapse Bit-Slice Simulation
  4.4 Simple Neural Network Simulation
  4.5 OR-Gate Training in SPICE
    4.5.1 Experimental Setup
    4.5.2 Observation and Analysis
  4.6 Power and Timing Estimation
    4.6.1 Power
    4.6.2 Timing
  4.7 Training Performance
  4.8 Summary

5 Conclusion and Future Work
  5.1 Conclusion
  5.2 Future Work
    5.2.1 Implementing Stronger Activation Function
    5.2.2 Linear Feedback Shift Register for Random Bits
    5.2.3 Implementing other Hardware Friendly Algorithms
    5.2.4 Bit-slice in Layout
    5.2.5 Testing with more Memristor Models
    5.2.6 Reconfigurable Neural Network

Bibliography

List of Figures

1.1 Conceptual symmetries of the four circuit variables with the three classical circuit elements and the memristor [1].
1.2 Cross section of HP's Crossbar Array showing the memristor switch [2].
1.3 Representation of a simple three-layered artificial neural network [3].
1.4 Differential floating gate synapse schematic diagram of Electrically Trainable Analog Neural Network (ETANN) [4].
1.5 Analog current synapse, synapse current input, weight-control and neuron output circuit schematic of model proposed in [5].
1.6 Schematic of a weight cell of CMOS integrated feed-forward neural network [6].
1.7 Excitatory neuron with the input sensing circuit of Memristor Crossbar Architecture for Synchronous Neural Networks [7].
1.8 Weighting and Range Select circuit for RANLB and MTNLB [8].
1.9 (a) Activation function circuit for RANLB. (b) Activation function circuit for MTNLB [8].
1.10 Circuit that accomplishes weighting using the Memristor bridge synaptic circuit and voltage-to-current conversion with differential amplifier in [9].
1.11 (a) Typical multi-layered neural network inputs in voltage form. (b) Schematic of learning architecture for the equivalent hardware for the neural network in (a) [9].
1.12 Flowchart for Random Weight Change Algorithm.
1.13 Illustration of energy surface tracing by back-propagation and random weight change algorithm [9].
1.14 Memristor Bridge Synapse Circuit [10].
2.1 Sample input for face pose identification problem [11].
2.2 Three layered neural network for face pose identification.
2.3 Memristor based neural network architecture for face pose identification.
2.4 Simple three-layered neural network.
2.5 Memristor Bridge Synapse design.
2.6 Summing logic for neuron N3 from Figure 2.4.
2.7 Summing circuit using voltage average and operational amplifier circuits.
2.8 Difference circuit using differential amplifier.
2.9 Neuron N3 inputs and output for the neural network in Figure 2.4.
2.10 Memristor Bridge Synapse Bit-Slice.
3.1 Placement of 10 blocks of output layer on layout represented with p-diffusion.
3.2 After routing of input bus for placed blocks in Figure 3.1.
3.3 Completed placement and routing for neural network with 30 inputs, 10 hidden layer neurons and 10 output layer neurons.
3.4 Flowchart showing the tool flow for placement and routing.
3.5 Layout for face pose identification neural network with 960 inputs, 10 hidden layer neurons and 4 output layer neurons.
3.6 Output layer layout for face pose identification neural network.
3.7 Output layer layout for neural network with 80 inputs, 12 hidden layer neurons and 15 output layer neurons.
3.8 Neural network with 80 inputs and 15 output layer neurons having two hidden layers with 30 neurons in the first hidden layer and 25 neurons in the second hidden layer.
4.1 Circuit for Memristor simulation with Memristor M1 (Ron = 116Ω, Roff = 16kΩ) in series with resistor R1 (100Ω) and voltage source Vin.
4.2 Memristor simulation with DC voltage +1V and -1V.
4.3 Resistance change in the memristor for millisecond input pulse-width.
4.4 Resistance change in the memristor for microsecond input pulse-width.
4.5 Memristor Bridge Synapse circuit used for simulation.
4.6 Memristor Bridge Synapse simulation waveform.
4.7 Evaluation pulse applied to Memristor Bridge Synapse.
4.8 Memristor Bridge Synapse Bit-Slice simulation waveform.
4.9 Neural network training input application and output evaluation.
4.10 Neural network weight update pulse application.
4.11 Neural network output at evaluation during different iterations.
4.12 Flowchart showing tool flow for neural network training simulator in SPICE.
4.13 Neural network output for learning OR-gate function at the start of simulation.
4.14 Neural network output for learning OR-gate function for 54th iteration of training.
4.15 Neural network output for learning OR-gate function at the end of simulation.
4.16 Mean squared error vs iterations for training OR-gate function.
4.17 Mean squared error vs iterations for simulation 2.
4.18 Mean squared error vs iterations for simulation 3.
4.19 Mean squared error vs iterations for simulation 4.
4.20 Mean squared error vs iterations for simulation 5.

List of Tables

2.1 Training input selection logic.
3.1 Comparison of total layout area for neural networks for different technology nodes.
3.2 Fraction of unused area in layout for different neural networks.
4.1 Instantaneous current and resistance measurements for forward biased memristor.
4.2 Instantaneous current and resistance measurements for reverse biased memristor.
4.3 Weight change for different training signal pulse-widths for memristor bridge synapse.
4.4 Comparison of training performance for multiple simulations for training OR-gate function in HSPICE.

Chapter 1

Introduction

In 1971, Leon Chua presented an argument that a fourth two-terminal device should exist along with the three classical circuit elements, namely the resistor, capacitor and inductor [12]. He named this fourth circuit element the Memristor. Chua pointed out that the three basic circuit elements were defined based on a relationship between two of the four fundamental circuit variables: current, voltage, charge and flux-linkage. There are six possible relationships between these four circuit variables, of which two are direct relationships.

q = ∫ i(t) dt    (1.1)

is the relationship between charge (q) and current (i), and

φ = ∫ v(t) dt    (1.2)

is the relationship between flux-linkage (φ) and voltage (v). The other three relations are based on the axiomatic definitions of the three classical circuit elements. The resistor is defined by the relationship between current and voltage, the inductor by current and flux-linkage, and the capacitor by the relationship between charge and voltage. Chua postulated, from both a logical and an axiomatic point of view, that a fourth basic two-terminal device should exist, characterized by charge and flux-linkage. There was no physical realization of such a two-terminal device for over three decades


since Chua's proposal, until in 2008 Dmitri B. Strukov et al. from HP Labs published an article claiming that they had observed memristance arising naturally in nanoscale systems when solid-state electronic and ionic transport are coupled under an external bias voltage [13]. Since this discovery, research on Memristors and Memristive devices has gained momentum, with focus primarily on modelling and fabricating memristors and on developing applications for memristive devices. The memristor's potential can be exploited in applications such as neuromorphic engineering, memory technology, and analog and digital logic circuit implementations. The work presented in this thesis focuses on the application of memristors in the area of artificial neural networks.

Figure 1.1: Conceptual symmetries of the four circuit variables with the three classical circuit elements and the memristor [1].

1.1 The Memristor

The Memristor is a two-terminal device whose electrical resistance is not constant, but varies depending on the amount of charge that has flowed through it. This variable resistance of the Memristor is termed its Memristance. The memristor is non-volatile in nature, meaning that the device remembers its most recent resistance value even after it is


disconnected from an electric power supply. This property makes the memristor very useful for applications such as designing efficient memories and hardware realizations of artificial neural networks.

Figure 1.2: Cross section of HP's Crossbar Array showing the memristor switch [2].

There have been several implementations of the memristor device, such as the Polymeric Memristor, Layered Memristor, Ferroelectric Memristor and Spin Memristive systems. In this text, we discuss the Titanium Dioxide Memristor that HP developed in 2008. Researchers Dmitri B. Strukov et al. developed the memristor while working on a crossbar memory architecture at HP Labs. The crossbar is an array of perpendicular wires that are connected using switches at the points where they cross. Their idea was to open and close these switches by applying voltages at the ends of the wires. The design of these switches led to the creation of the memristor. HP's memristor is composed by sandwiching a thin layer of titanium dioxide (TiO2) between two platinum electrodes. The electrodes are about 5 nm thick and the TiO2 layer is about 30 nm thick. The TiO2 layer is divided into two separate regions, one composed of pure TiO2 and the other slightly depleted of oxygen atoms. These oxygen vacancies act as charge carriers and help conduct current through the device, leading to a lower resistance in the oxygen-depleted region. The application of an electric field results in

a drift of these oxygen vacancies, which results in a shift of the boundary between the low and high resistance regions. Figure 1.2 shows a cross-sectional view of HP's crossbar array with the memristor. If an electric field is applied across the two electrodes, the boundary between the normal region and the oxygen-depleted region moves either towards or away from the upper platinum electrode. If the boundary moves towards the upper electrode, the resistance increases, and vice versa. Thus, the resistance of the device depends on how much charge has passed through it in a particular direction. The memristance is observed only when both the pure and doped regions contribute to the resistance. After enough charge passes through the device, the ions become unable to move further and the device enters hysteresis. The device then acts as a simple resistor until the direction of the current is reversed. In 2010, R. Stanley Williams of HP Labs reported that they were able to fabricate memristors as small as 3 nm by 3 nm with a switching time of 1 ns (1 GHz speed). Such small dimensions and high speed promise many applications for the memristor. In the work presented here, the memristor's ability to provide a wide range of resistance values is utilized to create synaptic weights for artificial neural networks. For simplicity, we use the term 'resistance' instead of 'memristance' throughout this text.
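The qualitative behaviour described above is commonly captured by the linear ion-drift model of [13], in which the doped-region width w sets the memristance R(w) = Ron * w/D + Roff * (1 - w/D) and dw/dt = μv * (Ron/D) * i(t). The Python sketch below is a minimal illustration of that model only; the parameter values, the time step and the absence of a window function are assumptions made for this example, not the exact model used in the SPICE simulations of Chapter 4.

```python
# Minimal sketch of the linear ion-drift memristor model from [13].
# All parameter values are illustrative placeholders.

RON = 100.0     # resistance when fully doped (ohms)
ROFF = 16e3     # resistance when fully undoped (ohms)
D = 10e-9       # device thickness (m)
MU_V = 1e-14    # dopant mobility (m^2 s^-1 V^-1)

def simulate_memristor(v_samples, dt, w0=0.5 * D):
    """Integrate the doped-region width w under a voltage drive; return R(t)."""
    w = w0
    resistances = []
    for v in v_samples:
        r = RON * (w / D) + ROFF * (1.0 - w / D)  # instantaneous memristance
        i = v / r                                  # current through the device
        w += MU_V * (RON / D) * i * dt             # linear ion-drift state update
        w = min(max(w, 0.0), D)                    # clamp at the device boundaries
        resistances.append(r)
    return resistances

# Example: under a sustained +1 V bias the resistance drifts downward toward RON.
dt = 1e-4
trace = simulate_memristor([1.0] * 10_000, dt)
print(round(trace[0]), round(trace[-1]))
```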

1.2 Artificial Neural Networks

Artificial neural networks are groups of nodes connected by weighted edges. They are models inspired by biological neural networks and are used to estimate or approximate functions that usually depend on a large number of unknown inputs. The ability of artificial neural networks to adapt to a given set of circumstances makes them very attractive for applications such as pattern recognition, data mining, game-play and decision making, and medical diagnosis. Neural networks adapt to a given set of inputs by modifying the weights of the interconnects between their neurons based on a suitable algorithm. An activation function at the neuron defines its output for an

input or set of inputs to it. There are mainly three learning paradigms, viz. supervised learning, unsupervised learning and reinforcement learning. Every neural network has one input layer and one output layer. It may have one or more hidden layers. Figure 1.3 shows a simple three-layered neural network. The number of neurons in each layer depends on the function that the network is trying to approximate. The neural networks discussed in this thesis are feed-forward neural networks, i.e., data only flows in the forward direction and there is no feedback of data while the network is evaluated. The neural network in Figure 1.3 is a fully interconnected arrangement in the sense that every neuron in one layer is connected to every neuron in the succeeding layer. This is not a necessity when designing a neural network, since all connections may not be required to implement a specific function. However, it is very difficult to accurately predict the optimal number of hidden layer neurons and connections that a particular problem might require. The beauty lies in the fact that neural networks have the ability to learn whether or not a particular neuron or connection has a significant impact on the output.

Figure 1.3: Representation of a simple three-layered artificial neural network [3].

Supervised learning is one of the most commonly used learning methods for artificial

neural networks. In this kind of learning, the aim is to infer the mapping implied by the data; the cost function is related to the mismatch between the user's mapping and the data, and it implicitly contains prior knowledge about the problem domain [3]. The mean-squared error is often used as the cost, and learning tries to reduce the average error between the network's output and the desired output. The Backpropagation algorithm is a well-known and efficient algorithm used for training neural networks. Training is accomplished by adjusting the weights on the connections between neurons with the aim of reducing the mean-squared error at the output of the neural network. The Backpropagation algorithm calculates the gradient of a loss function with respect to all of the weights in the network. The algorithm tries to minimize the loss function by feeding the gradient to an optimization method which uses it to update the weights. In order for the Backpropagation algorithm to work, the activation function used by the neurons must be differentiable. The activation function is any mathematical function at the neuron which defines its output for a given set of inputs. The Backpropagation algorithm is very effective in training neural networks, but poses many challenges when implemented on a standalone hardware system. The algorithm works in two phases: the propagation phase and the weight update phase. In the propagation phase, the algorithm first forward propagates a training input through the network and generates the output activations. In the next step, the algorithm does a backward propagation of the output activations through the network using the target pattern to generate the difference between the input and output values of all the hidden and output neurons. In the weight update phase, the algorithm first multiplies the difference obtained with the input activation to find the gradient of the weight. It then uses this gradient to update each of the weights in the network. It is quite evident that the Backpropagation algorithm, though very effective, requires complex multiplication, summation and derivatives that are difficult to implement in VLSI circuits [14]. A simpler algorithm is desirable to design a standalone hardware neural network system. There are several hardware-friendly algorithms implemented to train artificial neural networks on hardware. The Random Weight Change algorithm is one

such popular algorithm. Though not as efficient as Backpropagation, it is hardware-friendly and much simpler to implement.
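To make the terms above concrete, the sketch below shows a forward pass and the mean-squared-error cost for a small fully connected feed-forward network like the one in Figure 1.3. It is a generic software illustration only; the 2-3-1 topology, the tanh activation and the ±1 encoding of the OR-gate training set are assumptions chosen to mirror the examples used later in this thesis.

```python
import math
import random

def forward(x, w_hidden, w_out):
    """Forward pass of a 2-3-1 feed-forward network with tanh activations."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return math.tanh(sum(w * h for w, h in zip(w_out, hidden)))

def mse(network_weights, samples):
    """Mean-squared error over a training set of (input, target) pairs."""
    w_hidden, w_out = network_weights
    return sum((forward(x, w_hidden, w_out) - t) ** 2 for x, t in samples) / len(samples)

# Example: random initial weights evaluated on a ±1-encoded OR-gate training set.
random.seed(0)
w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
w_out = [random.uniform(-1, 1) for _ in range(3)]
or_gate = [((-1, -1), -1), ((-1, 1), 1), ((1, -1), 1), ((1, 1), 1)]
print("initial MSE:", mse((w_hidden, w_out), or_gate))
```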

1.3 Artificial Neural Networks on Hardware

Implementation of artificial neural networks on hardware has been popular for over three decades. Hardware neural networks extend from Analog to Digital to FPGA and even to Optical Neural Networks. In this section, we briefly explore a few analog neural network implementations and neural network implementations using memristors.

1.3.1 Analog Neural Network Implementations

Implementation of artificial neural networks on hardware gained popularity in the 1980s, with Intel's Electrically Trainable Analog Neural Network (ETANN) 80170NX chip being one of the earliest fully developed analog chips [4]. The ETANN is a general purpose neurochip that stores its weights as electric charge on non-volatile floating gate transistors (floating-gate MOSFETs, or FGMOS) with the help of EEPROM cells, and uses Gilbert-multiplier synapses to provide four-quadrant multiplication. Training for the ETANN is done off chip using a host computer and the weights are written into the ETANN [4]. The chip contains 64 fully interconnected neurons and can be cascaded by bus interconnection to form a network of up to 1024 neurons with up to 81,920 weights [15]. Figure 1.4 shows the synapse circuit of the ETANN, which is an NMOS version of the Gilbert multiplier with a pair of EEPROM cells in which a differential voltage is stored as the weight. Fowler-Nordheim tunneling of electrons is used to add and remove electrons from the floating gates in the EEPROM to adjust the weights [4]. The ETANN was used in several systems, such as the Mod2 Neurocomputer, which employed 12 ETANN chips for real-time image processing [16], and the MBOX II, which makes use of 8 ETANN chips to create an analog audio synthesizer [15]. One of the major drawbacks of this chip was the limited resolution in storing the synaptic weights: the long-term resolution of the weights was not more than five bits.


Figure 1.4: Differential floating gate synapse schematic diagram of Electrically Trainable Analog Neural Network (ETANN) [4].

Another issue was the writing speed and cyclability of the EAROMs used to store the weights, which restricted the application of chip-in-the-loop training [17]. Milev and Hristov [5] present a simple analog-signal synapse with inherent quadratic non-linearity implemented using MOSFETs with no floating-gate transistors. They designed a neural matrix for fingerprint feature extraction with 2176 analog current-mode synapses arranged in eight layers of 16 neurons with 16 inputs each. A chip was fabricated in a standard 0.35µm TSMC process to demonstrate the feasibility of non-linear synapses in practical application. Apart from the 16 x 8 neural matrix of 128 analog 16-input neurons, 16-bit latched digital inputs multiplexed with 16 analog-current inputs, 16 analog-current signal outputs and a 9-bit current-output digital-to-analog converter (DAC) are also implemented on chip. Weight storage is done on an on-chip SRAM of more than 19K. The architecture allows for cascaded interconnection for system expansion. The internal system clock is specified at a maximum frequency of 200 MHz. However, the input-data processing speed is determined by the current propagation delay through the components in the network and varies significantly with the reference current driving the analog synapse circuits [5].


Figure 1.5: Analog current synapse, synapse current input, weight-control and neuron output circuit schematic of model proposed in [5].

Lui et al. [6] developed a mixed-signal CMOS feed-forward neural network chip with on-chip error reduction hardware. The design is compact and capable of high-speed parallel learning using the Random Weight Change (RWC) algorithm. Weight storage in the system is accomplished using capacitors. Capacitors implemented as weights are compact and easy to program, but are susceptible to leakage issues leading to error in the stored weights. In their system, Lui et al. designed large capacitors to ensure the leakage is negligible. The chip is designed to operate in conditions that change continuously, and the weight leakage problem is mitigated by constant weight updates. They found that the weight retention time of the capacitors was around 2 s for losing 1% of the weight value at room temperature. Figure 1.6 shows the schematic of a single weight cell with a shift register for random input, the weight storage and modification circuit and the multiplier circuit. Lui et al. were able to fabricate and test a chip with 100 weights in a 10x10 array with 10 inputs and 10 outputs. They tested the chip by connecting it to a PC using an analog-to-digital converter (ADC) and a digital-to-analog converter (DAC). In this work we make use of the same RWC algorithm used by Lui et al. in their system. The RWC algorithm


is described in detail in the next section.

Figure 1.6: Schematic of a weight cell of CMOS integrated feed-forward neural network [6].

The analog neural network implementations discussed in this text are only a small subset of the innumerable VLSI implementations of artificial neural networks. Misra and Saha [15] provide a comprehensive survey of the hardware implementations of artificial neural networks spanning over 20 years. Their discussion is not limited to analog neural network implementations, but extends to digital, hybrid, FPGA-based, RAM-based and optical neural networks.

1.3.2 Memristor Based Neural Networks

The potential to mimic brain logic is one of the most attractive features of the memristor. Various architectures and synapse designs have been proposed using memristors for realizing artificial neural networks. Here, we briefly discuss a couple of neural network implementations using memristors and the Memristor Bridge Synapse based neural network that we have used as the primary reference in our work. Starzyk and Basawaraj [7] propose an architecture and training scheme for neural networks implemented using crossbar connections of memristors with a view to preserving the high density of synaptic connections. They employ simple threshold-based neurons, a synapse consisting of only a single memristor and a common sensing network. The synapse is designed with a view to creating large scale systems with synapses arranged

in a grid structure capable of being trained on-chip. The system is composed of a single layer feed-forward neural network with n inputs and m outputs.

Figure 1.7: Excitatory neuron with the input sensing circuit of Memristor Crossbar Architecture for Synchronous Neural Networks [7].

The neuron of the Memristor Crossbar Architecture proposed in [7] operates in three different phases, viz. the sensing phase, the active phase and the resting phase. During the sensing phase, the neuron waits for input activity and does not fire. An increase in any of the input signals above the threshold switches the neuron into the active phase, where the neuron either fires or does not for a specific amount of time. Once the active phase timing expires, the neuron goes into the resting phase, where all the inputs and outputs go to 0V, and remains in this state till the next sampling time. The excitatory neuron with the input sensing circuit of the Memristor Crossbar Architecture is shown in Figure 1.7. The design was tested in HSPICE for organization of the neural network on noisy digit recognition. In [8] Solitiz et al. propose two Neuron Logic Block (NLB) designs to overcome the limitation of not being able to train linearly inseparable functions with existing perceptron based NLB designs using thin-film memristors that implement static threshold activation functions. Their designs overcome the limitation by allowing the effective activation function to be adapted during learning. Solitiz et al. contribute a perceptron based NLB design with an adaptive activation function and a perceptron based NLB with a static activation function and multiple activation thresholds, and demonstrate the designs for reconfigurable logic and optical character recognition for handwritten digits.


Figure 1.8: Weighting and Range Select circuit for RANLB and MTNLB [8].

Figure 1.8 shows the weighting and range selection circuit implemented using memristors for the Robust Adaptive Neural Logic Block (RANLB) and the Multithreshold Neural Logic Block (MTNLB). The RANLB implements an adaptive activation function using the circuit in Figure 1.9 (a), by providing an adjustable digital value for each input current range. A flip-flop stores the digital value for each input current range. The MTNLB is designed with a view of overcoming the high area overhead of the RANLB’s activation function which limits its implementation on large neural networks where area is a primary constraint. The MTNLB employs a static activation function in such a way that the ability to learn linearly inseparable functions is not compromised. Figure 1.9 (b) shows the activation function circuit for the MTNLB circuit.

Figure 1.9: (a) Activation function circuit for RANLB. (b) Activation function circuit for MTNLB [8].


Figure 1.10: Circuit that accomplishes weighting using the Memristor bridge synaptic circuit and voltage-to-current conversion with differential amplifier in [9].

The Memristor Bridge Synapse introduced by Kim et al. in [10] is a very popular synaptic design used to implement neural networks. [9], [18], [19] and [20] present implementations of the Memristor Bridge Synapse in artificial neural networks. In our work, we build on the work presented in [9] by Adhikari et al. on neural networks constructed using the Memristor Bridge Synapse and trained with the Random Weight Change algorithm. Each neuron in the Memristor Bridge Synapse based neural network in [9] is composed of multiple synapses and one activation unit. The inputs to the neural network are supplied as voltage values which are weighted and then converted to current by differential amplifiers. Kirchhoff's Current Law (KCL) is used to sum the currents and produce the output of a neuron. The differential amplifier along with the active load circuit form the activation unit of the neuron. Figure 1.10 shows the Memristor Bridge Synapse connected to the differential amplifier circuit. Figure 1.11 (a) shows a simple neural network with two neurons and Figure 1.11 (b) shows the equivalent hardware circuit for the neural network in Figure 1.11 (a) along with the architecture for the training regime. Adhikari et al. designed and simulated the differential amplifier and the active load circuit in HSPICE and developed a look-up table from the results. The Memristor model, error calculation, random number generation and training pulse application were simulated in MATLAB. They tested the architecture by learning the 3-bit parity problem, a robot


workspace and face pose identification, using neural networks with 3 input x 5 hidden x 1 output, 10 input x 20 hidden x 1 output and 960 input x 10 hidden x 4 output nodes respectively, in MATLAB [9]. Their aim was to show that Memristor Bridge Synapse based neural networks trained using the Random Weight Change algorithm could be used to realize simple, compact and reliable neural networks capable of being used for real-life applications.

Figure 1.11: (a) Typical multi-layered neural network inputs in voltage form. (b) Schematic of learning architecture for the equivalent hardware for the neural network in (a) [9].

In our work, we have used the Memristor Bridge Synapse based neural networks described in [9] as the base and try to build a complete hardware architecture which can be implemented on a chip. We have made several modifications to the architecture presented in [9], but have kept the Memristor Bridge Synapse as the primary component of the system along with the application of the RWC algorithm for training. The RWC

algorithm and the circuit implementation of the Memristor Bridge Synapse are discussed in detail in the following sections.

1.4 Random Weight Change Algorithm

The Random Weight Change (RWC) algorithm was first described by Hirotsu and Brooke in 1993. They proposed the algorithm as an alternative to Backpropagation to eliminate the need for complex calculations while training a neural network. The non-idealities of analog circuits are another reason why Backpropagation is not preferred for hardware implementations. They were able to successfully implement and test the algorithm on a chip with 18 neurons and 100 weights which learned the XOR-gate problem [14].

Figure 1.12: Flowchart for Random Weight Change Algorithm.

The algorithm randomly changes all of the weights by a small increment of -δ or +δ from their initial state. The training input is then supplied to the network and the output

error is calculated. If the new error has reduced compared to the previous iteration, the same weight change is applied again, until the output error either increases or falls within a desired limit. If the output error increases, then the weights are updated randomly again. The algorithm can be summarized using the following equations from [14]:

wij(n+1) = wij(n) + Δwij(n+1)    (1.3)

where

Δwij(n+1) = Δwij(n) if E(n+1) < E(n)
Δwij(n+1) = δ * Rand(n) if E(n+1) ≥ E(n)

E() is the root mean-squared error at the output, δ is a small constant and Rand(n) is +1 or -1 at random. The flowchart in Figure 1.12 illustrates the steps in the Random Weight Change algorithm.

The Random Weight Change algorithm is less efficient than Backpropagation. Figure 1.13 shows a comparison of the RWC algorithm with Backpropagation.

Figure 1.13: Illustration of energy surface tracing by back-propagation and random weight change algorithm [9].

For Backpropagation, the operating point goes down along the steepest slope of the energy curve of the network. For the RWC algorithm, the operating point goes up and down on the energy curve rather than descending straight along it. However, RWC's operating point statistically descends and finally reaches the correct answer [14]. The RWC algorithm is very effective for analog implementations of artificial neural networks as it eliminates the need for complex circuitry and is not greatly affected by circuit non-idealities. Moreover, the algorithm does not require any specific network structure and can be applied to all feed-forward neural networks. Fully connected feedback networks may have local minimum problems [14].

Figure 1.14: Memristor Bridge Synapse Circuit [10].
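A minimal software sketch of the update rule in Equation 1.3 is shown below (Python, with an arbitrary quadratic error function standing in for the network error; δ, the iteration count and the toy target are illustrative assumptions). In the hardware described later, the same rule is realized by applying ±δ training pulses to the memristor bridge synapses rather than by floating-point additions.

```python
import random

def rwc_train(weights, eval_error, delta=0.05, iterations=2000):
    """Random Weight Change (Eq. 1.3): reuse a perturbation while it keeps reducing E."""
    prev_error = eval_error(weights)
    change = [delta * random.choice((-1.0, 1.0)) for _ in weights]
    for _ in range(iterations):
        weights = [w + dw for w, dw in zip(weights, change)]  # w(n+1) = w(n) + Δw(n+1)
        error = eval_error(weights)
        if error >= prev_error:
            # error did not improve: pick a fresh random Δw for every weight
            change = [delta * random.choice((-1.0, 1.0)) for _ in weights]
        # otherwise keep the same Δw and apply it again on the next iteration
        prev_error = error
    return weights, prev_error

# Toy usage: eval_error could be the network mean-squared error; here it is a
# simple quadratic distance to an arbitrary target vector.
target = [0.3, -0.8, 0.5]
error_fn = lambda w: sum((wi - ti) ** 2 for wi, ti in zip(w, target)) / len(w)
trained, final_error = rwc_train([0.0, 0.0, 0.0], error_fn)
print(trained, final_error)
```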

1.5 Memristor Bridge Synapse

The Memristor Bridge Synapse is a Wheatstone-bridge-like circuit composed of four identical memristors. Figure 1.14 shows the arrangement of the memristors in the Bridge Synapse. The memristors are arranged such that the polarities of memristors M1 and M4 are the same and opposite to those of M2 and M3. When a positive voltage is supplied at Vin, M1 and M4 are forward biased, which leads to a decrease in their resistances. M2 and M3, on the other hand, become reverse biased and their resistances increase [10]. The outputs of the Bridge Synapse are tapped out at the nodes A and B. The Bridge Synapse basically acts as two voltage divider circuits. The voltages at the

nodes A and B are given by the simple voltage divider formula:

VA = (M2 / (M1 + M2)) * Vin    (1.4)

VB = (M4 / (M3 + M4)) * Vin    (1.5)

where M1, M2, M3 and M4 are the resistances of the memristors M1, M2, M3 and M4 respectively. The weight of the Memristor Bridge Synapse is the difference between the voltages VA and VB. Initially, when all the memristors are in the same state, the nodes VA and VB have the same value. The synaptic weight of the Bridge Synapse is described by the following expressions from [10]:

positive synaptic weight if M2/M1 > M4/M3,
negative synaptic weight if M2/M1 < M4/M3,
zero synaptic weight if M2/M1 = M4/M3.

The output of the Bridge Synapse can be modelled by the equation

Vout = ψ * Vin    (1.6)

where ψ is the synaptic weight defined by,

ψ = M2 / (M1 + M2) - M4 / (M3 + M4)    (1.7)
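Equations 1.4 through 1.7 are easy to check numerically. The short Python sketch below computes VA, VB, the synaptic weight ψ and the output of a single bridge synapse; the memristance values used in the example are arbitrary illustrative numbers.

```python
def bridge_synapse(m1, m2, m3, m4, v_in):
    """Memristor Bridge Synapse output per Eqs. 1.4-1.7 (resistances in ohms)."""
    v_a = (m2 / (m1 + m2)) * v_in          # Eq. 1.4
    v_b = (m4 / (m3 + m4)) * v_in          # Eq. 1.5
    psi = m2 / (m1 + m2) - m4 / (m3 + m4)  # Eq. 1.7, the synaptic weight
    v_out = psi * v_in                     # Eq. 1.6, equal to v_a - v_b
    return v_a, v_b, psi, v_out

# Illustrative values: M2/M1 > M4/M3 gives a positive weight,
# identical memristances give a zero weight.
print(bridge_synapse(4_000, 12_000, 12_000, 4_000, 1.0))   # positive weight
print(bridge_synapse(8_000, 8_000, 8_000, 8_000, 1.0))     # zero weight
```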

The Memristor Bridge Neuron is implemented by summing the output signals from different Bridge Synapses. Differential amplifiers are used to process the weighted inputs from primary inputs or other neurons. The implementation of the Bridge Neuron is described in Chapter 2.


1.6 Thesis Statement

Since the physical realization of the memristor by HP Labs in 2008, research on memristors and their applications has been constantly gathering pace. The potential of memristors in realizing simple and fast neuromorphic circuits is immense. As the lithographic processes for fabricating memristors evolve, architectures and tools for circuit realization also need to evolve. The majority of the research on memristor based neural networks has thus far focused on various algorithms and methodologies for implementation. The Bridge Synapse based artificial neural network presented in [9] shows a lot of promise for practical implementation because of the simplicity of its design. In the work presented in [9], the authors focused on illustrating the simplicity and effectiveness of using the Memristor Bridge Synapse in tandem with the Random Weight Change algorithm for neural network implementations. They proposed to use the Memristor Bridge Synapse as the weighting element of the neural network, to which inputs were applied as voltage pulses. At the neuron level, voltage-to-current conversion was achieved using differential amplifiers to take advantage of Kirchhoff's Current Law for summing the inputs of the neurons. The differential amplifier along with the active load circuit form the activation unit of the neuron. In [9], the authors tested their design by first simulating the differential amplifier and the active load circuit in HSPICE and creating a look-up table which was then used for training the neural network in MATLAB. The neural network circuit was created in MATLAB using a memristor model. The error calculation and random number generation were done by MATLAB code, and the weight updates were done by changing the resistance of the memristors in the bridge synapse based on the random numbers. They successfully trained neural networks for the 3-bit parity problem, learning a robot workspace and face pose identification. Although [9] proves that neural networks using the Memristor Bridge Synapse for weighting along with the RWC algorithm for training are a good approach for real-life applications, a path to an actual realization of a chip was not described. Moreover,

on-chip training requires additional circuitry and timing becomes critical. In our work, we focus on developing an architecture that can efficiently implement the RWC algorithm and Memristor Bridge Synapses to create a hardware neural network that can be trained completely on chip. We have made modifications to the design of the neuron and activation function in [9], but the training algorithm and weighting methodology remain the same. Our architecture is composed of the neural network circuit realized using memristors and differential amplifiers. The architecture also incorporates a microcontroller, which is responsible for measuring and calculating the output error and supplying the random training signals and timing signals to the neural network. We designed and implemented circuits to supply the random inputs and apply them to each individual Memristor Bridge Synapse during training. We also developed a placement and routing tool to realize the architecture on a physical layout. The tool takes the number of inputs, hidden layers and outputs as its input and generates a physical layout with interconnections between neuron blocks on different layers. Since layout libraries for memristors are not yet available, the placement and routing tool is only a prototype to illustrate how the architecture would appear on a layout and to gather an approximation of the area occupied by a specific network. The majority of the simulations in this work were performed using HSPICE. SPICE-level simulations are the best available approximations to actual circuit behavior in hardware. Simulations were performed for individual components of the architecture and for complete neural network circuits. We also developed a simulator, written in Perl, to train a small neural network in HSPICE. Perl mimicked functions of the microcontroller, such as supplying random inputs and clock signals, by generating PWL inputs to the HSPICE circuit. A neural network with 2 inputs, 3 hidden layer neurons and 1 output layer neuron successfully learned the OR-gate function in HSPICE. The aim of our work was to develop an architecture suitable for implementing memristor based neural networks on chip. With the core of the neural network implemented in HSPICE using real components and only minimal functionality simulated using software,

we were able to show that our architecture is well suited to be realized on a chip.

1.7 Thesis Overview

The remainder of this document is organized in the following manner: Chapter 2 discusses the architecture for implementing neural networks with the Memristor Bridge Synapse. The Chapter describes the various components of the architecture and their functions. An overview of the functioning of the neural network and the bit-slice design of the synapse is also presented in this Chapter. The placement and routing tool for the architecture layout is described in Chapter 3. This Chapter explains the algorithm and the implementation of the tool, and presents and discusses its output. The Chapter also discusses how the tool is designed to produce layouts for varying numbers of neurons and neural layers. Chapter 4 describes the experimental setup and observations and analyzes the results of the experiments conducted at different abstractions of the neural network design. All components of the neural network are simulated both individually and as a full circuit. The power calculations and estimations for neural network training and normal operation are also presented in this Chapter. The conclusions drawn from this thesis and future work are described in Chapter 5.


Chapter 2

The Memristor Neural Network Architecture

The primary focus of this thesis is to develop an efficient hardware architecture to implement the memristor based artificial neural networks described in [9]. This Chapter describes our architecture, the various components of the neural network system and their functions. The architecture is best explained with the help of examples. In this thesis, we have used two different neural networks for simulations at different levels of abstraction. A small neural network that aimed to learn the OR-gate problem was used in simulations to verify the functionality of the Memristor Bridge Synapse, the other components and the entire architecture at the SPICE level. A much larger neural network for face pose identification, explained in [9], was simulated using Python to verify the functioning of large Memristor Bridge Synapse based neural networks for more practical applications.

Figure 2.1: Sample input for face pose identification problem [11].



2.1 Architecture Overview

Image recognition is a popular application of artificial neural networks, and memristor bridge synapse based artificial neural networks are efficient in learning functions of this kind. We illustrate the working of the neural network architecture using the face pose identification problem discussed in [9]. The sample inputs to the neural network for the face pose identification problem are shown in Figure 2.1. The images for face recognition are available for download from CMU [11]. In this problem, the network tries to learn the direction in which the face of the subject in the image is oriented. There are four face poses that the network aims to learn: left, right, straight and up, as depicted in Figure 2.1 (a) through (d). The images are greyscale with 32x30 resolution. Figure 2.2 shows a representation of the neural network used for this problem.

Figure 2.2: Three layered neural network for face pose identification.

The network has a total of 960 (32*30) inputs, 10 hidden layer neurons and 4 output neurons. Every neuron in one layer is connected to every neuron in the succeeding layer. The network consists of a total of 9640 memristor bridge synapses. The circuit produces an output of [1 -1 -1 -1], [-1 1 -1 -1], [-1 -1 1 -1] and [-1 -1 -1 1] for the left, right, straight and up orientations of the subject's face. In Figure 2.2, the input layer neurons are only a representation of the fan-out of the external inputs to multiple memristor bridge synapses. No function is applied to the inputs at the input layer neurons.
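As a quick check on these numbers, the snippet below (an illustrative Python helper, not part of the thesis tool flow) counts the bridge synapses of a fully connected feed-forward network and decodes the four output patterns; the argmax-style decoding is an assumption about how the ±1 codes would be read out.

```python
def synapse_count(layer_sizes):
    """Number of memristor bridge synapses in a fully connected feed-forward network."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

print(synapse_count([960, 10, 4]))   # 960*10 + 10*4 = 9640 bridge synapses

POSES = ["left", "right", "straight", "up"]

def decode_pose(outputs):
    """Map an output vector such as [1, -1, -1, -1] to its face pose."""
    return POSES[max(range(len(outputs)), key=lambda i: outputs[i])]

print(decode_pose([1, -1, -1, -1]))   # -> left
```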

Figure 2.3: Memristor based neural network architecture for face pose identification.

For this neural network design, we can see that the hidden and output layers are much smaller than the input layer. In this particular example of face pose identification, the number of input neurons is almost 100 times the number of hidden neurons, and the number of output neurons is less than half the number of input neurons. A chip for such a neural network can have close to 1,000 pins, and the architecture in Figure 2.3 is designed with the constraint of connecting the pins to internal signals in mind. Since all inputs go to all neurons, each neuron block in the middle layer receives its inputs from a bus. A neuron block consists of as many Memristor Bridges as there are inputs to the block (960 Memristor Bridge Synapses in this example) and three operational amplifier

circuits, two for summing and one for difference. The middle layer neuron blocks are placed close to the periphery of the chip on three sides, and the output is drawn out from the fourth side. The input layer bus (consisting of 960 wires in the example) is placed around the middle layer neuron blocks. This way, the inputs from the pins can easily be supplied to the bus and the bus lines can be conveniently accessed by each neuron block. The middle layer bus has as many lines as there are middle layer neuron blocks (10 in this case). The output of each neuron in the middle layer is connected to this bus and supplied to the output layer neuron blocks. The outputs of the output layer neuron blocks are connected to the microcontroller, which reads the values generated by the network, calculates the error and performs training. The outputs can also be tapped out through other pins on the chip.

2.2 Architecture Components

With respect to the description of the architecture in Figure 2.3, the components of the neural network can be categorized as internal and external to the neuron block. The components that are external to the neuron blocks are the connection buses and the microcontroller. We first describe the components internal to the neuron block and then move on to the components external to it. We describe the internal components of the neuron block with the help of a simple neural network. Figure 2.4 shows an artificial neural network with two input layer neurons, two hidden layer neurons and one output layer neuron. The aim of this neural network is to learn the OR-gate function. The training inputs are applied through the nodes IN1 and IN2. There are a total of five neurons, N1 through N5, and six memristor bridges, BR1 through BR6, in this network. Each neuron in one layer is connected to every neuron in the succeeding layer. The neurons N1 and N2 are only representations and do not apply any function to the inputs. The applied inputs fan out from the N1 and N2 neurons to different bridges. For example, the input supplied at IN1 fans out to

bridges BR1 and BR2. Each memristor bridge produces two output components, the VA component and the VB component. These components are represented by the two lines that originate from each bridge synapse and go into the neuron, where the summing logic is implemented.

Figure 2.4: Simple three-layered neural network.

2.2.1 Neuron Block

2.2.1.1 Memristor Bridge Synapse

The Memristor Bridge Synapse is the primary component of the neural network and takes up the most area on the chip. Each memristor is about 50 nm x 50 nm wide. A single memristor bridge requires about 200 nm x 200 nm of area after including the routing between the memristors. The biggest network simulated in this work has almost 10,000 bridge synapses.

Figure 2.5: Memristor Bridge Synapse design.

Figure 2.5 shows the design of a Memristor Bridge Synapse. The input to the bridge is applied at one end (node IN) and the other end is tied to ground. As discussed in Chapter 1, the two memristors connected on either side of node A are arranged such that one of them is forward biased and the other reverse biased when a voltage is applied at node IN. The same logic applies to the memristors connected on either side of node B, the only difference being that their orientation with respect to node IN is opposite to that of the two memristors on either side of node A. The nature of this arrangement ensures that the voltage drop at either node A or node B will be greater than the other, and that when the voltage drop at one node increases, the drop at the other node decreases. It also ensures that the total resistance of the memristor bridge is constant and brings a symmetry to the weight supplied by the bridge. The weight supplied by the bridge synapse is the difference between the node voltages (VA − VB). The weight is changed by supplying either a positive or a negative voltage at IN. For the bridge in Figure 2.5, a positive voltage pulse at IN results in a decrease in the resistances of the memristors M1 and M4, and an increase in the resistances of M2 and M3. Consequently, the voltage drop at A will increase and the voltage drop at B will decrease, as explained using equations 1.4 and 1.5. On the contrary, if a negative voltage pulse is applied at IN, the voltage drop at A will decrease and that at B will increase. The weight supplied will be either positive or negative depending on whether VA or VB is greater. It is interesting to note that both the evaluation and training pulses are applied to the memristor bridge through the same node. The question arises of how the evaluation input would affect the resistance of the bridge, and in turn the weight of the bridge, if both are applied through the same node. From the experiments conducted, we observed that if the pulse width of the input is within 1 ms, it does not bring any notable change to the resistance of a memristor. Moreover, to ensure that the evaluation pulse does not alter the resistance of the bridge synapse, the evaluation pulse is supplied as a complement,

Moreover, to ensure that the evaluation pulse does not alter the resistance of the bridge synapse, it is supplied together with its complement: for example, if an input to the neural network is +1 V, a −1 V pulse is applied for the same duration as the +1 V input during evaluation to reverse any change to the resistance caused by the input pulse. To change the resistance of a memristor by 40 Ω, a pulse of width 250 µs was required. The experiments and observations are described in detail in Chapter 4. All connections between neurons in the network are established using the memristor bridge synapse. While training is in progress, each memristor bridge is applied a training pulse based on the random number that was generated for it. The circuitry for applying random pulses is discussed in a later section. The activation function at the neuron is implemented by operational amplifiers using summing logic. The neuron receives its inputs from the various bridges that are connected to it. Each bridge supplies two voltage components (VA and VB). The neuron first sums these two components individually and then evaluates the difference between the two sums.

Figure 2.6: Summing logic for neuron N3 from Figure 2.4.

Figure 2.6 shows the summing logic implementation of neuron N3 from the circuit in Figure 2.4. Each bridge synapse has two output components, the VA component (voltage from node A) and the VB component (voltage from node B). At the neuron, the individual VA and VB components are first summed separately. After the summing is complete, the difference of the two summed values is evaluated. This evaluated voltage value is the output of the neuron.

VA_SUM and VB_SUM in Figure 2.6 are evaluated by summing the VA components and VB components, respectively, from memristor bridges BR1 and BR3. After the summation, the difference is evaluated by subtracting VB_SUM from VA_SUM. This difference gives the output N3OUT of neuron N3. Both the summing and the difference logic are implemented with the help of operational amplifiers. Each neuron contains three operational amplifiers: one for each of the two summing circuits and one for the difference circuit. The implementation of the summing and difference circuits is explained in the following sections.
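The overall transfer behaviour of a neuron can therefore be summarized in a few lines. The sketch below is an idealized functional model of neuron N3 of Figure 2.6 (sum the VA components, sum the VB components, subtract); the node voltages used in the example are hypothetical.

def neuron_output(va_components, vb_components):
    """Ideal model of the neuron of Figure 2.6: sum the VA components,
    sum the VB components, then take the difference."""
    va_sum = sum(va_components)
    vb_sum = sum(vb_components)
    return va_sum - vb_sum

# Neuron N3 receives the outputs of bridges BR1 and BR3 (hypothetical voltages).
n3_out = neuron_output(va_components=[0.55, 0.48],
                       vb_components=[0.45, 0.52])
print(n3_out)   # 0.06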

2.2.1.2 Summing Amplifier

The summing operation is implemented using a voltage-averaging circuit along with an operational amplifier, as depicted in Figure 2.7. Note that the resistors used in the amplifier circuits are normal resistors and not memristors; the memristors are used only in the bridge synapses.

Figure 2.7: Summing circuit using voltage-average and operational amplifier circuits.

The voltage averaging is accomplished by connecting the input voltages to resistors of resistance R. The other ends of these resistors are connected to the same node. For example, in Figure 2.7, the VA voltages from BR1 and BR3 are connected to two resistors of resistance R. The voltage at node S1 will then be the average of the two input voltages. To get the sum from the averaged voltage, the voltage at node S1 needs to be multiplied by the total number of inputs to the summing circuit. This is accomplished by adjusting the gain of the operational amplifier. In the circuit in Figure 2.7, the operational amplifier is in the non-inverting configuration, whose gain is determined by the two resistors R1 and R2.

The gain for this particular amplifier is two, since there are two inputs to the summing circuit. VA_SUM is generated automatically once the circuit receives the input voltages. For a neuron that has n inputs, the operational amplifier is configured to have a gain of n. The gain of the amplifier, once fixed, does not have to be altered during the operation of the neural network. An important point to note here is that the output VA_SUM of the summing circuit is limited by the supply voltage of the operational amplifier. In the case of the circuit in Figure 2.7, the output voltage will be within −1 V to +1 V.
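A minimal numerical sketch of this averaging-plus-gain arrangement is shown below. It assumes an ideal non-inverting amplifier with the standard gain expression 1 + R2/R1, chooses R2 so that the gain equals the number of inputs n, and clips the output at the ±1 V rails mentioned above; the resistor values are illustrative.

def summing_amp(inputs, r1=10e3, r2=None, v_rail=1.0):
    """Ideal model of the summing stage of Figure 2.7.

    The passive network averages the inputs; a non-inverting op-amp with
    gain 1 + R2/R1 = n restores the sum.  The output is clipped at the
    supply rails (assumed to be +/-1 V here).
    """
    n = len(inputs)
    if r2 is None:
        r2 = (n - 1) * r1            # choose R2 so that 1 + R2/R1 = n
    average = sum(inputs) / n        # voltage at node S1
    gain = 1 + r2 / r1
    out = gain * average
    return max(-v_rail, min(v_rail, out))

print(summing_amp([0.55, 0.48]))     # ideal sum 1.03 is clipped to the 1.0 rail
print(summing_amp([0.30, 0.25]))     # 0.55, within the rails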

2.2.1.3 Difference Amplifier

Figure 2.8: Difference circuit using differential amplifier.

The implementation of the difference amplifier is more straightforward. The differential amplifier circuit used for this operation is shown in Figure 2.8. It is configured with a gain of 1; the input VA_SUM is supplied to the non-inverting input and VB_SUM is supplied to the inverting input. All the resistors in the circuit have the same value. The circuit essentially performs the operation N3OUT = VA_SUM − VB_SUM.


2.2.2 Microcontroller

The microcontroller is one of the key components of the architecture. It is responsible for implementing the training algorithm by supplying all the necessary signals to the memristor bridge synapses and the neurons. The microcontroller additionally generates the random numbers required to train the weights of the bridge synapses.

2.2.2.1 Signals Generated by the Microcontroller

The microcontroller contains the logic for generating the control signals that are required to train and operate the neural network. There are three control signals: update/evaluate, shift in, and clk.

1. update/evaluate: This signal decides whether the neural network is in weight-update or evaluation mode. When update is high (update = 1), the network is in weight-update mode. The microcontroller supplies this signal to activate the weight-update process by enabling the +1 V and −1 V power rails. When the signal is low, the network is either evaluating its output using the supplied external input or is in an idle state. When the network is in the idle state, all bridge synapses and neurons are undriven. The update signal is also used to isolate the memristor bridges from the operational amplifiers during the weight-update phase. This isolation is very important to ensure that the training pulse on one memristor bridge synapse is not propagated forward to the next layer. It is achieved by disabling the input power rails to the differential amplifier through power gating.

2. shift in: The random numbers for each memristor bridge synapse are supplied using this signal. Each bridge requires a random signal (either 0 or 1), and this random number is generated and supplied by the microcontroller. The random numbers are passed on to shift registers that are connected to the bridge synapses. Each bridge synapse has one D flip-flop associated with it to supply the random number for training.

The random numbers are supplied to the shift registers through the shift in line. There may be more than one shift in line, depending on the size of the neural network and the number of shift registers implemented. A behavioural sketch of this shift-in sequence is given after this list.

3. clk: The clk signal is the global clock supplied to the entire neural network and is used for supplying the random numbers to all the flip-flops in the network. This signal is activated only when the random inputs are supplied to the neural network for training.
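For illustration, the following behavioural sketch shows how the random bits could be shifted in, one per clk cycle, into the chain of D flip-flops. The register length, the bit ordering and the use of Python's random module are assumptions made purely for this example.

import random

def load_random_bits(num_bridges, seed=None):
    """Behavioural model of the shift-in phase: on each clk edge the
    microcontroller shifts one random bit into the chain of D flip-flops,
    with one flip-flop per bridge synapse."""
    rng = random.Random(seed)
    shift_register = []
    for _ in range(num_bridges):        # one clk cycle per bit
        bit = rng.randint(0, 1)         # random bit generated by the microcontroller
        shift_register.insert(0, bit)   # new bit enters; earlier bits move down the chain
    return shift_register

# Six bridges (BR1..BR6) in the network of Figure 2.4.
print(load_random_bits(6, seed=1))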

2.2.2.2 Functions of the Microcontroller

The microcontroller is the central component of the architecture. It is responsible for supplying the inputs and training signals, updating the weights, and evaluating the output of the network. It ensures that all the components in the network function in a synchronized manner.
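To make this sequencing concrete, the sketch below gives a software analogue of Random Weight Change training on an abstract weight vector. The keep-or-re-randomize rule and the error function used here are assumptions made for illustration only and do not describe the exact hardware procedure; in hardware, the corresponding phases (input application, shift-in of random bits, weight update, evaluation) are orchestrated by the microcontroller as described in this section.

import random

def rwc_train(evaluate_error, weights, delta=0.01, iterations=1000, seed=0):
    """Software analogue of Random Weight Change training on an abstract
    weight vector (not the hardware sequence itself).  Assumed rule: every
    weight is perturbed by +/-delta each iteration; the perturbation
    directions are kept while the error decreases and re-randomized when
    it does not."""
    rng = random.Random(seed)
    directions = [rng.choice((-1, 1)) for _ in weights]
    prev_error = evaluate_error(weights)
    for _ in range(iterations):
        weights = [w + delta * d for w, d in zip(weights, directions)]
        error = evaluate_error(weights)
        if error >= prev_error:          # error did not improve: draw new random directions
            directions = [rng.choice((-1, 1)) for _ in weights]
        prev_error = error
    return weights, prev_error

# Toy usage with a hypothetical error function: fit one weight so that
# w * 1.0 approximates the target 0.7.
err = lambda w: (w[0] * 1.0 - 0.7) ** 2
print(rwc_train(err, [0.0], delta=0.01, iterations=500))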

Figure 2.9: Neuron N3 inputs and output for the neural network in Figure 2.4.

1. Synchronizing Input Application: The microcontroller enables and disables the application of external inputs to the neural network. The external inputs are supplied to the network for only a very short period of time.
