Learning Machine Learning by Design: an experience sharing using Xilinx SoC
T. Hui, Fellow of Institution of Engineers Singapore (FIES)
OpenHW2017@107
Who we are?
• Architecture & Sustainable Design
• Engineering Product Development
• Engineering Systems & Design
• Information Systems Technology & Design
• Humanities, Arts, & Social Sciences
• Electrical Engineering
• Mechanical Engineering
• Materials
• Design Science
4/8/2017
DIC tth@2017
Design project background
• 2D design in Term-8 courses – Digital Integrated Circuits, Electrical Power System,
• A one-term product design project – 10 weeks duration,
• Highly independent product design project,
• A group product design project,
• A working prototype,
• Budget SG$600.00; spending as little as possible is part of the evaluation criteria.
Design goal formulation
Customer: faculties
Business: education
Customer: Year-1/2
Business: SUTD
Product: SoC
Product / Service?
AI
Smart campus
Technology?
Technology: ML
Convergence
Divergence
Convergence
Divergence
Project deadline
May 22nd 2017
August 11th 2017
What are our resources?
Where we are?
Artificial Intelligence (AI): Computer Vision, Pattern Recognition, Cognitive Robotics, Fuzzy Systems, ...
Machine Learning: Linear Regression, K-Means Clustering, Decision Trees, Reinforcement Learning, ...
Deep Learning (DNNs): Multi-Layer Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, ...
[https://www.cbinsights.com/research/artificial-intelligence-top-startups/]
Why learning Machine Learning?
[https://www.forbes.com/sites/moorinsights/2017/03/03/a-machine-learning-landscape-where-amd-intel-nvidia-qualcomm-and-xilinx-ai-engineslive/#709344e9742f]
What ML hardware is available?
[https://www.forbes.com/sites/moorinsights/2017/03/03/a-machine-learning-landscape-where-amd-intel-nvidia-qualcomm-and-xilinx-ai-engineslive/#709344e9742f]
Google Machine Learning Machine (TPU 2017)
Microsoft Machine Learning Machine
Learning Machine Learning by SoC
• Machine learning
  • Artificial Intelligence, Learning (supervised, unsupervised, reinforcement), …
• Artificial Neural Network (ANN)
  • Convolution, activation function, perceptron, backpropagation, ANN, Convolutional Neural Network, …
• ARM microprocessor
  • Architecture, instruction set, memory, multi-core, …
• System on Chip
  • Hardware / software co-design
• Training and Inference
  • GPU – training (an NVIDIA GPU can help here)
  • SoC – inference (using Zynq from Xilinx: smart, distributed, standalone)
• Integrated Circuits Design
  • ASIC for Machine Learning Microprocessor – ultimate goal and practical implementation for a dedicated MLμP
Example of dedicated MLμP
A 2.9TOPS/W Deep Convolutional Neural Network SoC in FD-SOI 28nm for Intelligent Embedded Systems, ISSCC2017. STMicroelectronics
Example of dedicated MLμP
A 288μW Programmable Deep-Learning Processor with 270KB On-Chip Weight Storage Using Non-Uniform Memory Hierarchy for Mobile Intelligence, ISSCC2017. CubeWorks
Design goal formulation
Cost: < 28 nm, $$$
MLμP
Time: 10 Weeks
Convergence
Divergence
Why SoC?
• System on Board
• System on Chip
Why SoC?
SoB (burger)
SoC (bun)
Design goal formulation
Video processing
Deep learning
ML
CNN
Image processing
ANN
Convergence
Divergence
Convergence
Divergence
Design with Zynq – Image and Video Processing, and Computer Vision
Huge parallelism for image processing.
Full High Definition (HD): 1920 x 1080 pixels = 2,073,600 pixels; 3 channels per pixel for color (R, G, B), 8 bits per channel = 24 bits per pixel. 2,073,600 pixels x 24 bits = 49,766,400 bits per single HD image.
• Image representation in textual, numeric description. Training set.
• Identification of lines, curves, shapes and regions. Hough transform, color identification, thresholding, morphology.
• Pre-processing: adjustments of color balance, contrast, edge. Sobel filter.
Targets: Zynq-PS; Zynq-PS: NEON; Zynq-PL
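The frame-size arithmetic on this slide can be checked with a few lines of Python. The 60 frames-per-second figure is an assumption added here only to illustrate the resulting data rate; it does not come from the slide.

```python
# Frame-size arithmetic for one Full HD RGB image (values from the slide).
WIDTH, HEIGHT = 1920, 1080
CHANNELS = 3            # R, G, B
BITS_PER_CHANNEL = 8

pixels = WIDTH * HEIGHT
bits_per_pixel = CHANNELS * BITS_PER_CHANNEL
bits_per_frame = pixels * bits_per_pixel
bytes_per_frame = bits_per_frame // 8

# Assumed frame rate (not from the slide), used only to show the data rate.
FPS = 60
bits_per_second = bits_per_frame * FPS

print("pixels per frame :", pixels)
print("bits per pixel   :", bits_per_pixel)
print("bits per frame   :", bits_per_frame)
print("bytes per frame  :", bytes_per_frame)
print("bits per second  :", bits_per_second)
```

Every pixel can be processed independently in the pre-processing stage, which is why this workload suits parallel hardware.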
What is the platform?
EagleGo
Convolutional Neural Network (CNN)
[http://www.nature.com/nature/journal/v521/n7553/fig_tab/nature14539_F1.html]
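Artificial Neural Network (ANN)
Perceptron
• Inputs: $x_1$, $x_2$, and a constant input $1$
• Weights: $w_1$, $w_2$; bias: $b$
• Sum, then classification: $y = \mathrm{sgn}\left(\sum_j (w_j \cdot x_j) + b\right)$
AND Perceptron Decision Boundary
• Boundary line in the ($x_1$, $x_2$) plane: $x_2 = -x_1 + 1.5$
• $y = 1$: (1,1)
• $y = 0$: (0,0), (0,1), (1,0)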
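A minimal sketch of the AND perceptron, assuming weights $w_1 = w_2 = 1$ and bias $b = -1.5$. These values are one choice that reproduces the boundary $x_2 = -x_1 + 1.5$; the slide does not state them. The activation is written as a 0/1 threshold to match the $y = 0$ / $y = 1$ labels.

```python
def step(v):
    """Threshold activation: 1 if v >= 0, else 0 (assumed 0/1 form of sgn)."""
    return 1 if v >= 0 else 0


def perceptron(x, w, b):
    """Weighted sum of the inputs plus bias, followed by the threshold."""
    return step(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Assumed parameters: x1 + x2 - 1.5 = 0 is the line x2 = -x1 + 1.5.
W_AND = (1.0, 1.0)
B_AND = -1.5

POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [perceptron(p, W_AND, B_AND) for p in POINTS]

for p, y in zip(POINTS, outputs):
    print(p, "->", y)
```

Only (1,1) lies above the boundary line, so it is the only input classified as 1.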
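ANN training by backpropagation
Forward Propagation: find the outputs
• Inputs: $x_1$, $x_2$
• Input-to-hidden weights: $w_1$, $w_2$, $w_3$, $w_4$
• Hidden nodes: $y_1$, $y_2$ with biases $b_1$, $b_2$ and outputs $x_{h1}$, $x_{h2}$
• Hidden-to-output weights: $w_{h1}$, $w_{h2}$
• Output node: $y_h$ with bias $b_h$ and calculated output $z_{hc}$
Error: $\xi = \frac{1}{2}(z_{ht} - z_{hc})^2$
Back Propagation: find $\frac{\partial \xi}{\partial w_{hj}}$, then update the weights / bias
Update of hidden layer's weight depends on
• learning rate,
• error,
• hidden layer gradient,
• hidden layer input.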
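ANN training by backpropagation
Delta Rule:
• Learning from mistakes
• "Delta": difference between targeted and calculated output
Error ∝ target $z_h$ − calculated $z_h$
Delta rule: $\xi = \frac{1}{2}(z_{htj} - z_{hcj})^2$
$\xi = \frac{1}{2}(z_{ht} - z_{hc})^2$
$\frac{\partial \xi}{\partial z_{hc}} = -(z_{ht} - z_{hc})$
Gradient update (gradient descent): $\Delta w_{hj} = w_{hj}^{i+1} - w_{hj}^{i} = -\alpha \frac{\partial \xi}{\partial w_{hj}}$
$\frac{\partial \xi}{\partial w_{hj}} = \frac{\partial \xi}{\partial z_{hc}} \cdot \frac{\partial z_{hc}}{\partial y_h} \cdot \frac{\partial y_h}{\partial w_{hj}}$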
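The chain rule above can be turned into a single weight update for the output node. The sketch below assumes a sigmoid activation, so that $\partial z_{hc} / \partial y_h = z_{hc}(1 - z_{hc})$, and uses made-up starting values. Neither the function names nor the numbers come from the slides.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def output_forward(x_h, w_h, b_h):
    """Calculated output z_hc of the output node for hidden outputs x_h."""
    y_h = sum(w * x for w, x in zip(w_h, x_h)) + b_h
    return sigmoid(y_h)


def error(z_ht, z_hc):
    """Squared error: xi = 1/2 * (z_ht - z_hc)^2."""
    return 0.5 * (z_ht - z_hc) ** 2


def delta_rule_step(x_h, w_h, b_h, z_ht, alpha):
    """One gradient-descent update of the output weights and bias."""
    z_hc = output_forward(x_h, w_h, b_h)
    d_xi_d_z = -(z_ht - z_hc)            # d(xi)/d(z_hc)
    d_z_d_y = z_hc * (1.0 - z_hc)        # sigmoid derivative (assumed activation)
    grad_w = [d_xi_d_z * d_z_d_y * x for x in x_h]   # d(y_h)/d(w_hj) = x_hj
    grad_b = d_xi_d_z * d_z_d_y                      # d(y_h)/d(b_h) = 1
    new_w = [w - alpha * g for w, g in zip(w_h, grad_w)]
    new_b = b_h - alpha * grad_b
    return new_w, new_b


# Hypothetical starting point, for illustration only.
x_h, w_h, b_h, z_ht = [0.4, 0.9], [0.1, -0.2], 0.0, 1.0
before = error(z_ht, output_forward(x_h, w_h, b_h))
w_new, b_new = delta_rule_step(x_h, w_h, b_h, z_ht, alpha=0.5)
after = error(z_ht, output_forward(x_h, w_new, b_new))
print("error before:", before, "error after:", after)
```

The update depends on the learning rate, the error term, the activation gradient, and the input to the weight, which are the same four factors the previous slide lists.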
ANN – Example
For example, based on the past exam data in the following table, predict the Final pass/fail for a student who has Study Hours = 25 hours and Mid-term Test = 70.
Study Hours   Mid-term Test   Final
35            67              1 (pass)
12            75              0 (fail)
16            89              1 (pass)
45            56              1 (pass)
10            90              0 (fail)
The ANN is trained until the error approaches zero; the trained ANN is then used to predict the outcome for the desired input (25, 70).
Desired input (not in database): x1 = 0.25, x2 = 0.70
Trained: w1 = 4.40, w2 = 22.65, w3 = 1.86, w4 = 3.74, b1 = -2.93, b2 = -6.44
Hidden layer values: y1 = -0.53, y2 = 1.84, xh1 = 0.37, xh2 = 0.86
Trained: wh1 = -4.31, wh2 = 19.92, bh = -5.46
Calculated output / predicted output: yh = 10.14, zch = 1.00, zth = 1.00, Err = 0.00
Inference: Using trained model to predict/estimate outcomes from new observations.
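The trained values in the table can be replayed as a forward pass. The wiring (y1 = w1·x1 + w3·x2 + b1, y2 = w2·x1 + w4·x2 + b2) and the sigmoid activation are assumptions; they are not stated on the slide, but they reproduce the tabulated intermediate values to within rounding.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


# Desired input (25 study hours, mid-term 70), scaled as in the table.
x1, x2 = 0.25, 0.70

# Trained parameters copied from the table.
w1, w2, w3, w4 = 4.40, 22.65, 1.86, 3.74
b1, b2 = -2.93, -6.44
wh1, wh2, bh = -4.31, 19.92, -5.46

# Hidden layer (assumed wiring, see lead-in).
y1 = w1 * x1 + w3 * x2 + b1
y2 = w2 * x1 + w4 * x2 + b2
xh1, xh2 = sigmoid(y1), sigmoid(y2)

# Output node.
yh = wh1 * xh1 + wh2 * xh2 + bh
z_calc = sigmoid(yh)

print("y1, y2   :", round(y1, 2), round(y2, 2))
print("xh1, xh2 :", round(xh1, 2), round(xh2, 2))
print("yh       :", round(yh, 2))
print("output   :", round(z_calc, 2), "-> pass" if z_calc >= 0.5 else "-> fail")
```

The small difference in yh compared with the table comes from the two-decimal rounding of the tabulated weights.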
Concept question:
Image
0 0 0 0 0 0 0 0
0 3 0 1 1 3 0 0
0 1 6 7 7 7 3 0
0 3 6 0 1 7 1 0
0 1 7 1 3 6 1 0
0 3 7 7 7 5 3 0
0 1 1 3 3 3 1 0
0 0 0 0 0 0 0 0
Sharpen Filter
 0 -1  0
-1  5 -1
 0 -1  0
Image (x) Sharpen Filter =
 14 -10  -3  -6   7  -6
 -7  16  21  19  15   7
  7  14 -15 -12  20  -6
 -8  20 -12   0  14  -5
  6  17  17  17   6   8
  1  -6   4   2   6  -1
Max Pooling
16 21 15
20  0 20
17 17  8
Average Pooling
3.25  7.75  5.75
8.25 -9.75  5.75
4.5  10     4.75
1. Perform convolution on the image using the Sharpen filter
2. Change one of the numbers in the filter to reduce the number 14 at the top-left corner
3. Perform Average Pooling
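The convolution and pooling steps of this concept question can be reproduced in plain Python. The sketch slides the filter without flipping it (cross-correlation), which gives the same result as true convolution for the symmetric Sharpen filter. The helper names are illustrative, and the 2x2 window with stride 2 is inferred from the 6x6 to 3x3 reduction.

```python
IMAGE = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 3, 0, 1, 1, 3, 0, 0],
    [0, 1, 6, 7, 7, 7, 3, 0],
    [0, 3, 6, 0, 1, 7, 1, 0],
    [0, 1, 7, 1, 3, 6, 1, 0],
    [0, 3, 7, 7, 7, 5, 3, 0],
    [0, 1, 1, 3, 3, 3, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]


def filter2d_valid(image, kernel):
    """Slide the kernel over the image with no padding (unflipped kernel)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


def pool2x2(fmap, reduce_fn):
    """Apply reduce_fn to each non-overlapping 2x2 block (stride 2)."""
    return [[reduce_fn([fmap[r][c], fmap[r][c + 1],
                        fmap[r + 1][c], fmap[r + 1][c + 1]])
             for c in range(0, len(fmap[0]) - 1, 2)]
            for r in range(0, len(fmap) - 1, 2)]


feature_map = filter2d_valid(IMAGE, SHARPEN)
max_pooled = pool2x2(feature_map, max)
avg_pooled = pool2x2(feature_map, lambda block: sum(block) / len(block))

for name, grid in [("feature map", feature_map),
                   ("max pooling", max_pooled),
                   ("average pooling", avg_pooled)]:
    print(name)
    for row in grid:
        print("  ", row)
```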
Concept question:
Image
0 0 0 0 0 0 0 0
0 3 0 1 1 3 0 0
0 1 6 7 7 7 3 0
0 3 6 0 1 7 1 0
0 1 7 1 3 6 1 0
0 3 7 7 7 5 3 0
0 1 1 3 3 3 1 0
0 0 0 0 0 0 0 0
Sharpen Filter (one number changed)
 0 -1  0
-1  5 -1
 0 -1 -1
Image (x) Sharpen Filter =
  8 -17 -10 -13   4  -6
-13  16  20  12  14   7
  0  13 -18 -18  19  -6
-15  13 -19  -5  11  -5
  5  14  14  14   5   8
  1  -6   4   2   6  -1
Max Pooling
16 20 14
13 -5 19
14 14  8
Average Pooling
-1.5   2.25  4.75
 2.75 -15    4.75
 3.5   8.5   4.5
1. Perform convolution on the image using the Sharpen filter
2. Change one of the numbers in the filter to reduce the number 14 at the top-left corner
3. Perform Average Pooling
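Why the top-left value drops from 14 to 8 can be seen from the single 3x3 window that produces it. In the sketch below the filter is applied unflipped, which matches the numbers on the slide; the helper name `window_sum` is illustrative.

```python
# Top-left 3x3 window of the image (rows 1-3, columns 1-3).
PATCH = [
    [0, 0, 0],
    [0, 3, 0],
    [0, 1, 6],
]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
MODIFIED = [[0, -1, 0], [-1, 5, -1], [0, -1, -1]]  # bottom-right changed 0 -> -1


def window_sum(patch, kernel):
    """Element-wise product of patch and kernel, summed (unflipped kernel)."""
    return sum(kernel[i][j] * patch[i][j] for i in range(3) for j in range(3))


original = window_sum(PATCH, SHARPEN)
modified = window_sum(PATCH, MODIFIED)
print("original filter:", original)
print("modified filter:", modified)
```

The changed coefficient lines up with the pixel value 6 in this window, so the output is reduced by exactly 6.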
Summary of background knowledge study
• It is good to know the theoretical background,
• However, the 10-week timeline leaves no room to design your own network (that would be an advanced project).
Design goal formulation
Caffe
TensorFlow
Training
DNN
...
? Inference
Convergence
Divergence
Convergence
Divergence
Basic setup
SDSoC – all C/C++ programming
Blocks: CMOS IN, rgb2gray, sharpen, sobel_filter, HDMI OUT
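To show what the rgb2gray and sobel_filter stages compute, here is a small reference sketch in plain Python. It is not the SDSoC C/C++ code from the project: the BT.601 luma weights, the |Gx| + |Gy| magnitude, and the clamp to 255 are assumptions chosen for illustration.

```python
def rgb2gray(rgb_image):
    """Convert rows of (R, G, B) tuples to integer luma (assumed BT.601 weights)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb_image]


SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_filter(gray):
    """Edge magnitude |Gx| + |Gy| for each interior pixel, clamped to 255."""
    out = []
    for r in range(len(gray) - 2):
        row = []
        for c in range(len(gray[0]) - 2):
            gx = sum(SOBEL_X[i][j] * gray[r + i][c + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * gray[r + i][c + j]
                     for i in range(3) for j in range(3))
            row.append(min(255, abs(gx) + abs(gy)))
        out.append(row)
    return out


# 4x4 test frame: left half black, right half white (a vertical edge).
BLACK, WHITE = (0, 0, 0), (255, 255, 255)
frame = [[BLACK, BLACK, WHITE, WHITE] for _ in range(4)]

gray = rgb2gray(frame)
edges = sobel_filter(gray)
flat = sobel_filter([[7, 7, 7], [7, 7, 7], [7, 7, 7]])

print("gray :", gray[0])
print("edges:", edges)
print("flat :", flat)
```

Each output pixel depends only on a 3x3 neighbourhood, which is what makes these filters natural candidates for the programmable logic side of the Zynq.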
Example project-1: attendance
Example project-1
Training set preparation – limited set
Increase the training set by:
• Rotation (90°, 180°)
• Distortion (Singular value decomposition)
• Filters (Grey scale)
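The rotation step of this augmentation can be sketched in plain Python on images stored as lists of rows. The function names are illustrative and the project's actual tooling is not shown in the slides; the distortion and filter steps are left out.

```python
def rotate90(image):
    """Rotate a 2-D list of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate180(image):
    """Rotate a 2-D list of pixels 180 degrees."""
    return [row[::-1] for row in image[::-1]]


def augment(image):
    """Return the original image plus its 90 and 180 degree rotations."""
    return [image, rotate90(image), rotate180(image)]


sample = [[1, 2],
          [3, 4]]
augmented = augment(sample)
for variant in augmented:
    print(variant)
```

Each source picture yields three training samples here, which helps when only a limited set of pictures is available.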
Example project-1
Workflow (SDSoC)
• Video for Camera Input
• White Balance Filter for Input Image
• Compare Frame to Network
• Frame Converted to Picture
• Matching
Example project-2: marks detection
• Binarized Neural Network (BNN) – in Zynq
Learning Machine Learning by Design
End of Presentation – thanks to
Project-1: Amos Ho | Andrew Sng | Sabareesh Nair | Stanley Loh | Threvin Anand | Yap Pin Yaw
Project-2: Jiong Le | Jien Yi