Jointly Optimize Data Augmentation and Network Training

Adversarial Data Augmentation in Human Pose Estimation

Xi Peng*†, Zhiqiang Tang*†, Fei Yang§, Rogerio Feris‡, Dimitris Metaxas†
† Rutgers, The State University of New Jersey   § Facebook   ‡ IBM T.J. Watson Research Center
* Contributed equally. Project page: https://sites.google.com/site/xipengcshomepage/cvpr2018

Highlights

Motivation:
❖ Random data augmentation applies the same augmentation strategy (scaling, rotating, occluding) to all training data, ignoring individual differences.
❖ The data augmentation cannot follow the network training; the two are optimized in isolation.
❖ Ineffective data augmentation leads to ineffective training.

Our approach:
❖ Jointly optimize data augmentation and network training: more effective data augmentation leads to more effective training.
❖ The augmentation network acts as an agent that generates “hard” data augmentations (scaling, rotating, occluding) for the target network, and the target network’s loss (HG loss) is returned as a reward/penalty.

Main Contributions:
❖ Enhance the training effect without looking for more labeled data.
❖ Plug-and-play for general target networks, e.g. image classification and segmentation.
❖ Joint optimization without stop-and-retraining.

Approach

(a) Generation path (augmentation network): generate more difficult (“harder”) data augmentations than random ones.

    max_{θ_G}  E_{x∼Ω, τ_h∼G(x, θ_G)}  L[D(τ_h(x), y)]                                “hard” aug.

(b) Discrimination path (target network): evaluate the generation quality and learn from the “hard” data augmentations together with “random” ones.

    min_{θ_D}  E_{x∼Ω, τ_r, τ_h∼G(x, θ_G)}  L[D(τ_r(x), y)] + L[D(τ_h(x), y)]         “random” aug. + “hard” aug.

Here Ω is the training data distribution, τ_r is a random augmentation, τ_h is a “hard” augmentation generated by the augmentation network G, D is the target network, and L is its training loss.
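A minimal PyTorch-style sketch of one joint update under this min/max objective is given below. The module and helper names (pose_net, aug_net, criterion, random_augment, hard_augment, opt_D, opt_G) are placeholders, and the score-function (reward/penalty) update for the augmentation network is one plausible realization of the “agent” view above, not necessarily the authors’ exact rule.

    import torch

    def joint_step(pose_net, aug_net, x, y, criterion, opt_D, opt_G,
                   random_augment, hard_augment):
        # ---- Discrimination path: update theta_D on "random" + "hard" samples ----
        with torch.no_grad():
            mu, log_sigma = aug_net(x)                      # assumed output: augmentation distribution params
            tau_h = torch.distributions.Normal(mu, log_sigma.exp()).sample()
        loss_D = criterion(pose_net(random_augment(x)), y) \
               + criterion(pose_net(hard_augment(x, tau_h)), y)
        opt_D.zero_grad()
        loss_D.backward()
        opt_D.step()

        # ---- Generation path: update theta_G to propose harder augmentations ----
        mu, log_sigma = aug_net(x)
        dist = torch.distributions.Normal(mu, log_sigma.exp())
        tau_h = dist.sample()                               # sampling is non-differentiable
        with torch.no_grad():
            reward = criterion(pose_net(hard_augment(x, tau_h)), y)
        loss_G = -(reward * dist.log_prob(tau_h).sum())     # ascend the target network's loss
        opt_G.zero_grad()
        loss_G.backward()
        opt_G.step()
        return loss_D.item(), loss_G.item()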

(c) Adversarial data augmentation:
❖ Scaling and rotating: sampled jointly from mixed Gaussian distributions whose parameters are predicted by the augmentation network.
❖ Hierarchical occluding: scaled-up masks applied to feature maps at different resolutions (a sketch of both operations follows below).
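For illustration only, the sketch below implements the two operations under assumed tensor layouts: the augmentation network is taken to output per-image mixture weights pi of shape (B, K) and component parameters mu, sigma of shape (B, K, 2), with channel 0 the scale and channel 1 the rotation; the coarse occluding mask is assumed to be (B, 1, h, w).

    import torch
    import torch.nn.functional as F

    def sample_scale_rotation(pi, mu, sigma):
        """Draw a (scale, rotation) pair per image from the predicted Gaussian mixture."""
        k = torch.distributions.Categorical(pi).sample()           # (B,) chosen mixture component
        idx = k.view(-1, 1, 1).expand(-1, 1, 2)                    # index into (B, K, 2)
        mu_k = mu.gather(1, idx).squeeze(1)                        # (B, 2) selected means
        sigma_k = sigma.gather(1, idx).squeeze(1)                  # (B, 2) selected std devs
        return torch.normal(mu_k, sigma_k)                         # column 0: scale, column 1: rotation

    def hierarchical_occlude(feat, coarse_mask):
        """Occlude a feature map with a coarse mask scaled up to the feature resolution."""
        mask = F.interpolate(coarse_mask, size=feat.shape[-2:], mode="nearest")
        return feat * (1.0 - mask)                                 # zero out the occluded region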

Results

(a) Network architecture:
❖ Target (pose) network: stacked hourglass network [HGs] built from residual blocks (BN-ReLU-Conv1x1-BN-ReLU-Conv3x3-BN-ReLU-Conv1x1), with pooling and upsampling across the 64×64, 32×32, 16×16, 8×8, and 4×4 resolutions (128C/256C feature channels).
❖ Augmentation network: half hourglass network built from dense blocks (BN-ReLU-Conv1x1-BN-ReLU-Conv3x3) and transitions (BN-ReLU-Conv1x1-Pooling or Upsampling); see the block sketch below.
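As a concrete reading of the block definitions listed above, here is a sketch of the two building blocks as PyTorch modules; only the BN-ReLU-Conv ordering follows the poster, while the channel widths (bottleneck, growth) are placeholders.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """Target network block: BN-ReLU-Conv1x1 - BN-ReLU-Conv3x3 - BN-ReLU-Conv1x1 + skip."""
        def __init__(self, channels, bottleneck):
            super().__init__()
            self.body = nn.Sequential(
                nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
                nn.Conv2d(channels, bottleneck, 1),
                nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
                nn.Conv2d(bottleneck, bottleneck, 3, padding=1),
                nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
                nn.Conv2d(bottleneck, channels, 1),
            )
        def forward(self, x):
            return x + self.body(x)                        # identity skip connection

    class DenseLayer(nn.Module):
        """Augmentation network layer: BN-ReLU-Conv1x1 - BN-ReLU-Conv3x3, output concatenated."""
        def __init__(self, in_channels, growth):
            super().__init__()
            self.body = nn.Sequential(
                nn.BatchNorm2d(in_channels), nn.ReLU(inplace=True),
                nn.Conv2d(in_channels, 4 * growth, 1),
                nn.BatchNorm2d(4 * growth), nn.ReLU(inplace=True),
                nn.Conv2d(4 * growth, growth, 3, padding=1),
            )
        def forward(self, x):
            return torch.cat([x, self.body(x)], dim=1)     # dense connectivity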

(b) Visualization of training status:
Scale and rotation distributions (mixed Gaussians) predicted by the augmentation network, shown at 10, 100, and 200 epochs over a rotation range of roughly -60 to 60 degrees and a scale range of roughly 0.7 to 1.3.
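One way to reproduce this kind of plot is to evaluate the predicted 1-D mixture density on a grid at a few checkpoints; the checkpoint dictionary and parameter values below are placeholders for illustration only.

    import numpy as np
    import matplotlib.pyplot as plt

    def mixture_pdf(grid, pi, mu, sigma):
        """Evaluate a 1-D Gaussian-mixture density (weights pi, means mu, stds sigma) on grid."""
        comp = np.exp(-0.5 * ((grid[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        return comp @ pi

    # Placeholder mixture parameters for three checkpoints (illustration only).
    checkpoints = {
        10:  (np.array([0.5, 0.5]), np.array([-10.0, 10.0]), np.array([15.0, 15.0])),
        100: (np.array([0.5, 0.5]), np.array([-25.0, 25.0]), np.array([12.0, 12.0])),
        200: (np.array([0.5, 0.5]), np.array([-35.0, 35.0]), np.array([10.0, 10.0])),
    }

    grid = np.linspace(-60, 60, 241)                       # rotation range shown on the poster
    for epoch, (pi, mu, sigma) in checkpoints.items():
        plt.plot(grid, mixture_pdf(grid, pi, mu, sigma), label=f"{epoch} epoch")
    plt.xlabel("Rotation (degrees)")
    plt.legend()
    plt.show()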

(c) Comparison of PCK@0.2 on LSP:

Method            Head   Elbow  Wrist  Hip    Knee   Ankle  Mean
Chu et al.        98.1   89.3   86.9   93.4   94.0   92.5   92.6
HGs (8)           98.2   91.2   87.2   93.5   94.5   92.6   93.0
HGs (8) + Ours    98.6   92.8   90.0   94.8   95.3   94.5   94.5
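PCK@0.2 counts a predicted joint as correct when its distance to the ground truth is within 0.2 of a per-image reference length. A minimal NumPy sketch of the metric follows; the array layout and the choice of reference length (e.g. torso size for LSP) are assumptions for illustration.

    import numpy as np

    def pck(pred, gt, ref_len, alpha=0.2):
        """pred, gt: (N, K, 2) keypoint coordinates; ref_len: (N,) reference length per image.
        Returns the per-joint PCK@alpha in percent."""
        dist = np.linalg.norm(pred - gt, axis=-1)          # (N, K) distances in pixels
        correct = dist <= alpha * ref_len[:, None]         # threshold scaled per image
        return 100.0 * correct.mean(axis=0)                # (K,) percentage of correct keypoints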


References:
[HGs] Newell et al., “Stacked hourglass networks for human pose estimation.” In ECCV 2016.
[HPG] Wang et al., “A-Fast-RCNN: Hard positive generation via adversary for object detection.” In CVPR 2017.