Jointly Optimize Data Augmentation and Network Training

Adversarial Data Augmentation in Human Pose Estimation

Xi Peng*†, Zhiqiang Tang*†, Fei Yang§, Rogerio Feris‡, Dimitris Metaxas†
† Rutgers, The State University of New Jersey   § Facebook   ‡ IBM T.J. Watson Research Center
* Contributed equally. Project page: https://sites.google.com/site/xipengcshomepage/cvpr2018

Highlights

Motivation:
❖ Random data augmentation applies the same augmentation strategy (scaling, rotating, occluding) to all training data, ignoring individual differences.
❖ The data augmentation cannot follow the network training; the two are optimized in isolation.
❖ Ineffective data augmentation leads to ineffective training.

Our approach:
❖ Jointly optimize data augmentation and network training: more effective data augmentation leads to more effective training.
❖ The augmentation network acts as an agent that generates “hard” data augmentations (scaling, rotating, occluding) for the target network, and the target network’s loss (HG loss) is returned as a reward/penalty.

Main Contributions:
❖ Enhance the training effect without looking for more labeled data.
❖ Plug-and-play for general target networks, e.g. image classification and segmentation.
❖ Joint optimization without stop-and-retraining.

Approach

(a) Generation path (augmentation network): generate more difficult (“harder”) data augmentations than random ones.

    max_{θ_G}  E_{x∼Ω, τ_h∼G(x, θ_G)}  L[D(τ_h(x), y)]                                “hard” aug.

(b) Discrimination path (target network): evaluate the generation quality and learn from the “hard” data augmentations together with “random” ones.

    min_{θ_D}  E_{x∼Ω, τ_r, τ_h∼G(x, θ_G)}  L[D(τ_r(x), y)] + L[D(τ_h(x), y)]         “random” aug. + “hard” aug.

Here Ω is the training data distribution, τ_r is a random augmentation, τ_h is a “hard” augmentation generated by the augmentation network G, D is the target network, and L is its training loss.
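A minimal PyTorch-style sketch of one joint update under this min/max objective is given below. The module and helper names (pose_net, aug_net, criterion, random_augment, hard_augment, opt_D, opt_G) are placeholders, and the score-function (reward/penalty) update for the augmentation network is one plausible realization of the “agent” view above, not necessarily the authors’ exact rule.

    import torch

    def joint_step(pose_net, aug_net, x, y, criterion, opt_D, opt_G,
                   random_augment, hard_augment):
        # ---- Discrimination path: update theta_D on "random" + "hard" samples ----
        with torch.no_grad():
            mu, log_sigma = aug_net(x)                      # assumed output: augmentation distribution params
            tau_h = torch.distributions.Normal(mu, log_sigma.exp()).sample()
        loss_D = criterion(pose_net(random_augment(x)), y) \
               + criterion(pose_net(hard_augment(x, tau_h)), y)
        opt_D.zero_grad()
        loss_D.backward()
        opt_D.step()

        # ---- Generation path: update theta_G to propose harder augmentations ----
        mu, log_sigma = aug_net(x)
        dist = torch.distributions.Normal(mu, log_sigma.exp())
        tau_h = dist.sample()                               # sampling is non-differentiable
        with torch.no_grad():
            reward = criterion(pose_net(hard_augment(x, tau_h)), y)
        loss_G = -(reward * dist.log_prob(tau_h).sum())     # ascend the target network's loss
        opt_G.zero_grad()
        loss_G.backward()
        opt_G.step()
        return loss_D.item(), loss_G.item()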

(c) Adversarial data augmentation:
❖ Scaling and rotating: sampled jointly from mixed Gaussian distributions whose parameters are predicted by the augmentation network.
❖ Hierarchical occluding: scaled-up masks applied to feature maps at different resolutions (a sketch of both operations follows below).
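For illustration only, the sketch below implements the two operations under assumed tensor layouts: the augmentation network is taken to output per-image mixture weights pi of shape (B, K) and component parameters mu, sigma of shape (B, K, 2), with channel 0 the scale and channel 1 the rotation; the coarse occluding mask is assumed to be (B, 1, h, w).

    import torch
    import torch.nn.functional as F

    def sample_scale_rotation(pi, mu, sigma):
        """Draw a (scale, rotation) pair per image from the predicted Gaussian mixture."""
        k = torch.distributions.Categorical(pi).sample()           # (B,) chosen mixture component
        idx = k.view(-1, 1, 1).expand(-1, 1, 2)                    # index into (B, K, 2)
        mu_k = mu.gather(1, idx).squeeze(1)                        # (B, 2) selected means
        sigma_k = sigma.gather(1, idx).squeeze(1)                  # (B, 2) selected std devs
        return torch.normal(mu_k, sigma_k)                         # column 0: scale, column 1: rotation

    def hierarchical_occlude(feat, coarse_mask):
        """Occlude a feature map with a coarse mask scaled up to the feature resolution."""
        mask = F.interpolate(coarse_mask, size=feat.shape[-2:], mode="nearest")
        return feat * (1.0 - mask)                                 # zero out the occluded region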

Results

(a) Network architecture:
❖ Target (pose) network: stacked hourglass network [HGs] built from residual blocks (BN-ReLU-Conv1x1-BN-ReLU-Conv3x3-BN-ReLU-Conv1x1), with pooling and upsampling across the 64×64, 32×32, 16×16, 8×8, and 4×4 resolutions (128C/256C feature channels).
❖ Augmentation network: half hourglass network built from dense blocks (BN-ReLU-Conv1x1-BN-ReLU-Conv3x3) and transitions (BN-ReLU-Conv1x1-Pooling or Upsampling); see the block sketch below.
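As a concrete reading of the block definitions listed above, here is a sketch of the two building blocks as PyTorch modules; only the BN-ReLU-Conv ordering follows the poster, while the channel widths (bottleneck, growth) are placeholders.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """Target network block: BN-ReLU-Conv1x1 - BN-ReLU-Conv3x3 - BN-ReLU-Conv1x1 + skip."""
        def __init__(self, channels, bottleneck):
            super().__init__()
            self.body = nn.Sequential(
                nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
                nn.Conv2d(channels, bottleneck, 1),
                nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
                nn.Conv2d(bottleneck, bottleneck, 3, padding=1),
                nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
                nn.Conv2d(bottleneck, channels, 1),
            )
        def forward(self, x):
            return x + self.body(x)                        # identity skip connection

    class DenseLayer(nn.Module):
        """Augmentation network layer: BN-ReLU-Conv1x1 - BN-ReLU-Conv3x3, output concatenated."""
        def __init__(self, in_channels, growth):
            super().__init__()
            self.body = nn.Sequential(
                nn.BatchNorm2d(in_channels), nn.ReLU(inplace=True),
                nn.Conv2d(in_channels, 4 * growth, 1),
                nn.BatchNorm2d(4 * growth), nn.ReLU(inplace=True),
                nn.Conv2d(4 * growth, growth, 3, padding=1),
            )
        def forward(self, x):
            return torch.cat([x, self.body(x)], dim=1)     # dense connectivity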

(b) Visualization of training status:
Scale and rotation distributions (mixed Gaussians) predicted by the augmentation network, shown at 10, 100, and 200 epochs over a rotation range of roughly -60 to 60 degrees and a scale range of roughly 0.7 to 1.3.
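One way to reproduce this kind of plot is to evaluate the predicted 1-D mixture density on a grid at a few checkpoints; the checkpoint dictionary and parameter values below are placeholders for illustration only.

    import numpy as np
    import matplotlib.pyplot as plt

    def mixture_pdf(grid, pi, mu, sigma):
        """Evaluate a 1-D Gaussian-mixture density (weights pi, means mu, stds sigma) on grid."""
        comp = np.exp(-0.5 * ((grid[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        return comp @ pi

    # Placeholder mixture parameters for three checkpoints (illustration only).
    checkpoints = {
        10:  (np.array([0.5, 0.5]), np.array([-10.0, 10.0]), np.array([15.0, 15.0])),
        100: (np.array([0.5, 0.5]), np.array([-25.0, 25.0]), np.array([12.0, 12.0])),
        200: (np.array([0.5, 0.5]), np.array([-35.0, 35.0]), np.array([10.0, 10.0])),
    }

    grid = np.linspace(-60, 60, 241)                       # rotation range shown on the poster
    for epoch, (pi, mu, sigma) in checkpoints.items():
        plt.plot(grid, mixture_pdf(grid, pi, mu, sigma), label=f"{epoch} epoch")
    plt.xlabel("Rotation (degrees)")
    plt.legend()
    plt.show()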

(c) Comparison of PCK@0.2 on LSP:

Method            Head   Elbow  Wrist  Hip    Knee   Ankle  Mean
Chu et al.        98.1   89.3   86.9   93.4   94.0   92.5   92.6
HGs (8)           98.2   91.2   87.2   93.5   94.5   92.6   93.0
HGs (8) + Ours    98.6   92.8   90.0   94.8   95.3   94.5   94.5
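PCK@0.2 counts a predicted joint as correct when its distance to the ground truth is within 0.2 of a per-image reference length. A minimal NumPy sketch of the metric follows; the array layout and the choice of reference length (e.g. torso size for LSP) are assumptions for illustration.

    import numpy as np

    def pck(pred, gt, ref_len, alpha=0.2):
        """pred, gt: (N, K, 2) keypoint coordinates; ref_len: (N,) reference length per image.
        Returns the per-joint PCK@alpha in percent."""
        dist = np.linalg.norm(pred - gt, axis=-1)          # (N, K) distances in pixels
        correct = dist <= alpha * ref_len[:, None]         # threshold scaled per image
        return 100.0 * correct.mean(axis=0)                # (K,) percentage of correct keypoints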


References:
[HGs] Newell et al., “Stacked hourglass networks for human pose estimation.” In ECCV 2016.
[HPG] Wang et al., “A-Fast-RCNN: Hard positive generation via adversary for object detection.” In CVPR 2017.