Fusing the Global and Local Deep Representation for Effective Object Retrieval

Wang Mao

Public code: https://github.com/wangmaoCS/Fusion_RMAC

Contents
• Introduction
• Motivation
• Method & Results
• Conclusion

1, Introduction: image retrieval

[Figure: a query image is searched against the database images, producing a ranked list of results.]

Object retrieval: a rectangle in the query image indicates the location of the query object.

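1, Introduction: MAC feature

[Figure: an input image is passed through a CNN (AlexNet, VGG-net, ...) to produce convolutional activations of size W × H × K; taking the maximum over each of the K channels gives the K-dim MAC.]

Maximum Activations of Convolutions (MAC):
• simple to extract
• low dimension: K = 256 (AlexNet), 512 (VGG-16), 1024 (ResNet)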
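The max-pooling step behind MAC can be sketched in a few lines. This is a minimal illustration, assuming the convolutional activations are already available as a NumPy array of shape (K, H, W); the function name `mac` and the toy values are hypothetical and do not come from the public code.

```python
import numpy as np


def mac(activations: np.ndarray) -> np.ndarray:
    """Max-pool a (K, H, W) activation tensor over its spatial grid.

    Returns a K-dim vector holding one maximum per channel.
    """
    if activations.ndim != 3:
        raise ValueError("expected activations of shape (K, H, W)")
    return activations.max(axis=(1, 2))


# Toy activations standing in for a real CNN output: K=2, H=2, W=3.
toy = np.array([
    [[0.0, 1.0, 2.0], [3.0, 0.5, 0.1]],
    [[0.2, 0.0, 0.0], [0.0, 0.9, 0.4]],
])
descriptor = mac(toy)  # channel-wise maxima: 3.0 and 0.9
```

The descriptor length depends only on K, not on the image size, which is why the MAC stays low-dimensional.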

1, Introduction: Regional MAC (R-MAC)

[Figure: MAC features are extracted from N sub-regions (2+6+12) of the W × H × K convolutional activations, giving N K-dim MAC features; these are combined by sum pooling and L2 normalization into the K-dim R-MAC.]

R-MAC:
1, information from multi-scale regions
2, low dimension
3, easy to extract
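The regional aggregation can be sketched as follows. This is a simplified illustration under stated assumptions: the regions are passed in explicitly as (top, left, height, width) boxes on the feature map instead of reproducing the 2+6+12 multi-scale grid, each regional MAC is L2-normalized before sum pooling, and any further per-region post-processing is left out. The names `rmac` and `l2_normalize` are hypothetical.

```python
import numpy as np


def l2_normalize(v: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    return v / (np.linalg.norm(v) + eps)


def rmac(activations: np.ndarray, regions) -> np.ndarray:
    """Aggregate regional MACs into one K-dim descriptor.

    activations: array of shape (K, H, W).
    regions: iterable of (top, left, height, width) boxes on the feature map.
    """
    pooled = np.zeros(activations.shape[0], dtype=np.float64)
    for top, left, height, width in regions:
        patch = activations[:, top:top + height, left:left + width]
        pooled += l2_normalize(patch.max(axis=(1, 2)))  # regional MAC
    return l2_normalize(pooled)  # sum pooling, then L2 normalization


# Toy example: random activations and four hand-picked regions (assumed values).
rng = np.random.default_rng(0)
toy_activations = rng.random((8, 6, 9))
toy_regions = [(0, 0, 6, 6), (0, 3, 6, 6), (0, 0, 4, 4), (2, 5, 4, 4)]
descriptor = rmac(toy_activations, toy_regions)
```

Because the regional vectors are summed, the output keeps the same K dimensions as a single MAC regardless of N.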

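1, R-MAC based object retrieval: pipeline

Step 1: Initial Retrieval
• the query R-MAC q is compared with the database R-MACs X
• database images are ranked by q^T X, giving the initial results

Step 2: Reranking by Localization
• the query object is localized in each of the top-1k results by searching on the feature map
• the top-1k results are reranked

Step 3: Query Expansion
• q and the top-5 results are merged (+) into a new query q1
• a second search ranks by q1^T X, giving the final results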
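Step 1 of the pipeline amounts to one matrix-vector product followed by a sort. A minimal sketch, assuming the database descriptors are stored as the columns of X and that all descriptors are L2-normalized, so the dot product behaves like a cosine similarity; the function name `search` and the toy data are hypothetical.

```python
import numpy as np


def search(q: np.ndarray, X: np.ndarray, top_k=None):
    """Rank database images by the similarity q^T X.

    q: query descriptor of shape (D,).
    X: database descriptors of shape (D, N), one image per column.
    Returns (indices, scores), both sorted by descending score.
    """
    scores = q @ X
    order = np.argsort(-scores, kind="stable")
    if top_k is not None:
        order = order[:top_k]
    return order, scores[order]


# Toy database of three unit-norm 2-dim descriptors (assumed values).
X = np.array([[1.0, 0.0, 0.6],
              [0.0, 1.0, 0.8]])
q = np.array([1.0, 0.0])
ranking, ranked_scores = search(q, X)
```

In the pipeline above, the same call with `top_k=1000` would give the shortlist that is passed on to reranking.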

2, Motivation

Existing work: deep learning for R-MAC representation

[Figure: related work built around R-MAC. Labels in the figure: R-MAC (ICLR 2015), SiaMAC (ECCV 2016), Deep Retrieval (ECCV 2016), IJCV 2017, and CVPR 2016 with "Application: Similarity Search, Manifold Diffusion".]

Limitations:
1, target particular datasets
2, need annotated data & GPUs

Our purpose:
1, optimize the retrieval pipeline, not R-MAC
2, utilize public deep models, without training

2, Motivation: the problem of R-MAC (1/3)

[Figure: the query object inside the query image, next to a database image.]
• Query R-MAC: from the query object
• Database R-MAC: from the whole image

Asymmetric comparison: the similarity is under-estimated in (1) initial retrieval

2, Motivation: the problem of R-MAC (2/3)

[Figure: a query with the object localization result cropped from the ICLR 2015 paper (R-MAC), next to our results.]

Unreliable localization: bad (2) reranking & (3) query expansion

2, Motivation: the problem of R-MAC (3/3)

[Figure: a query and a false result.]

Relying only on local information leads to false results

3, Method: Initial retrieval with global R-MAC

Step 1: Initial Retrieval

[Figure: the query object, the query image, and a database image.]
• Asymmetric comparison: query R-MAC from the query object vs. database R-MAC from the whole image
• Symmetric comparison: query R-MAC from the whole query image vs. database R-MAC from the whole image

3, Method: Reranking with fusing R-MACs

Step 2: Reranking by Localization (R)
• R-MAC based retrieval, only using local information: Query Object (RMAC_qL) × Located Object (RMAC_dL)
• The proposed approach, fusing the local and global R-MAC: [RMAC_qL ; RMAC_qG] × [RMAC_dL ; RMAC_dG], where RMAC_qG comes from the query image and RMAC_dG from the database image
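The fused comparison can be sketched with plain concatenation. This is a minimal illustration, assuming every local and global R-MAC is already L2-normalized and that the concatenated vectors are compared by a dot product without further normalization (the slides do not say whether the concatenation is re-normalized). Under these assumptions the fused score is simply the local similarity plus the global similarity. The function names and the toy descriptors are hypothetical.

```python
import numpy as np


def fused_scores(q_local, q_global, d_local, d_global):
    """Score shortlist images with concatenated [local ; global] descriptors.

    q_local, q_global: query descriptors of shape (D,).
    d_local, d_global: shortlist descriptors of shape (D, N), one image per
    column (d_local comes from the located object, d_global from the image).
    """
    q = np.concatenate([q_local, q_global])          # shape (2D,)
    d = np.concatenate([d_local, d_global], axis=0)  # shape (2D, N)
    return q @ d


def rerank(shortlist, scores):
    """Reorder shortlist entries by descending score."""
    order = np.argsort(-scores, kind="stable")
    return [shortlist[i] for i in order]


# Toy shortlist of two images (assumed values).
q_local, q_global = np.array([1.0, 0.0]), np.array([0.0, 1.0])
d_local = np.array([[1.0, 0.8], [0.0, 0.6]])   # columns: img_A, img_B
d_global = np.array([[1.0, 0.0], [0.0, 1.0]])  # columns: img_A, img_B
shortlist = ["img_A", "img_B"]

local_only = rerank(shortlist, q_local @ d_local)
fused = rerank(shortlist, fused_scores(q_local, q_global, d_local, d_global))
```

In this toy case the local-only ranking puts img_A first, while the fused ranking prefers img_B because its global descriptor also agrees with the query.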

3, Method: QE with fusing R-MACs

Step 3: Query Expansion (QE)

QE in R-MAC based approach:
1, merging: q1 = Mean(q + top-5 results)
2, querying: retrieval with q1 (q1^T X) in the top-1k reranked results, giving the final results

The proposed approach:
• concatenating the local and global R-MAC features: RMAC_L -> [RMAC_L ; RMAC_G]
• the local representation is replaced by the local+global representation
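The two QE steps can be sketched as follows. This is a minimal illustration, assuming the shortlist descriptors are stored as columns in ranked order and that the merged query is re-normalized to unit length (the re-normalization is an assumption, not stated on the slide). The same functions apply whether the descriptors are local R-MACs or the concatenated [RMAC_L ; RMAC_G] vectors. All names and toy values are hypothetical.

```python
import numpy as np


def expand_query(q, ranked_descriptors, top_n=5):
    """Merge step: q1 = mean of q and the top_n ranked descriptors.

    q: query descriptor of shape (D,).
    ranked_descriptors: shape (D, M), columns in ranked order.
    """
    stacked = np.column_stack([q, ranked_descriptors[:, :top_n]])
    q1 = stacked.mean(axis=1)
    return q1 / (np.linalg.norm(q1) + 1e-12)  # assumed re-normalization


def requery(q1, shortlist_descriptors):
    """Query step: rank the shortlist by q1^T X."""
    scores = q1 @ shortlist_descriptors
    order = np.argsort(-scores, kind="stable")
    return order, scores[order]


# Toy shortlist of two descriptors; only the top-1 result is merged here.
q = np.array([1.0, 0.0])
ranked = np.array([[0.0, 0.6],
                   [1.0, 0.8]])
q1 = expand_query(q, ranked, top_n=1)
final_order, final_scores = requery(q1, ranked)
```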

3, Method: Analysis

• The proposed approach:
  • step 1, Initial retrieval: local R-MAC -> global R-MAC
  • step 2, Reranking: local R-MAC -> (local + global) R-MAC
  • step 3, Query expansion: local R-MAC -> (local + global) R-MAC
• Extra computation:
  • extracting global R-MAC for the query (less than 1s)
• Extra memory:
  • concatenation of local and global R-MAC
  • reranking on top-1k results
  • VGG-16 deep net: 512d R-MAC -> 1024d R-MAC
  • 512*1000*4B = 2MB
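The memory figure can be checked with a short calculation. This sketch assumes the extra cost is one additional 512-dim vector of 4-byte floats for each of the 1000 shortlisted images; the function name is hypothetical.

```python
def extra_memory_bytes(dim=512, shortlist_size=1000, bytes_per_value=4):
    """Bytes needed for one extra dim-dimensional vector per shortlisted image."""
    return dim * shortlist_size * bytes_per_value


extra = extra_memory_bytes()     # 512 * 1000 * 4 = 2,048,000 bytes
extra_mb = extra / 1e6           # about 2.05 MB (decimal megabytes)
extra_mib = extra / 2 ** 20      # about 1.95 MiB (binary megabytes)
```

Either way of counting rounds to the 2MB quoted above.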

3, Results: experimental setup

• Public datasets
  • Oxford5k: 5063 images, 55 queries, 11 landmarks of Oxford Univ
  • Paris6k: 6392 images, 55 queries, 11 landmarks of Paris
  • Flickr100k: 100k distractor images
  • Oxford105k: Oxford5k + Flickr100k
  • Paris106k: Paris6k + Flickr100k
• Comparison
  • ICLR 2015: R-MAC (VGG-16 deep model)
  • ECCV 2016: Sia-RMAC (fine-tuned VGG-16)
• Evaluation
  • mean Average Precision (mAP)
  • the area under the precision-recall curve, in (0, 1]
  • the higher, the better
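A common way to compute this metric is sketched below. This is a simplified illustration, assuming each query has a ranked list of image IDs and a set of relevant IDs; it uses the standard "mean of precision at each relevant hit" form of average precision. The official evaluation protocol for these datasets may differ in details (for instance in how ambiguous images are treated), which this sketch does not reproduce. The function names and IDs are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of the precision values measured at each relevant hit."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """Mean of the per-query average precision values."""
    aps = [average_precision(rankings[q], ground_truth[q]) for q in rankings]
    return sum(aps) / len(aps)


# Toy example with two queries (assumed IDs).
rankings = {"q1": ["a", "b", "c", "d"], "q2": ["x", "y"]}
ground_truth = {"q1": {"a", "c"}, "q2": {"y"}}
ap_q1 = average_precision(rankings["q1"], ground_truth["q1"])  # (1/1 + 2/3) / 2
ap_q2 = average_precision(rankings["q2"], ground_truth["q2"])  # (1/2) / 1
m_ap = mean_average_precision(rankings, ground_truth)
```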

3, Results: Initial retrieval

Replacing the local query R-MAC with the global one improves the initial accuracy.

[Table: retrieval accuracy (mAP %) on the Oxford and Paris datasets (ICLR 2015, VGG-16)]

[Table: retrieval accuracy (mAP %) on the Oxford and Paris datasets (ECCV 2016, siaMAC)]

3, Results: Reranking & Query Expansion

The fusion of local and global R-MAC gives a more comprehensive representation.

[Table: retrieval accuracy (mAP %) on the Oxford and Paris datasets (ICLR 2015, VGG-16)]

[Table: retrieval accuracy (mAP %) on the Oxford and Paris datasets (ECCV 2016, siaMAC)]

Conclusion

• Revised R-MAC
  • step 1, Initial retrieval: local R-MAC vs global R-MAC
  • step 2, Reranking: local R-MAC vs (local+global) R-MAC
  • step 3, Query expansion: local R-MAC vs (local+global) R-MAC
• Pros
  • retrieval accuracy improvement
  • low extra computation and memory cost
• Cons
  • suited to landmark image retrieval, not to generic object image retrieval
