ALADDIN
A Locality Aligned Deep Model for Instance Search
Wenhui Jiang, Zhicheng Zhao, Fei Su, Anni Cai
Beijing University of Posts and Telecommunications
Introduction
• Instance search means [figure: two example queries]
Challenges
• Asymmetrical Similarity
[Figure: instance search vs. similar image search / content-based image search]
• Robust feature representation
Related Work
• Bag of visual words (Sivic et al., ICCV 2003)
• Vocabulary tree (Nister et al., CVPR 2006)
• Hamming Embedding (Jegou et al., ECCV 2008)
• Bag of boundaries (Arandjelovic et al., ICCV 2011)
• Randomized Visual Phrases (Jiang et al., CVPR 2012, TIP 2015)
• Point-indexing (Tao et al., CVPR 2014)
Local features based systems
• Hand-crafted features
• Incapable of describing small or smooth objects
Related Work
• Graph Fusion (Zhang et al., ECCV 2012)
• Co-indexing (Zhang et al., ICCV 2013, PAMI 2015)
• Query-Adaptive Fusion (Zheng et al., CVPR 2015)
• Generic Attributes and Categories (Tao et al., CVPR 2015)
Combining different features (local & global)
• DeCAF: category-level distinction only
Deep only systems should solve …
• Similarity measurement
Discount for background clutter
• CNN architecture
Capture instance-level distinction
• Collect large scale training data
No labelled training data for instance search benchmarks
Architecture overview
Step 1: Decompose a dataset image D into a small set of candidate regions
• Locality aligned
Step 2: Given a pre-trained CNN model, extract a feature vector for each region.
• Instance-level distinction
Step 3: Encode the feature vectors using product quantization (PQ) and organize them in an inverted file system (IFS).
• Highly efficient
Architecture overview
Step 1: Given a pre-trained CNN model, extract feature vectors for regions.
Step 2: Search and rank in the IFS.
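The indexing and search steps above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the pre-trained `coarse_centroids` and `codebooks`, and the choice to quantize raw vectors rather than residuals are all assumptions made for illustration, and a real system would learn the codebooks with k-means and probe more than one cell.

```python
from collections import defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))

def pq_encode(vec, codebooks):
    """Split vec into len(codebooks) equal sub-vectors and quantize each one."""
    m = len(codebooks)
    step = len(vec) // m
    return tuple(
        nearest(vec[j * step:(j + 1) * step], codebooks[j]) for j in range(m)
    )

def build_index(features, coarse_centroids, codebooks):
    """Inverted file: coarse cell id -> list of (region_id, pq_code)."""
    index = defaultdict(list)
    for region_id, vec in features.items():
        cell = nearest(vec, coarse_centroids)
        index[cell].append((region_id, pq_encode(vec, codebooks)))
    return index

def search(query, index, coarse_centroids, codebooks):
    """Rank regions in the query's coarse cell by asymmetric PQ distance."""
    m = len(codebooks)
    step = len(query) // m
    # Lookup tables: distance from each query sub-vector to every sub-centroid.
    tables = [
        [sq_dist(query[j * step:(j + 1) * step], c) for c in codebooks[j]]
        for j in range(m)
    ]
    cell = nearest(query, coarse_centroids)
    scored = [
        (sum(tables[j][code[j]] for j in range(m)), region_id)
        for region_id, code in index.get(cell, [])
    ]
    return [region_id for _, region_id in sorted(scored)]
```

Only the short PQ codes are stored per region, and a query touches a single inverted list, which is where the efficiency claimed for Step 3 comes from in schemes of this kind.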
CNN Network
Triplets as inputs
• Query
• Positive
• Negative
The training goal is to keep patches from the same object closer to each other than to patches from different objects, by a large margin.
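The training goal above is commonly expressed as a hinge loss over (query, positive, negative) triplets. The slides do not give the exact formula, so the sketch below is an assumption: it uses squared Euclidean distance between feature vectors and a hypothetical `margin` of 1.0.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_hinge_loss(query, positive, negative, margin=1.0):
    """Zero when the negative is farther from the query than the positive
    by at least `margin`; grows linearly with the violation otherwise."""
    d_pos = squared_distance(query, positive)
    d_neg = squared_distance(query, negative)
    return max(0.0, margin + d_pos - d_neg)
```

A triplet whose negative patch is already far from the query contributes no loss, so only triplets that violate the margin drive the update.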
Collect training data
• Collected in a hard negative mining manner
• Collect K most likely object regions according to objectness score as seeds.
• For each seed region
  • Search for the best matching regions based on DeCAF features
  • The top 50 returned regions are regarded as positive candidates
  • Verify the correctness of the candidates using BoF + RANSAC; return the verified positive set and the negative set
  • Divide the positive set into two halves, one used as queries and the other as positives
  • Generate a set of triplets
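The last two bullets for each seed region can be sketched as follows. The slides do not say how triplets are enumerated, so the exhaustive combination below, the random split, and the name `make_triplets` are assumptions for illustration. The inputs are taken to be the already verified positive set and the negative set.

```python
import random

def make_triplets(positive_set, negative_set, seed=0):
    """Split the verified positives into query and positive halves, then pair
    every (query, positive) combination with every negative."""
    rng = random.Random(seed)
    shuffled = list(positive_set)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    queries, positives = shuffled[:half], shuffled[half:]
    return [
        (q, p, n)
        for q in queries
        for p in positives
        for n in negative_set
    ]
```

In practice the full cross product grows quickly, so a training pipeline would likely sample from it or keep only triplets that currently violate the margin.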
Experiments
[Figure label: SIFT-like features]
Deep features are generic.
[Figure labels: DeCAF on entire image; Locality aligned + DeCAF]
The locality aligned scheme is important!
Instance-level distinction is important!
[Figure labels: Ranking-based loss trained on entire images; Ranking-based loss trained on patches]
Learning instance-level distinction on patches is easier.
The best performing deep-only instance search system.
Conclusions
• A deep ONLY system for instance retrieval
• Region proposal for problem decomposition
• Capture both category-level and instance-level distinction
• Automatic training data collection
Thank you for your attention!
Wenhui Jiang, Zhicheng Zhao, Fei Su, Anni Cai. “ALADDIN: A Locality Aligned Deep Model for Instance Search.” ICASSP 2016.