ALADDIN: A Locality Aligned Deep Model for Instance Search
Wenhui Jiang, Zhicheng Zhao, Fei Su, Anni Cai
Beijing University of Posts and Telecommunications

Introduction
• Instance search means [figure: two examples, each labelled "query"]

Challenges
• Asymmetrical similarity
[figure: instance search vs. similar image search / content-based image search]

Challenges
• Asymmetrical similarity
• Robust feature representation

Related Work
Local features based systems
• Bag of visual words (Sivic et al., ICCV 2003)
• Vocabulary tree (Nister et al., CVPR 2006)
• Hamming Embedding (Jegou et al., ECCV 2008)
• Bag of boundaries (Arandjelovic et al., ICCV 2011)
• Randomized Visual Phrases (Jiang et al., CVPR 2012, TIP 2015)
• Point-indexing (Tao et al., CVPR 2014)
Limitations
• Hand-crafted features
• Incapable of describing small or smooth objects

Related Work
Combining different features (local & global)
• Graph Fusion (Zhang et al., ECCV 2012)
• Co-indexing (Zhang et al., ICCV 2013, PAMI 2015)
• Query-Adaptive Fusion (Zheng et al., CVPR 2015)
• Generic Attributes and Categories (Tao et al., CVPR 2015)
Limitations
• DeCAF: category-level distinction only

Deep only systems should solve …
• Similarity measurement: discount for background clutter
• CNN architecture: capture instance-level distinction
• Collect large scale training data: no labelled training data for instance search benchmarks

Architecture overview
Step 1: Decompose a dataset image D into a small set of candidate regions.
• Locality aligned
Step 2: Given a pre-trained CNN model, extract feature vectors for the regions.
• Instance-level distinction
Step 3: Encode the feature vectors using PQ (product quantization), organized with an inverted file system (IFS).
• Highly efficient

Architecture overview
Step 1: Given a pre-trained CNN model, extract feature vectors for the regions.
Step 2: Search and rank in the IFS.

CNN Network
Triplets as inputs
• Query
• Positive
• Negative
The training goal is to keep patches from the same object closer to each other than to patches of different objects, by a large margin.

Collect training data
• In a hard negative mining manner

Collect training data
• Collect the K most likely object regions, according to objectness score, as seeds.
• For each seed region:
  • Search for the best matching regions based on DeCAF features
  • The top 50 returned regions are regarded as positive regions
  • Verify the correctness of the candidates using BoF + RANSAC; return the verified positive set and the negative set
  • Divide the positive set into two halves, one as queries and the other as positives
  • Generate a set of triplets

Experiments
[results figure; label: SIFT-like features]
Deep feature is generic.

Experiments
[results figure; labels: DeCAF on entire image; Locality aligned + DeCAF]
The locality aligned scheme is important!

Experiments
Instance-level distinction is important!

Experiments
[results figure; labels: ranking-based loss trained on entire images; ranking-based loss trained on patches]
Learning instance-level distinction on patches is easier.

Experiments
The best performing deep-only instance search system.

Conclusions
• A deep ONLY system for instance retrieval
• Region proposal for problem decomposition
• Capture both category-level and instance-level distinction
• Automatic training data collection

Thank you for your attention!

Wenhui Jiang, Zhicheng Zhao, Fei Su, Anni Cai. “ALADDIN: A Locality Aligned Deep Model for Instance Search.” ICASSP 2016.
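
Supplementary sketch (not part of the original slides): the triplet training goal and the last two steps of the training data collection can be illustrated with a few lines of plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation. The slides give neither the distance function nor the margin value, so the squared Euclidean distance, `margin=1.0`, the all-combinations pairing, and the names `squared_distance`, `triplet_hinge_loss` and `make_triplets` are all hypothetical choices made for illustration.

```python
from itertools import product


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_hinge_loss(query, positive, negative, margin=1.0):
    """Hinge-style ranking loss on one (query, positive, negative) triplet.

    The loss is zero once the negative is farther from the query than the
    positive by at least `margin` (an assumed value, not from the slides).
    """
    d_pos = squared_distance(query, positive)
    d_neg = squared_distance(query, negative)
    return max(0.0, margin + d_pos - d_neg)


def make_triplets(positives, negatives):
    """Split the verified positive set into two halves (queries and positives)
    and pair every query with every positive and every negative.

    Exhaustive pairing is an assumption; the slides only say that a set of
    triplets is generated.
    """
    half = len(positives) // 2
    queries, matches = positives[:half], positives[half:]
    return list(product(queries, matches, negatives))


if __name__ == "__main__":
    q, p = [0.0, 0.0], [0.1, 0.0]
    easy_negative = [2.0, 0.0]   # already far from the query: no loss
    hard_negative = [0.5, 0.0]   # too close to the query: positive loss
    print(triplet_hinge_loss(q, p, easy_negative))
    print(triplet_hinge_loss(q, p, hard_negative))
    print(len(make_triplets(["a", "b", "c", "d"], ["x", "y"])))
```

In this sketch only the hard negative contributes to the loss, which is consistent with the motivation the slides give for collecting training data in a hard negative mining manner.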