neighbourhood consensus networks · neighbourhood consensus networks introduction proposed method...

1
Neighbourhood Consensus Networks Introduction Proposed method Ignacio Rocco 1,2 Josef Sivic 1,2,3 Mircea Cimpoi 3 Akihiko Torii 5 Tomas Pajdla 3 1 DI ENS, PSL Research University 3 CIIRC, CTU in Prague 4 DeepMind 2 INRIA 5 Tokyo Institute of Technology Feature extraction and matching Trainable match ltering 4-D conv. layers 4D space of raw match scores Neighbourhood Consensus Network 4D ltered match scores Soft mutual nearest-neighbours ltering Neighbourhood Consensus Network is a 4D CNN The lters of the rst layer span Task: nd pixel-level visual correspondences where Extracting correspondences: Experimental results Contributions: - Trainable method for feature extraction, matching and ltering - Based on dense CNN descriptors and a novel 4D neighbourhood consensus CNN - Trainable without correspondence annotation Challenges: strong illumination or appearance changes day-night matching changes across time repetitive structures intra-class variation ratio-to-max along image A ratio-to-max along image B Very few correct matches: Correct matches: Total matches: Matches can be extracted in both directions from the output : Training loss: The network is trained with weak supervision: The network is applied twice for invariance to pair order: Normalized scores over : matching Normalized scores over : Category-level matching: PF-Pascal dataset [1] - Task: match similar semantic parts - Metric: percentage of correct keypoints (PCK) Method PCK HOG+PF-LOM [1] 62.5 SCNet-AG+ [2] 72.2 CNNGeo [3] 71.9 WeakAlign [4] 75.8 NC-Net 78.9 References [1] B. Ham, M. Cho, C. Schmid, and J. Ponce. Proposal ow: Semantic correspondences from object proposals. IEEE PAMI, 2017. [2] K. Han, R. S. Rezende, B. Ham, K.-Y. K. Wong, M. Cho, C. Schmid, and J. Ponce. SCNet: Learning Semantic Correspondence. In Proc. ICCV, 2017. [3] I. Rocco, R. Arandjelović, and J. Sivic. Convolutional neural network architecture for geometric matching. In Proc. CVPR, 2017. [4] I. Rocco, R. Arandjelović, and J. Sivic. End-to-end weakly-supervised semantic alignment. In Proc. CVPR, 2018. [5] H. Taira, M. Okutomi, T. Sattler, M. Cimpoi, M. Pollefeys, J. Sivic, T. Pajdla, and A. Torii. InLoc: Indoor visual localization with dense matching and view synthesis. In Proc. CVPR, 2018. - Handcrafted sparse keypoints - Trainable dense features - VGG/ResNet (e.g. conv4) dense descriptors sparse descriptors - Ratio test (1st NN/2nd NN) - Nearest neighbours - Mutual nearest-neighbours - Global methods - Semi-local methods homography tting using RANSAC Neighbourhood consensus match being analyzed neighbouring matches x Tentative matching Match ltering Input images Feature extraction x Review - classical pipeline: matching Instance-level matching: InLoc dataset [5] - Task: retrieve 6-dof camera poses of query images - Metric: percentage of correctly localized queries 0 0.25 0.5 0.75 1 1.25 1.5 1.75 2 0 20 40 60 80 Distance threshold [meters] Correctly localized queries [%] InLoc + NCNet InLoc + MNN InLoc [5] DensePE + NCNet DensePE + MNN DensePE [5] SparsePE [5] Ours Ground-truth Ours Ground-truth 0.36m, 1.93 8.11m, 48.38 0.41m, 0.83 6.65m, 23.86 0.51m, 0.99 9.36m, 23.72 Ours (InLoc+NCNet) Baseline (InLoc) - positive pairs ( ): maximize match score - negative pairs ( ): minimize match score score distributions tend to Kronecker delta score distributions tend to uniform positive matches: negative matches: mean score of matches mean score of matches

Upload: others

Post on 25-Jul-2020

18 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Neighbourhood Consensus Networks · Neighbourhood Consensus Networks Introduction Proposed method Ignacio Rocco1,2 Mircea Cimpoi3 Akihiko Torii5 Tomas Pajdla3 Josef Sivic1,2,3 1DI

Neighbourhood Consensus Networks

Introduction Proposed method

Ignacio Rocco1,2 Josef Sivic1,2,3Mircea Cimpoi3 Akihiko Torii5 Tomas Pajdla3

1DI ENS, PSL Research University 3CIIRC, CTU in Prague 4DeepMind2INRIA 5Tokyo Institute of Technology

Feature extraction and matching Trainable match filtering

4-D conv. layers

4D space of raw match scores

Neighbourhood Consensus Network

4D filteredmatch scores

Soft mutual nearest-neighbours filtering Neighbourhood Consensus Network

is a 4D CNN

The filters of the first layer span

Task: find pixel-level visual correspondences

where

Extracting correspondences:

Experimental results

Contributions: - Trainable method for feature extraction, matching and filtering

- Based on dense CNN descriptors and a novel 4D neighbourhood consensus CNN

- Trainable without correspondence annotation

Challenges: strong illumination or appearance changes

day-night

matching

changes

across time

repetitive

structures

intra-class

variation

ratio-to-maxalong image A

ratio-to-maxalong image B

Very few correct matches:

Correct matches: Total matches:

Matches can be extracted in both directions from the output :

Training loss:

The network is trained with weak supervision:

The network is applied twice for invariance to pair order:

Normalized scores over :matching Normalized scores over :

Category-level matching: PF-Pascal dataset [1]

- Task: match similar semantic parts - Metric: percentage of correct keypoints (PCK)

Method PCK

HOG+PF-LOM [1] 62.5SCNet-AG+ [2] 72.2CNNGeo [3] 71.9WeakAlign [4] 75.8NC-Net 78.9

References

[1] B. Ham, M. Cho, C. Schmid, and J. Ponce. Proposal flow: Semantic correspondences from object proposals. IEEE PAMI, 2017.[2] K. Han, R. S. Rezende, B. Ham, K.-Y. K. Wong, M. Cho, C. Schmid, and J. Ponce. SCNet: Learning Semantic Correspondence. In Proc. ICCV, 2017.[3] I. Rocco, R. Arandjelović, and J. Sivic. Convolutional neural network architecture for geometric matching. In Proc. CVPR, 2017.[4] I. Rocco, R. Arandjelović, and J. Sivic. End-to-end weakly-supervised semantic alignment. In Proc. CVPR, 2018.[5] H. Taira, M. Okutomi, T. Sattler, M. Cimpoi, M. Pollefeys, J. Sivic, T. Pajdla, and A. Torii. InLoc: Indoor visual localization with dense matching and view synthesis. In

Proc. CVPR, 2018.

- Handcrafted sparse keypoints

- Trainable dense features - VGG/ResNet (e.g. conv4)

dense descriptors

sparsedescriptors

- Ratio test (1st NN/2nd NN)

- Nearest neighbours

- Mutual nearest-neighbours

- Global methods

- Semi-local methods

homography fitting using RANSAC

Neighbourhood consensus

match being analyzedneighbouring matches

x

Tentativematching

Matchfiltering

Inputimages

Featureextraction

x

Review - classical pipeline:

matching

Instance-level matching: InLoc dataset [5]

- Task: retrieve 6-dof camera poses of query images - Metric: percentage of correctly localized queries

0 0.25 0.5 0.75 1 1.25 1.5 1.75 20

20

40

60

80

Distance threshold [meters]

Corr

ectly

loca

lized

quer

ies

[%]

InLoc + NCNetInLoc + MNNInLoc [5]DensePE + NCNetDensePE + MNNDensePE [5]SparsePE [5]

Ours Ground-truth Ours Ground-truth

0.36m, 1.93◦ 8.11m, 48.38◦

0.41m, 0.83◦ 6.65m, 23.86◦

0.51m, 0.99◦ 9.36m, 23.72◦

Ours (InLoc+NCNet) Baseline (InLoc)

- positive pairs ( ): maximize match score - negative pairs ( ): minimize match score

score distributions

tend to Kronecker delta

score distributionstend to uniform

positive matches:

negative matches:

mean score of matches

mean score of matches