Patch tracking based on comparing its pixels
Tomáš Svoboda, svoboda@cmp.felk.cvut.cz
Czech Technical University in Prague, Center for Machine Perception
http://cmp.felk.cvut.cz
Last update: March 23, 2015

Talk Outline
- comparing patch pixels: normalized cross-correlation, SSD, ...
- KLT: gradient-based optimization
- good features to track
Please note that the lecture will be accompanied by several sketches and derivations on the blackboard and a few live interactive demos in Matlab.

What is the problem? (2/38)
Figure: CTU campus, door of the G building.

Tracking of dense sequences: camera motion (3/38)
T - Template, I - Image. Scene static, camera moves.
Tracking of dense sequences: object motion (4/38)
T - Template, I - Image. Camera static, object moves.

Alignment of an image (patch) (5/38)
The goal is to align a template image T(x) to an input image I(x); x is a column vector containing the image coordinates [x, y]^T. I(x) may also be a small subwindow within an image.

How to measure the alignment? (6/38)
What is the best criterion function? How to find the best match, in other words, how to find the extremum of the criterion function?

Criterion function
What are the desired properties (on a certain domain)?
- convex (remember the optimization course?)
- discriminative ...
Normalized cross-correlation (7/38)
You may know it as the correlation coefficient (from statistics)

  ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y) = E[(X - μ_X)(Y - μ_Y)] / (σ_X σ_Y),

where σ denotes the standard deviation. Having a template T(k, l) and an image I(x, y),

  r(x, y) = Σ_k Σ_l [T(k, l) - T̄] [I(x + k, y + l) - Ī(x, y)]
            / sqrt( Σ_k Σ_l [T(k, l) - T̄]² · Σ_k Σ_l [I(x + k, y + l) - Ī(x, y)]² ),

where T̄ is the mean of the template and Ī(x, y) is the mean of the image under the template.

Normalized cross-correlation in picture (8/38)
Figure: the NCC criterion surface.
- well, definitely not convex
- but the discriminability looks promising
- very efficient in computation, see [3]; check also normxcorr2 in Matlab

Sum of squared differences (9/38)

  ssd(x, y) = Σ_k Σ_l (T(k, l) - I(x + k, y + l))²

Figure: the SSD criterion surface.
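The two criteria above are easy to sketch in code. The following is a minimal NumPy illustration (the lecture's own demos are in Matlab; the function names here are my own, and the sliding-window search over (x, y) is omitted): each function scores a single candidate patch against the template, matching the formulas above.

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation of two equally sized patches."""
    t = template - template.mean()
    p = patch - patch.mean()
    denom = np.sqrt((t ** 2).sum() * (p ** 2).sum())
    return (t * p).sum() / denom if denom > 0 else 0.0

def ssd(patch, template):
    """Sum of squared differences of two equally sized patches."""
    d = patch.astype(float) - template.astype(float)
    return (d ** 2).sum()
```

Note the practical difference: because of the mean subtraction and normalization, NCC is invariant to affine intensity changes (gain and offset), while SSD is not.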
Sum of absolute differences (10/38)

  sad(x, y) = Σ_k Σ_l |T(k, l) - I(x + k, y + l)|

Figure: the SAD criterion surface.

SAD for the door part (11/38)
Figure: the SAD criterion surface for the door patch.

SAD for the door part, truncated (12/38)
Figure: the truncated-SAD criterion surface. Differences greater than 20 intensity levels are counted as 20.
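The truncation used on slide 12 is a simple robustification: a few grossly wrong pixels (e.g. from an occlusion) cannot dominate the score. A minimal NumPy sketch (my own illustrative function, not the lecture's Matlab demo):

```python
import numpy as np

def sad(patch, template, trunc=None):
    """Sum of absolute differences between two equally sized patches.
    If trunc is given, each per-pixel difference is clipped to trunc,
    so outlier pixels contribute at most trunc each."""
    d = np.abs(patch.astype(float) - template.astype(float))
    if trunc is not None:
        d = np.minimum(d, trunc)  # e.g. differences > 20 count as 20
    return d.sum()
```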
Normalized cross-correlation: how it works (13/38)
Live demo for various patches.

Normalized cross-correlation: tracking (14/38)
What went wrong? Why did it fail? Suggestions for improvement?

Tracking as an optimization problem (15/38)
Finding extrema of a criterion function ... sounds like an optimization problem.

Kanade-Lucas-Tomasi (KLT) tracker
Iteratively minimizes the sum of squared differences. It is a Gauss-Newton gradient algorithm.
Importance in Computer Vision (16/38)
- First published in 1981 as an image registration method [4].
- Improved many times, most importantly by Carlo Tomasi [5, 6].
- Free implementation(s) available (http://www.ces.clemson.edu/~stb/klt/); also part of the OpenCV library (http://opencv.org/).
- After more than two decades, a project at CMU was dedicated to this single algorithm ("Lucas-Kanade 20 Years On", https://www.ri.cmu.edu/research_project_detail.html?project_id=515&menu_id=261) and the results were published in a premium journal [1].
- Part of a plethora of computer vision algorithms.
Our explanation mainly follows the paper [1]. It is a good read for those who are also interested in alternative solutions.

Original Lucas-Kanade algorithm I (17/38)
The goal is to align a template image T(x) to an input image I(x); x is a column vector containing the image coordinates [x, y]^T. I(x) may also be a small subwindow within an image.
Consider a set of allowable warps W(x; p), where p is a vector of parameters. For translations,

  W(x; p) = [x + p_1, y + p_2]^T.

W(x; p) can be arbitrarily complex. The best alignment p* minimizes the image dissimilarity

  Σ_x [I(W(x; p)) - T(x)]².

Original Lucas-Kanade algorithm II (18/38)
Σ_x [I(W(x; p)) - T(x)]² is nonlinear! The warp W(x; p) may be linear, but the pixel values are, in general, non-linear functions of p; in fact, they are essentially unrelated to p.
Linearization of the problem: assume some p is known and seek the best increment Δp. The modified problem

  Σ_x [I(W(x; p + Δp)) - T(x)]²

is solved with respect to Δp. When Δp is found, p is updated, p ← p + Δp, and the process repeats.
Original Lucas-Kanade algorithm III (19/38)
Linearize Σ_x [I(W(x; p + Δp)) - T(x)]² by performing a first-order Taylor expansion:

  Σ_x [I(W(x; p)) + ∇I (∂W/∂p) Δp - T(x)]²,

where ∇I = [∂I/∂x, ∂I/∂y] is the gradient image computed at W(x; p), and ∂W/∂p is the Jacobian of the warp. (Detailed explanation on the blackboard.)

Original Lucas-Kanade algorithm IV (20/38)
Differentiating Σ_x [I(W(x; p)) + ∇I (∂W/∂p) Δp - T(x)]² with respect to Δp gives

  2 Σ_x [∇I (∂W/∂p)]^T [I(W(x; p)) + ∇I (∂W/∂p) Δp - T(x)].

Setting this equal to zero yields

  Δp = H⁻¹ Σ_x [∇I (∂W/∂p)]^T [T(x) - I(W(x; p))],

where H is the (Gauss-Newton) approximation of the Hessian matrix,

  H = Σ_x [∇I (∂W/∂p)]^T [∇I (∂W/∂p)].

The Lucas-Kanade algorithm: summary (21/38)
Iterate:
1. Warp I with W(x; p).
2. Warp the gradient ∇I with W(x; p).
3. Evaluate the Jacobian ∂W/∂p at (x; p) and compute the steepest descent image ∇I (∂W/∂p).
4. Compute H = Σ_x [∇I (∂W/∂p)]^T [∇I (∂W/∂p)].
5. Compute Δp = H⁻¹ Σ_x [∇I (∂W/∂p)]^T [T(x) - I(W(x; p))].
6. Update the parameters: p ← p + Δp.
until ‖Δp‖ ≤ ε.
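For the pure-translation warp, the six steps above collapse into a short Gauss-Newton loop. Below is a sketch in NumPy (my own code, not the Clemson or OpenCV implementation): to avoid image-interpolation machinery, the "image" is an analytic function I_fn(x, y) that can be evaluated at warped, non-integer coordinates, and gradients are taken by central differences.

```python
import numpy as np

def klt_translation(I_fn, T, p, xs, ys, iters=50, eps=1e-8):
    """Lucas-Kanade for the pure translation warp W(x; p) = (x + p1, y + p2).
    The Jacobian of this warp is the identity, so the steepest descent
    image is simply the warped gradient [Ix, Iy]."""
    X, Y = np.meshgrid(xs, ys)
    h = 1e-4  # step for central-difference gradients
    for _ in range(iters):
        Iw = I_fn(X + p[0], Y + p[1])                      # step 1: warp I
        Ix = (I_fn(X + p[0] + h, Y + p[1]) -
              I_fn(X + p[0] - h, Y + p[1])) / (2 * h)      # steps 2-3
        Iy = (I_fn(X + p[0], Y + p[1] + h) -
              I_fn(X + p[0], Y + p[1] - h)) / (2 * h)
        err = T - Iw
        H = np.array([[(Ix * Ix).sum(), (Ix * Iy).sum()],  # step 4
                      [(Ix * Iy).sum(), (Iy * Iy).sum()]])
        dp = np.linalg.solve(H, np.array([(Ix * err).sum(),
                                          (Iy * err).sum()]))  # step 5
        p = p + dp                                         # step 6
        if np.linalg.norm(dp) < eps:
            break
    return p
```

For example, with a Gaussian blob as I_fn and a template generated with a true shift of (0.3, -0.2), starting from p = (0, 0) the loop recovers the shift, provided the start lies in the basin of attraction (see the convergence/divergence slides below).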
Example of convergence (22/38)

Example of convergence (23/38)
Convergence: the initial state is within the basin of attraction.

Example of divergence (24/38)
Divergence: the initial state is outside the basin of attraction.
Example on-line demo (25/38)
Let's play and see ...

What are good features (windows) to track? (26/38)

What are good features (windows) to track? (27/38)
How to select good templates T(x) for image registration and object tracking?

  Δp = H⁻¹ Σ_x [∇I (∂W/∂p)]^T [T(x) - I(W(x; p))],

where H is the matrix

  H = Σ_x [∇I (∂W/∂p)]^T [∇I (∂W/∂p)].

The stability of the iteration is mainly influenced by the inverse of the Hessian, so we can study its eigenvalues. Consequently, the criterion of a good feature window is min(λ_1, λ_2) > λ_min (texturedness).
What are good features for translations? (28/38)
Consider the translation W(x; p) = [x + p_1, y + p_2]^T. The Jacobian is then

  ∂W/∂p = [1 0; 0 1],

and hence

  H = Σ_x [∇I (∂W/∂p)]^T [∇I (∂W/∂p)]
    = Σ_x [ (∂I/∂x)²        (∂I/∂x)(∂I/∂y) ;
            (∂I/∂x)(∂I/∂y)  (∂I/∂y)²       ].

We want image windows with varying derivatives in both directions. Homogeneous areas are clearly not suitable. Texture oriented mostly in one direction only would cause instability for this translation.

What are the good points for translations? (29/38)
The matrix

  H = Σ_x [ (∂I/∂x)²        (∂I/∂x)(∂I/∂y) ;
            (∂I/∂x)(∂I/∂y)  (∂I/∂y)²       ]

should have large eigenvalues. We have seen this matrix already; where? In the Harris corner detector [2]! The matrix is sometimes called the Harris matrix.

Experiments - no occlusions (30/38)
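The min-eigenvalue criterion is straightforward to compute. A minimal NumPy sketch (my own illustrative code; real implementations use Gaussian weighting and efficient filtering rather than this naive loop): for each pixel it sums the gradient products over a window and keeps the smaller eigenvalue of the resulting 2x2 matrix.

```python
import numpy as np

def min_eigenvalue_map(img, win=3):
    """Shi-Tomasi 'good features to track' score: the smaller eigenvalue
    of the 2x2 gradient (Harris) matrix summed over a window."""
    Iy, Ix = np.gradient(img.astype(float))   # gradients along rows, cols
    Ixx, Ixy, Iyy = Ix * Ix, Ix * Iy, Iy * Iy
    k = win // 2
    H, W = img.shape
    score = np.zeros((H, W))
    for r in range(k, H - k):
        for c in range(k, W - k):
            a = Ixx[r - k:r + k + 1, c - k:c + k + 1].sum()
            b = Ixy[r - k:r + k + 1, c - k:c + k + 1].sum()
            d = Iyy[r - k:r + k + 1, c - k:c + k + 1].sum()
            # smaller eigenvalue of [[a, b], [b, d]]
            tr, det = a + d, a * d - b * b
            score[r, c] = tr / 2 - np.sqrt(max(tr * tr / 4 - det, 0.0))
    return score
```

As expected from the slide: a homogeneous window scores zero, a pure edge (texture in one direction only) still scores zero, and only a genuine corner scores above zero.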
Experiments - occlusions (31/38)

Experiments - occlusions with dissimilarity (32/38)

Experiments - object motion (33/38)
Experiments - door tracking (34/38)

Experiments - door tracking, smoothed (35/38)

Comparison of NCC vs KLT tracking (36/38)
References (37/38)
[1] Simon Baker and Iain Matthews. Lucas-Kanade 20 years on: A unifying framework. International Journal of Computer Vision, 56(3):221-255, 2004.
[2] C. Harris and M. Stephens. A combined corner and edge detector. In M. M. Matthews, editor, Proceedings of the 4th ALVEY Vision Conference, pages 147-151, University of Manchester, England, September 1988. On-line copies available on the web.
[3] J. P. Lewis. Fast template matching. In Vision Interface, pages 120-123, 1995. Extended version published on-line as "Fast Normalized Cross-Correlation" at http://scribblethink.org/work/nvisioninterface/nip.html.
[4] Bruce D. Lucas and Takeo Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence, pages 674-679, August 1981.
[5] Jianbo Shi and Carlo Tomasi. Good features to track. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 593-600, 1994.
[6] Carlo Tomasi and Takeo Kanade. Detection and tracking of point features. Technical Report CMU-CS-91-132, Carnegie Mellon University, April 1991.

End (38/38)