Yingchuan Hu (yh537), Hang Chu (hc772)
CS 4670/5670 Introduction to Computer Vision
Oct. 1st, 2013
Compute Harris Features
We generally followed the instructions from the slides and the instructions. We first applied two filters on the entire image to get the derivative image of x axis and y axis, and we computed the harris matrix of each pixel in its neighborhood using a gaussian mask and the derivative images (treat the value as zero when the sliding window is out of image boundary). Then we computed the harris score using the matrix. Finally we take the eigenvector corresponding to the first eigenvalue as the orientation of the feature, transformed to radian by atan()
and atan()+pi
.
Adaptive Nonmaximum Suppression
As is described in the MOPS paper, we computed the minimum suppression radius for every local maximum in the harris image using Eq.(10)
, and we only took the N points with largest suppression radius as the feature point. Note that if we have two points share the same value in one square, no feature point will be extracted from that square window.
Simple Window Descriptor
For the window descriptor, just as the instruction on the website, we used a 5x5 window as the feature descriptor which takes the raw data of the grayscale image. In fact, it would be illumination invariant if it is normalized by substracting the mean and dividing by the standard deviation.
MOPS Descriptor
We followed the MOPS paper and instructions on the website. By applying the rotation transform and the Gaussian kernel before down-sampling and calling WarpGlobal to do the mapping, we transformed the 40x40 window of the grayscale image to a 8x8 feature with a horizontal dominant direction. After that, we did normalization for the values by substracting the mean and dividing by the standard deviation.
Custom Descriptor
Inspired by the simple window descriptor and MOPS descriptor, we thought of constructing a circle-shape descriptor. However, it's redundant to choose all the points inside one circle and it's not enough to choose points on circle only. In this case, we came up with the concentric circles form of the descriptor. Specifically, we tested and decided to choose three concentric circles with radius 6, 12, 18 correspondingly. And we chose the first point to be the one at the dominant direction and then looped counter-clockwise, which makes the feature rotation invariant. After testing, we found that it's better to choose a point every 20 degrees. The sample pixel was rounded to the nearest integer neighbor. So together, for each feature, we have 360 / 20 * 3 = 54
values to describe it.
To note that the performance of this new descriptor is not as good but close to MOPS.
Scale Invariance
In order to make the feature detector scale invariant, we applied the pyramid structure. Instead of using the structure provided in the ImageLib, we did that by ourselves. By scaling the original image to 1/2, 1/3, 1/4 and 1/5, we computed the features in different scales. And different from the method which is given in class by choosing the maximum value of the features in a square, I keep the features from all scales.(We noticed that the number of features get less and less quickly as the scaling variant alpha decreases to near 0, especially after we applied adaptive nonmaximum suppression.) After some analyses, we concluded that in some situations this method was better than choosing the maximum and concluded that merging the features from all scales was reasonable. Also, that won't bring about too much computation cost.
SSD Matching
This was already done by the framework.
Ratio Matching
To implement the ratio matching, we took the SSD matching as a reference. As we need to find the best and the second best, new variables were added to record the second best value and index. Assume dBest1
and idBest1
are used for the best match while dBest2
and idBest2
for the second best. Our algorithm can be described as follows:
if (d < dBest1)
update dBest2, dBest1;
else if (d >= dBest1 && d < dBest2)
update dBest2
And we get the ratio distance after looping through features.
Improved Ratio Matching
We noticed that the ratio test took advantage of the second best to make full use of the majority of information to distinguish between ambiguous matches. To note that there may be more than two ambiguous matches in one image, though the probability is low. In this case, we can improve the original ratio test by giving DISTANCE = p * FIRST_BEST / SECOND_BEST + (1 - p) * SECOND_BEST / THIRD_BEST
, which absorbs more information and has better statistic distributions. Notice that p > 1 - p
, and we choose p = 0.8
here.
To note that this new matching method has an average 0.014 increase in AUC for leuven, bikes and wall test sequences.
The ROC curves are displayed below (graf and yosemite).
And for all ROC curves including custom descriptor, improved ratio match and kd-tree improved ratio match, we plotted the ROC curves as follows:
leuven:
SSD | ratio test | improved ratio test | kd-tree improved ratio test | |
Simple Window Descriptor | 0.208008 | 0.500890 | 0.511718 | 0.439394 |
MOPS Descriptor | 0.632387 | 0.669069 | 0.681089 | 0.617840 |
My Own Descriptor | 0.632555 | 0.655203 | 0.667872 | 0.625959 |
bikes:
SSD | ratio test | improved ratio test | kd-tree improved ratio test | |
Simple Window Descriptor | 0.429870 | 0.523828 | 0.527999 | 0.564001 |
MOPS Descriptor | 0.677916 | 0.700415 | 0.721920 | 0.612995 |
My Own Descriptor | 0.622444 | 0.670107 | 0.687494 | 0.644212 |
wall:
SSD | ratio test | improved ratio test | kd-tree improved ratio test | |
Simple Window Descriptor | 0.307211 | 0.547159 | 0.552138 | 0.531165 |
MOPS Descriptor | 0.574394 | 0.608491 | 0.615165 | 0.539280 |
My Own Descriptor | 0.598584 | 0.588737 | 0.596019 | 0.563881 |
In spite of the basic requirements, we have implemented all the extra credits as follows:
1. Designed and implemented our own feature descriptor (By Yingchuan Hu)
// See Feature Description --> 2
2. Implemented adaptive non-maximum suppression (By Hang Chu)
// See Feature Detection --> 2
3. Made our feature detector scale invariant (By Yingchuan Hu)
// See Feature Description --> 4
4. Implemented a method that outperforms the ratio test (By Yingchuan Hu)
// See Feature Matching --> 3
5. Used a fast search algorithm to speed up the matching process (By Hang Chu)
// See Feature Matching --> 4
Q1: How to configure our project?
A1: As we applied kd-tree for matching based on the ANN:A Library for Approximate Nearest Neighbor Searching from David M. Mount and Sunil Arya (http://www.cs.umd.edu/~mount/ANN/) , you need to merge the code/lib and code/include with the original documents such that you have the required *.dll, *.h and *.lib files. Also, right click on Project Features in the Solution Explorer->Configurations->Linker->Input->Additional Dependencies, add ANN.lib and apply so that the linker will find the lib file.
Q2: How to run our program?
A2: Be sure that you have copy ANN.dll under the path of Features.exe. We have modified FeaturesMain.cpp to give you guidance on our feature types, descriptor types and matching types, which are also listed here as follows.
Featuretype (1: Dummy 2: Harris)
Descriptortype (1: Window 2: MOPS 3: Custom)
Matchtype (1: SSD 2: Ratio 3: Improved Ratio 4: Kd-tree Improved Ratio)