Looking at the features extracted earlier, they didn't seem to provide much information. It's difficult even for a human to look at those extracted features and say that they belong to a motorbike. So I compared the results of my feature detection phase (which looked mostly like this) with the results of feature detection from Rob Fergus' paper, which look like this:
The problem seemed to be the scale of the detected features. Somehow small, local features were firing more strongly than the larger, more important ones. I gradually increased the smallest admissible scale for detected features and finally settled on a starting scale of 23 (earlier it was 3). Using this starting scale and keeping the top 20 saliency values, the outputs on various bikes looked like this:
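The selection step described above (drop features below the minimum scale, then keep the strongest by saliency) could be sketched roughly as follows. The `(x, y, scale, saliency)` tuple layout and the function name are my own illustrative assumptions, not the detector's actual output format:

```python
# Hypothetical feature representation: (x, y, scale, saliency).
# These names are illustrative, not part of any detector's API.

def select_features(features, min_scale=23, top_k=20):
    """Keep only features at or above min_scale, then take the
    top_k of those ranked by saliency value (descending)."""
    large = [f for f in features if f[2] >= min_scale]
    large.sort(key=lambda f: f[3], reverse=True)
    return large[:top_k]
```

Raising `min_scale` from 3 to 23 is what suppresses the small local responses in favour of the larger structures.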
This seems much better and closer to the output of Fergus et al. I extracted these newly detected features, resized them, and tiled them into the image shown below. The 9 rows show the features (each rescaled to an 11 x 11 patch) extracted from the 9 motorbikes shown above, in row-major order.
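The resize-and-tile step could be sketched like this. I'm assuming patches arrive as square grayscale NumPy arrays and using nearest-neighbour resampling as a stand-in for whatever interpolation was actually used; the function names are hypothetical:

```python
import numpy as np

def resize_patch(patch, size=11):
    """Nearest-neighbour resize of a square patch to size x size
    (a stand-in for the actual interpolation method used)."""
    h, w = patch.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return patch[np.ix_(rows, cols)]

def tile_patches(patches, per_row, size=11):
    """Tile the resized patches row-major into one mosaic image,
    leaving any unused cells in the last row as zeros."""
    n_rows = -(-len(patches) // per_row)  # ceiling division
    mosaic = np.zeros((n_rows * size, per_row * size))
    for i, p in enumerate(patches):
        r, c = divmod(i, per_row)
        mosaic[r*size:(r+1)*size, c*size:(c+1)*size] = resize_patch(p, size)
    return mosaic
```

With 9 motorbikes and 20 features each, one row of the mosaic per image gives the 9-row figure described above.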
Now we can at least see the tyres of the motorbike in almost all the input images. The new appearance of the parts seems to carry more information about the image's category.