Introducing FREAKs

Jun 20, 2012

OK, starting to mix posts in English and French since the English version of the blog is still buggy.

Introduction

Hello everyone!

Let’s start this English part with good news for our lab. FREAK (Fast RetinA Keypoint), our new Local Binary Pattern¹, had a lot of success this week:

the authors of the original paper received the Best Open-Source Code Award at CVPR 2012! You can find the code here.
our paper showing preliminary results of LBP reconstruction (including FREAK of course) has been accepted for an oral presentation at ICPR 2012 next November²!

In a series of follow-up posts, I’ll talk more about the reconstruction paper (and especially the implementation choices), so let’s have a quick look at what this is all about.

Forward transform

The forward transform is quite natural: the goal here is to describe an image patch by an as-compact-as-possible-yet-highly-efficient descriptor.

But what’s an LBP?

In the case of Local Binary Patterns, the descriptor is a binary vector, i.e. a vector of 0’s and 1’s, where each bit tells us the sign of the difference between the integrals of two subregions³.

Local Binary Descriptor
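To make the definition concrete, here is a minimal sketch of such a descriptor. The rectangular regions, the pair list and the function names are hypothetical choices made for illustration only; they are not the FREAK layout. Means are used instead of raw sums so that regions of different sizes stay comparable, which is the normalization by the area mentioned in the footnote.

```python
import numpy as np

def region_mean(patch, region):
    """Mean intensity of a rectangle given as (top, left, height, width)."""
    top, left, height, width = region
    return patch[top:top + height, left:left + width].mean()

def describe(patch, regions, pairs):
    """One bit per pair (i, j): 1 if region i is brighter on average than region j."""
    means = [region_mean(patch, r) for r in regions]
    return np.array([int(means[i] > means[j]) for i, j in pairs], dtype=np.uint8)

# Toy 8x8 patch whose intensity grows from the top-left to the bottom-right corner.
patch = np.arange(64, dtype=np.float64).reshape(8, 8)
# Four 4x4 quadrants (hypothetical regions, for illustration only).
regions = [(0, 0, 4, 4), (0, 4, 4, 4), (4, 0, 4, 4), (4, 4, 4, 4)]
# Hypothetical comparison pairs, as indices into `regions`.
pairs = [(0, 1), (1, 0), (0, 3), (3, 2), (2, 1)]

bits = describe(patch, regions, pairs)
print(bits)  # [0 1 0 1 1]
```

Five comparisons give a 5-bit descriptor; a real descriptor simply uses many more pairs.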

Hence, LBPs are very compact: they store 1 bit where other descriptors store 1 floating-point value (4 bytes). Furthermore, they are computationally very efficient:

to compute them, one just needs integral images (which are fast and easy to compute), then a subtraction and a sign test. Even modest hardware (e.g. smartphones) can easily compute LBPs.
to compare 2 LBPs, the natural distance is the Hamming distance (the number of 1’s in the result of a XOR b), which is also very fast to compute (many CPUs provide a population-count instruction, and most compilers expose it as a built-in).
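Both operations fit in a few lines. The sketch below is an illustration in NumPy, not the FREAK implementation: the helper names are made up for this post, and the bit counting goes through `np.unpackbits` rather than a hardware popcount.

```python
import numpy as np

def integral_image(img):
    """Summed-area table, with a zero row and column prepended to simplify lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of any rectangle in 4 lookups, whatever its size."""
    return int(ii[top + height, left + width] - ii[top, left + width]
               - ii[top + height, left] + ii[top, left])

def hamming(a_bits, b_bits):
    """Number of positions where two binary descriptors differ (XOR, then count 1's)."""
    a = np.packbits(np.asarray(a_bits, dtype=np.uint8))
    b = np.packbits(np.asarray(b_bits, dtype=np.uint8))
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(box_sum(ii, 1, 1, 2, 2))  # 30, same as img[1:3, 1:3].sum()

a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
b = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
print(hamming(a, b))  # 4
```

Packing the bits first means the XOR works on whole bytes, which is also how the compact 1-bit-per-component storage would look in practice.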

LBPs are compact and efficient image patch descriptors. Associated with easy-to-compute keypoints such as FAST, they will probably become ubiquitous in mobile object recognition and image matching software in the near future.

What makes FREAK so special?

FREAK has 2 distinguishing features:

a special structure is imposed on the averaging areas that will be used as inputs to the difference-then-quantize operator. This structure is inspired by the human retina: the spatial resolution is fine near the center, and becomes coarser when moving away from it (see figure below).
the pairs that are used in the difference process are not random. They have been retained after assessing their performance in a matching task on an image database.

FREAK pattern
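To give a feel for what a retina-like layout means, here is a purely illustrative sketch that places sampling points on concentric rings, with both the ring radius and the size of the averaging area growing away from the center. The number of rings, the points per ring and the growth factor are assumptions chosen for readability; they are not the values used by FREAK, and the selection of pairs is not covered here.

```python
import math

def retina_pattern(n_rings=4, points_per_ring=6, inner_radius=1.0, growth=2.0):
    """Return (x, y, averaging_radius) tuples: a center point plus concentric rings.

    All default values are hypothetical and only meant for illustration.
    """
    points = [(0.0, 0.0, inner_radius / 2)]
    for ring in range(n_rings):
        radius = inner_radius * growth ** ring
        # Rotate every other ring by half a step so the points interleave.
        offset = (math.pi / points_per_ring) * (ring % 2)
        for k in range(points_per_ring):
            angle = 2 * math.pi * k / points_per_ring + offset
            # The averaging area grows with the distance to the center.
            points.append((radius * math.cos(angle), radius * math.sin(angle), radius / 2))
    return points

pattern = retina_pattern()
print(len(pattern))  # 25
```

Each tuple describes one averaging area: fine and densely packed near the center, larger and sparser toward the border of the patch.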

And that’s all for this short introduction to FREAKs! If you want more detail, see the references below or feel free to use the comment form.

Backward transform

Inspired by the work of Weinzaepfel et al. [WJP11], we asked ourselves whether it was possible to infer the original image patches from the LBP. However, while Weinzaepfel’s paper describes an algorithm where the correspondences between descriptors and patches are first learned from an image database, we decided to address the reconstruction task directly, as an inverse problem.

Coming soon!

Since this post is already too long, I will keep the description of the reconstruction algorithm for a follow-up post.

The implementation won’t be online very soon, because I still have some work to do on it⁴ and I only want to release final, complete code. Still, I will add a post next week to describe my solution to the most difficult parts of the implementation: while the algorithm is almost standard now (the primal-dual solver of Chambolle and Pock [CP10]), the transpose of the LBP operator needs to be coded carefully to avoid running out of memory.
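For readers who have not met this solver, here is the generic form of the primal-dual iteration from [CP10], just to show where a transpose comes in. How the reconstruction problem maps onto F, G and K is left for the follow-up post; that K involves the LBP operator is an assumption here, suggested by the remark above.

```latex
% Generic problem solved by the primal-dual algorithm of [CP10]:
\min_{x} \; F(Kx) + G(x)

% One iteration, with step sizes \tau, \sigma > 0 and \theta \in [0, 1]:
y^{n+1}       = \operatorname{prox}_{\sigma F^{*}}\!\left( y^{n} + \sigma K \bar{x}^{n} \right)
x^{n+1}       = \operatorname{prox}_{\tau G}\!\left( x^{n} - \tau K^{*} y^{n+1} \right)
\bar{x}^{n+1} = x^{n+1} + \theta \left( x^{n+1} - x^{n} \right)

% Convergence is guaranteed for \theta = 1 and \tau \sigma \|K\|^{2} < 1.
```

Every iteration applies both K and its adjoint K*, so an adjoint that is stored or computed naively is paid for at each step.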

Stay tuned, and don’t miss the next posts by subscribing to this blog’s feed or by following me on Twitter!

References

The FREAK home page
[AOV12] Alahi, A., Ortiz, R., Vandergheynst, P., FREAK: Fast Retina Keypoint, presented at CVPR 2012.
The original code on GitHub
[DAV12] Reconstructing FREAKs: d’Angelo, E., Alahi, A., Vandergheynst, P., Beyond Bits: Reconstructing Images from Local Binary Descriptors.
[WJP11] Weinzaepfel, P., Jegou, H., & Pérez, P. (2011). Reconstructing an image from its local descriptors. (pp. 337–344). Presented at the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). preprint
[CP10] Chambolle, A., & Pock, T. (2010). A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging. Journal of Mathematical Imaging and Vision, 40(1), 120–145. preprint

¹ Well, I prefer the name Local Binary Descriptor, which I find more accurate.
² And I’m quite happy to go back to Japan :-)
³ After normalization by the area.
⁴ Mostly about the 1-bit reconstruction.