# Concurrency (Without Bugs) Is Hard!

This is the story of a bug that I introduced in my codebase while making it parallel. Fortunately, I figured it out quickly (thanks to some well placed asserts), but it took me a little more time to understand what was happening.

## The context

The problem that I was working on could be summarized as follows:

• let’s say that we have a procedure (a signal processing filter for example) that knows how to operate on some data
• and now pretend that we want to re-use that procedure (because we like it and it’s well tested) but with different types of data (new measurement instrument, new data set…) that will need some specific pre- or post-processing.

Since our core routine is supposed to be well tested and to do its job right, we don’t want to modify it in any way (not even by copying its code to adapt it to our new data). A solution to this problem is to design a class that provides a public method through which clients call our wonderful routine, and that can be subclassed (specialized) to apply the correct pre/post-processing.

As Herb Sutter suggests here, it’s a good idea in this case to keep the public method that will be called by clients non-virtual, while the internal pre/post-processing methods are declared private and virtual. The rationale behind this is quite simple: the sequence of commands executed in the public function does not vary^1, so it has no reason to be virtual.^2

This yields the following base class:
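The original listing is not reproduced here, so here is a minimal sketch of what such a base class can look like (the names `Solver`, `preprocessing`, etc. are illustrative, not the original code; the `trace` member only exists to make the call sequence observable):

```cpp
#include <cassert>
#include <string>

// Non-Virtual Interface idiom: the public entry point is non-virtual,
// the customization points are private and virtual.
class Solver {
public:
    // The call sequence never varies: 1) pre, 2) process, 3) post.
    void solve() {
        preprocessing();
        process();
        postprocessing();
    }
    virtual ~Solver() = default;

    std::string trace;  // records the executed steps, for demonstration only

private:
    // Subclasses override these to adapt to new kinds of data.
    virtual void preprocessing()  { trace += "base-pre|"; }
    virtual void postprocessing() { trace += "|base-post"; }

    // The well-tested core routine: never overridden, never copied.
    void process() { trace += "core"; }
};
```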

Some subclasses are used to specialize what happens inside the pre/post-processing methods, for example:
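Since the original subclass listing is also missing, here is an illustrative sketch (again with made-up names, and a condensed copy of the base class so that the snippet stands on its own):

```cpp
#include <cassert>
#include <string>

class Solver {  // condensed Non-Virtual Interface base, as sketched above
public:
    void solve() { preprocessing(); trace += "core"; postprocessing(); }
    virtual ~Solver() = default;
    std::string trace;
private:
    virtual void preprocessing()  { trace += "base-pre|"; }
    virtual void postprocessing() { trace += "|base-post"; }
};

// A specialization for some new data source: only the private virtual
// hooks change; the public solve() and the core routine stay untouched.
class DerivedSolver : public Solver {
private:
    void preprocessing()  override { trace += "derived-pre|"; }
    void postprocessing() override { trace += "|derived-post"; }
};
```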

## The issue

So far, so good.

In order to spend less time staring at my screen waiting for a result, I decided to run several threads in parallel, each of them applying a solver object to a line of an image. This was easy to set up using the std::thread class from C++11. I thought I was all done (I even thought of copying my solver object so that each thread held its own copy, to avoid conflicts between threads) but… it turned out that the threads were always using solvers of the base class. The specialized subclasses seemed to be ignored :-/

However, when I checked, I was consistently passing one of my specialized solver objects as an argument to the main routine…

## The solution

### Taking some time to think about it…

Actually, I’ve provided much of the solution in the previous section:

> I even thought of copying my solver object so that each thread held its own copy

This is precisely the mistake:

• my main routine (before dispatching the work to the threads) was inherited from my single-threaded implementation and expected a reference to an object of the base class;
• when used directly from the main routine, the virtual function mechanisms work correctly and call the specialized implementations of Derived::preprocessing();
• however, when copy-constructing solvers during the creation of the threads, the compiler generated a call to the copy constructor of the Base class instead of the Derived class, because I was using a reference to a Base object inside this routine (this is known as object slicing).

### Reproducing the bug

This behaviour can be reproduced using these simple functions and the associated main:
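The exact functions live in the gist; the following self-contained sketch (with illustrative names, not the gist’s exact code) reproduces the same slicing behaviour. The routine only knows a `Solver&`, so the copy made when spawning the thread is a plain `Solver`:

```cpp
#include <cassert>
#include <string>
#include <thread>

struct Solver {
    virtual std::string name() const { return "base"; }
    virtual ~Solver() = default;
};

struct DerivedSolver : Solver {
    std::string name() const override { return "derived"; }
};

// The main routine, inherited from the single-threaded code, takes a
// reference to the base class...
std::string run_parallel(const Solver& solver) {
    std::string result;
    // ...so copying it into the thread calls Solver's copy constructor:
    // the DerivedSolver part is sliced away.
    std::thread worker([copy = solver, &result] { result = copy.name(); });
    worker.join();
    return result;
}
```

Note that a direct virtual call through the reference still works (`solver.name()` returns "derived"), which is what makes the bug so confusing at first.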

You can find the whole project at this gist.

## Consequences

Fortunately, this bug did not take me long to figure out and fix (thanks to some well placed asserts in the code). The fix was easy: instead of creating the solver instances outside the threads, I created them inside, using a small factory function. Since this factory takes only one parameter, it did not even cripple my thread creation calls with lots of additional arguments.
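A sketch of that fix, under the same illustrative names as before (the tag type and factory are my reconstruction, not the original code): the thread receives a cheap tag, and the factory builds the correctly-typed solver inside the thread.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <thread>

struct Solver {
    virtual std::string name() const { return "base"; }
    virtual ~Solver() = default;
};
struct DerivedSolver : Solver {
    std::string name() const override { return "derived"; }
};

enum class SolverKind { Base, Derived };

// The one-parameter factory: returns a heap-allocated solver of the
// requested dynamic type.
std::unique_ptr<Solver> make_solver(SolverKind kind) {
    if (kind == SolverKind::Derived)
        return std::make_unique<DerivedSolver>();
    return std::make_unique<Solver>();
}

std::string run_parallel(SolverKind kind) {
    std::string result;
    std::thread worker([kind, &result] {
        auto solver = make_solver(kind);  // built inside the thread: no slicing
        result = solver->name();
    });
    worker.join();
    return result;
}
```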

### Lessons learned:

• Beware of copy-constructed instances of objects! They can be tricky to get right in the context of polymorphism…
• Concurrency is always harder than you think!
• It’s good to have some debugging hooks and well placed assertions to enforce and check your assumptions… especially when moving from sequential to parallel execution ;)
1. It’s always 1) pre-processing, 2) processing, 3) post-processing.

2. This is also known as the Non-Virtual Interface idiom.

# Compiling OpenCV 3.0 Alpha With CUDA Support on MacOS X

A quick tip for people who are having trouble compiling OpenCV 3.0 alpha on MacOS X.

# Local NL-means, a First Experiment

This is (finally) a follow-up to my latest post!

### From the previous episode…

In a previous post, we have talked of a recent paper that was investigating the locality of NL-means. This paper by Simon Postec pointed out the following paradox:

> The more nonlocal NL-means is, the less good a denoising algorithm it is.

Indeed, this phenomenon had seldom been observed, and even less investigated, since for computational reasons most NL-means implementations (and derivatives) are tested with small learning neighbourhoods only. So, shall we consider that NL-means is local after all?

### Experimental confirmation

In this post, we will confirm experimentally the results from Postec et al.

I will also take advantage of the post to present an implementation of NL-means in Python, because:

• I want to learn and practice Python, and that’s a good reason per se!
• I didn’t find any code from Simon Postec or his co-authors, so let’s do a bit of Reproducible Research^1.

So, let’s dive into the code!

## Extracting patches

NL-means is a patch-oriented processor, so obviously we’ll need a patch extractor. Here is a simple one that applies mirroring at the image boundaries:
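Since the listing did not survive here, this is a minimal sketch of such an extractor (function and variable names are mine, not necessarily the post’s): every row of the returned matrix is one flattened patch, with mirror padding at the borders.

```python
import numpy as np

def extract_patches(image, half_size):
    """Return an (H*W, patch_size**2) matrix of flattened patches."""
    # Mirror the image at its boundaries so that border pixels get
    # full-sized patches too.
    padded = np.pad(image, half_size, mode="reflect")
    size = 2 * half_size + 1
    rows, cols = image.shape
    patches = np.empty((rows * cols, size * size), dtype=image.dtype)
    for i in range(rows):
        for j in range(cols):
            # Patch centered on pixel (i, j), stored as row i*cols + j.
            patch = padded[i:i + size, j:j + size]
            patches[i * cols + j] = patch.ravel()
    return patches
```

Row `i * cols + j` of the matrix is the patch centered on pixel `(i, j)`, which makes the later weight computations straightforward.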

My personal style for implementing NL-means is a bit unusual: I collect all the patches first and store them in a matrix, whereas regular implementations usually build the patches on the fly. Building the patches on the fly has some advantages on lower-memory devices^2, but nowadays RAM is cheap, and building all the patches first allows for easier parallel processing later.

## Comparing the patches

Once we have the patches, NL-means denoises each pixel by comparing the corresponding patch with the patches of all the pixels in a learning neighbourhood centered on this pixel. The denoised value is the weighted average of all these pixels, where the weights are given by the formula:

$w(P(\mathbf{x}), P(\mathbf{y})) = \frac{1}{Z} \exp \left( - \frac{\| P(\mathbf{x}) - P(\mathbf{y}) \|_2^2}{h^2} \right).$

$P(\mathbf{x}), P(\mathbf{y})$ are the two patches to be compared, $h$ is a filtering parameter, and $Z$ a normalization constant (the sum of all the weights in the learning neighbourhood).

For simplicity, I have omitted the multiplication of the patches by a Gaussian window (whose effect is to emphasize the central part of the patch), but in practice this yields only slightly different results.

The only difficulty is how to deal with the central pixel. In the weight formula, the central pixel always gets a weight of 1, which reduces the efficiency of the denoising. Thus, I will not compute a weight for this particular location, but will assign it the maximum of the weights computed inside the learning window.
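As a small self-contained illustration of this weighting scheme (names are mine, not the post’s code): Gaussian weights on squared patch distances, with the self-match’s weight replaced by the best weight found among the other candidates.

```python
import numpy as np

def nlmeans_weights(ref_patch, patches, center_index, h):
    """Normalized NL-means weights for one pixel's learning window."""
    # Squared L2 distance between the reference patch and each candidate.
    dist2 = np.sum((patches - ref_patch) ** 2, axis=1)
    weights = np.exp(-dist2 / h**2)
    # Do not let the self-match (weight 1) dominate: reuse the best
    # weight found among the *other* candidates instead.
    others = np.delete(weights, center_index)
    weights[center_index] = others.max()
    return weights / weights.sum()  # normalize by Z
```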

## Getting the final denoised value

The denoised value for the pixel at hand is obtained by taking the weighted sum of all the pixels in the learning window, adding the value of the central pixel (multiplied by the maximum weight in the neighbourhood) and dividing by the sum of the weights. In other words, a barycenter.

Et voilà !

The final routine is really simple: computing the bounds of the learning window, computing the weights, and assigning the barycenter as output. And this is probably the most stunning part of NL-means: its simplicity.
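For reference, here is a compact, non-optimized sketch of the whole routine (again with my own names, not the exact code of the post; it rebuilds the patches inline so that the snippet stands alone):

```python
import numpy as np

def nlmeans(image, patch_half, window_half, h):
    """Naive NL-means: patches, windowed weights, barycenter output."""
    rows, cols = image.shape
    size = 2 * patch_half + 1
    padded = np.pad(image, patch_half, mode="reflect")
    # One flattened patch per pixel, patch of pixel (i, j) at row i*cols+j.
    patches = np.array([
        padded[i:i + size, j:j + size].ravel()
        for i in range(rows) for j in range(cols)
    ])
    output = np.empty_like(image, dtype=float)
    for i in range(rows):
        for j in range(cols):
            # Bounds of the learning window around (i, j).
            i0, i1 = max(0, i - window_half), min(rows, i + window_half + 1)
            j0, j1 = max(0, j - window_half), min(cols, j + window_half + 1)
            idx = np.array([r * cols + c
                            for r in range(i0, i1) for c in range(j0, j1)])
            ref = patches[i * cols + j]
            dist2 = np.sum((patches[idx] - ref) ** 2, axis=1)
            weights = np.exp(-dist2 / h**2)
            # Replace the self-match weight by the best other weight.
            center = np.flatnonzero(idx == i * cols + j)[0]
            others = np.delete(weights, center)
            weights[center] = others.max() if others.size else 1.0
            values = image.ravel()[idx]
            output[i, j] = np.sum(weights * values) / weights.sum()
    return output
```

On a constant image every weight is 1, so the output equals the input, which makes a handy sanity check.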

## Experimental results

I’ve added Gaussian noise with a standard deviation of 0.2 times the dynamic range to an image of Lena^3. Then, I’ve fixed the value of the $h$ parameter to 2 and run NL-means for two different patch sizes: $5 \times 5$ and $7 \times 7$ pixels. Finally, for each patch size I let the learning neighbourhood size vary from $3 \times 3$ to $21 \times 21$.

The following figure shows the MSE for each patch and learning neighbourhood size. Neighbourhood size 0 corresponds to the original noisy image, and the horizontal axis gives the learning neighbourhood half-size; the full learning neighbourhood size is twice this value plus 1.

As can be seen from the graph, both patch sizes exhibit the same phenomenon: when the learning neighbourhood gets bigger than $5 \times 5$ pixels, the denoising performance starts to decrease. This is slightly worse than the result in Postec et al., but this can be due to the lack of a Gaussian window, a bad choice of $h$ or simply by chance since I used only 1 test image and 1 noise pattern.

### Conclusions and future post

In this post, we have seen an example of an easy, non-optimized implementation of NL-means. Using this implementation, we have been able to confirm the observations from the Postec et al. paper: NL-means efficiency is reduced when the implementation gets more non-local!

## Show me the code!

I plan to release the full code package when the post series is complete. I would like to write another two or three posts^4.

## Next post

In the next post, I will make a small experiment to confirm Postec’s intuition about the locality of NL-means: the nonlocal hypothesis is more easily violated far from the pixel to be denoised.

1. Spoiler: it works.

2. As was the case when NL-means was first published.

3. Yes, let’s be original!

4. Which should take me 2 or 3 years at my current blogging rate :(

# Staying Alive…

Oops, it’s already the end of May and… this is the first post of 2014.

# Breaking News: NL-means Is Local!

The Non-Local Means algorithm^1 is a very popular image denoising algorithm.

Besides its efficiency at denoising, it’s also one of the main causes of the non-locality trend that has arisen in Image Processing conferences ever since. Yet, there is this recent paper on arXiv^2: Non-Local means is a local image denoising algorithm by Postec, Froment and Vedel.

### Some background on NL-means

NL-means is a very simple yet highly efficient image denoising algorithm that preserves image textures well, even with severe noise. In a nutshell, it computes the denoised version of a pixel as a weighted average of all the other pixels in a learning neighborhood around it. The weights are proportional to the similarity between the Gaussian patch extracted around the pixel to denoise and the one around the current candidate from the learning neighborhood.

You can find an introduction to NL-means in these blog posts: 1 2 3.

Since the learning neighborhood can be as big as the whole image, the result of the weighted average is said to be a non-local mean. However, since this greedy search process has to be applied to each pixel of an image, the size of the learning neighborhood is usually kept much smaller than the image^3.

Most of the literature since has focused on accelerating NL-means via some pre-selection process in order to consider in the learning neighborhood only the patches that have a significant contribution, thus ensuring that CPU time is spent only doing useful computations. However, the consequence is that the validity of the non-local approach of NL-means has never been fully studied.

### Summary of the paper

The goal of this paper is to check the validity of the following claim:

> Small learning neighborhood sizes are just a computational artifact. A non-local algorithm should explore the whole image to get the best results.

This is achieved by measuring the average PSNR of the denoising result over 50 images for different sizes of the learning neighborhood. The patch size is set to 7x7 pixels. The test images are corrupted by Gaussian noise of standard deviation 20^4.

### Tested algorithms

Three variants of NL-means are compared in the experiments:

• the original one. Let’s call it NL-means;
• a modified version where only the 80 most similar patches are used to denoise a pixel, instead of all the patches in the learning neighborhood. Let’s call this one VIP^5-NL-means;
• an ideal version of NL-means where the weights of the patches are given by the true, noiseless image. This one is called Oracle NL-means in the paper.

The last two algorithms are also tested in another variant where the value of the central pixel is not used when computing the weight between two patches. This small modification diminishes the bias introduced by the value of the central pixel in a patch: since the patches are modulated by a 2D Gaussian before computing the weight, the central pixel value has more importance than the others.

### Experimental results

The average PSNR curve (Figure 1 of the paper) then refutes the initial claim. It clearly shows that when the radius of the learning neighborhood grows beyond 5, the average PSNR decreases for both NL-means and VIP-NL-means. This does not change whether or not the central pixel of the patches is taken into account. Oracle NL-means, on the other hand, does not exhibit this behaviour.

### Not quite unexpected

It is actually not the first time that the local behavior of NL-means is observed. It is however the first time that a study is dedicated to this particular point.

For example, it was noted before by J. Salmon. On my side, I observed it several years ago (circa 2006) while working with Jean-Michel Morel on a super-fast NL-means that was completely non-local, and it even has a dedicated figure in my PhD thesis^6.

### A probable cause

The common explanation for this phenomenon is that many patches within a small distance from the reference patch are wrong matches. For example, patches that include pixels on both sides of an edge can be slightly shifted without an important weight penalty, in particular when there is a lot of noise!^7

By enlarging the learning neighborhood, more of these spurious matches are found, thus yielding this apparent paradox:

> The more non-local NL-means is, the worse the results.

Postec et al.’s paper backs this explanation by showing experimentally that the results of oracle NL-means do actually improve with the size of the learning neighborhood, unlike genuine NL-means. Thus, when the patch weights are perfectly known a priori, there is no penalty implied by being truly non-local.

### How come it’s never written down?

It is seldom remarked upon by researchers, for two major reasons:

• people tend to stick to small learning neighborhood sizes to keep the computations fast enough, thus avoiding (on purpose or not) bad sizes. For 7-by-7 patches, Postec et al. measure an optimal learning neighborhood size of 4 pixels;
• most of the effort put into speeding up NL-means aims at finding a good pre-selection function, whose goal is to kick non-relevant patches out of the weighted mean, thus limiting this adverse phenomenon.

### What can we do with this information?

The full consequences of Postec et al.’s paper have yet to be worked out, but there is already one positive side effect: it is probably a good vindication of the two-stage denoising process of BM3D and the like:

• the first stage is an aggressive non-local denoising procedure that is too strong and has undesirable side effects;
• the second stage computes a better non-local denoising result by using the intermediate output as the basis for the weights estimation.

Since the first stage applies a harsh denoising function, it estimates an image with very sharp edges and flat regions. This avoids the small variations of patch distances caused by the noise and makes the geometric and textural features more important.

### Conclusion: is it the end of non-locality?

Clearly, it’s not the end of NL-means. Besides the fact that it’s still hard to beat in pure denoising tasks, NL-means had the invaluable merit of provoking this huge non-local trend, and it has led to the study of new approaches to Image Processing tasks, some of which are clearly promising^8.

It is also remarkable that such a simple algorithmic scheme can still exhibit unexpected behaviors almost 10 years after its initial publication! This clearly shows our poor understanding of Image Processing and image models, and it reminds us to stay humble.

In a future post, we’ll check the results of Postec’s paper, hopefully using an IPython notebook^9. Stay tuned!

If you don’t want to miss the next post, you can register to the blog’s feed or follow me on Twitter !

1. Let’s say NL-means for short in the remainder of this post.

2. In French, published first on Nov. 15th, 2013

3. Hey remember, NL-means was published in 2005 in CVPR’05 / SIAM MMS. That’s two years before the first iPhone, and at that time CPU and RAM were still expensive.

4. The initial dynamic range of the images is the classical [0,255] range.

5. Very Important Patches

6. I’m not telling you where you’ll have to read it :-p

7. This is very similar to the adherence phenomenon in block matching.

8. I think of the non-local dictionary based algorithms for example here.

9. Or on github if I don’t manage to do it.

# Online Lectures: Computational Photography (Stanford CS478)

With a rainy weekend ahead, why not spend it learning something?

# Book Review: Instant OpenCV for iOS

Disclaimer: I’ve been a reviewer for this book.

## Some context

A few months ago, I was kindly asked to review a forthcoming book entitled Instant OpenCV for iOS, which is now available (at least as an eBook). The book is published by Packt Publishing, where I bought a few books before, in particular two previous books about OpenCV: an introductory one (by R. Laganière) and a more application-focused one (by several CV bloggers).

Packt Publishing is one of the very few publishers that offer several books about OpenCV. In particular, since the iconic Learning OpenCV by Gary Bradski (which you can find in every CV lab, and on almost any CV student’s desk) is now clearly outdated^1 and still waiting for its second edition, Packt Publishing probably has the most interesting choice of books on the topic^2.

Anyway, I was glad to accept, although the time frame for the review was very short. In my opinion, a bit too short, especially compared to the standard review delays in scientific conferences. But, eh, I was in Taiwan at that time, with some sleepless nights and jet lag to fight, so I signed up.

## The authors

The book is written by two OpenCV specialists, Alexander Shishkov and Kirill Kornyakov. If you follow the OpenCV developers’ site, then you probably know them already, since they are employees of Itseez, the company that supports OpenCV today, and they are the authors of many commits to the library.

So, we have two first class OpenCV connoisseurs here! Furthermore, they work on code optimization for mobile platforms, which can be tricky.

## What’s inside

The book is a short one: 96 pages, divided into 17 recipes (about 5 pages per recipe). This is the format of the Instant… series: the goal is to get started very quickly from a few examples.

Each recipe comprises a short description of the task and an overview of the approach (when it is complex), immediately followed by the corresponding code or step-by-step description of the required actions in the Xcode IDE. Then comes a bit of explanation of what the code is doing, and the recipe ends with some suggestions of applications or more advanced steps to pursue.

The recipes are focused enough to be treated in just a few pages: they range from the very simple Getting started with iOS to the complex Optimizing the performance with ARM NEON extensions.

I have mixed feelings about this book: some things are really cool and interesting, while part of the book does not seem really useful because it tries to reach too many different audiences.

## I liked…

• I really liked the most advanced recipes. You can learn a few things that are terribly difficult to find online, especially the Neon optimization how-to and the usage of the Python build script to get your own build of OpenCV.
• An important part of the book is disclosed here for the first time. This is a big advantage over the collective Practical CV projects book, which included some material already available on the authors’ website and even in the OpenCV documentation.
• I’m also a fan of the approach of the book, where you gradually build up some kind of Image Processing library and expand it with more and more complicated projects. This teaches you a lot about design at the same time as you learn the functions of the library.

## That may be embarrassing…

• On the other hand, I’m definitely not a big fan of the Instant… series format. A recipe starts with a big chunk of code that is hard to understand, and only then do you get (some) explanation. I would rather have the explanation interleaved, and ideally a bigger, more leisurely book.
• Some recipes seem overly simplistic to me. The book tries to introduce iOS development, OpenCV usage and mobile code optimization at the same time. However, my guess is that this is not a book for beginners. Thus, the easier parts that are also covered extensively on the Web (such as installing Xcode and running code on an iDevice) should have been dropped to make room for more advanced recipes or more explanation.
• I believe most of the code examples should be iOS 7 compliant, but honestly I didn’t try.

### Summary

If you’re an experienced iOS developer or OpenCV user who has decided to get serious about mobile Image Processing (for an augmented reality app maybe?), then this book is definitely for you and will show you some useful tricks to get the most out of your device. With mobile 3D scanning coming soon, you had better be ready!

If you’re a beginner with OpenCV, then you should prefer Laganière’s book, which is still the best introductory book to OpenCV 2 for the time being.

1. It is tied to OpenCV 1.

2. I’m not writing that because I reviewed one of their books; I’m writing it because Laganière’s book was one of the very first reliable sources for really learning OpenCV 2. Even today, the results of a query on “OpenCV tutorial” are full of tutorials about version 1.x of the library.

# Ambiance 101

To please Rémi, who wondered under a recent post whether ~~Korea~~ Taiwan was pretty, here are a few ambient images and sounds from my Sunday off in Taipei.

# My PhD Thesis Is Online!

It took some time^1, but my PhD thesis manuscript has been freely available online on EPFL’s publication site for a few days now! Of course, it’s a must read if you are interested in non-local variational image processing methods :-p

Joking aside, the manuscript (in English) is available here, so you can go and make up your own mind. Special thanks go to Matt from Books books books for the careful proofreading and the syntax and spelling corrections.

Over the coming weeks, I will add a few notes roughly describing the main ideas of my work, so don’t hesitate to come back! In the meantime, here is a small Lena teaser below…

1. I had misplaced the online publication form…

# Office Hours

Some time ago, I had to go to Taiwan for work, to help our local partner fine-tune the settings of a system delivered to a customer. I admit I went there with some apprehension, probably because of our European preconceptions about the Merciless World of Asian Labor (and also from having somewhat rubbed shoulders with the Japanese and Korean working masses on other trips). In short, with the customer pressure mounting, I half expected to work 24/7, locked in a basement without ever leaving it.

Well, no, almost not at all! Almost, because it’s true that I saw rather little sunlight, night falling quickly at these latitudes and my days being a bit long in order to get the most out of the trip. Otherwise, apart from one Saturday at the office and stretched-out days (ending when leaving the restaurant, not the office), it was all in all very normal.

So here are a few observations made on site, in two small-to-medium-sized electronics companies^1.

### Working hours

Small surprise: the hours are “normal”. Normal compared to what I have seen in Europe for people on day-based contracts: roughly, arrival around 9 am, actually starting to work a quarter of an hour later, and departure between 6 pm and 7 pm for most people.

In one company I visited, a tune even signals to the employees the end of the mandatory working hours (around 5:30 or 6 pm, from memory^2). And in case of a serious overrun, one of the companies was even located in the same building as a hotel :-p

### The nap

Noon! Everyone stops to eat. Those who haven’t noticed are called to order by the blinking of the lights. The meal is swallowed more or less quickly depending on the person, on site (homemade or takeaway) or outside in one of the many restaurants.

But the real surprise comes after the meal: a soft tune rises, and the lights go out… That’s right! It’s nap time, sitting in your chair (head on the keyboard). Half an hour later, the same little tune and the return of the lights signal the end of the nap.

So, is this a quirk of the company I visited? Hard to say, without having visited other companies. In any case, it’s not easy to opt out of the institution of the nap, and it’s a welcome break in the day to recover after a good bowl of beef noodles or stinky tofu.

And for good measure, one day we were running late, and lunchtime turned into a Burger King swallowed in the car on the highway.

### The weekend

Having been treated to my Saturday at the office, I felt a bit lonely, since there were at most five people present in the morning (and even fewer in the afternoon). As the air conditioning was switched off despite the muggy summer, I conclude that working on Saturdays is not really the norm. And judging from discussions with colleagues, Sunday tends to be reserved exclusively for family activities.

A Saturday at the office, in the rain

In the end, the work week is therefore “normal”.

A gift from the customer for making me work on the weekend :-)

### So, a workers’ paradise?

An institutionalized nap, offices closed on weekends… So, is Taiwan a workers’ paradise? Not quite: there are still things that would cause an outcry here (read: in Western Europe).

For example, the weekend is preserved for family life… because during the week, family life quickly disappears. This is not contradictory with the office hours being “reasonable”: you have to add the commute (easily one hour in Hsinchu) and the company meals and outings, which can eat up a good chunk of the evening^3. It was also hard to find local colleagues with both children and a working wife; it was often one or the other: a magnifying effect of my small sample, or statistical reality?

1. Disclaimer: these observations hold for the companies I visited, which are mostly electronics design offices where the employees are engineers or the like. Of course, workers on an assembly line will have different hours and working conditions, as will restaurant employees, etc. (even though the summer heat imposes a nap in the shade at midday anyway, and the restaurants in Hsinchu close their kitchens relatively early).

2. Between the jet lag and my own working hours, I had trouble remembering that kind of detail.

3. Speaking of eating, by the way: it is hard to find a restaurant still serving food after 9 or 9:30 pm in Hsinchu.