09 May 2006

New Chopin images

Earlier this week, on the old training set, I improved the peak detector to avoid the need to make such strong assumptions about fixed lengths of bars; a fringe benefit was an improvement to the margin detector. All of the earlier work I did on adaptive binarisation for GEMM proved very helpful!

The new training set for Chopin images is ready, and I now have a copy on my machine. It presented an unexpected challenge: black scanning borders around the colour bars. I added a new pre-processing step to remove the black borders before the procedure tries to identify the page margins. So far, it seems to be working.

In order to account for rotation, I am refactoring the staff-system locator to use the average of the autocorrelations of 30 random samples rather than the average column of the image. Next, I need to add a detector for the connectedness of barlines between the staves of a system.
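
Assuming the "30 random samples" are randomly chosen image columns, the idea can be sketched in NumPy (the function name and parameters are my own; this is not the Gamera code):

```python
import numpy as np

def average_column_autocorrelation(img, n_samples=30, seed=0):
    """Average the autocorrelations of randomly sampled columns.

    Peaks in the averaged autocorrelation indicate the dominant
    vertical periodicity (e.g., staff-line spacing) even when any
    single column is distorted by a small rotation.
    """
    rng = np.random.default_rng(seed)
    height, width = img.shape
    cols = rng.choice(width, size=n_samples, replace=False)
    acc = np.zeros(height)
    for c in cols:
        col = img[:, c].astype(float)
        col -= col.mean()                      # remove the DC component
        ac = np.correlate(col, col, mode="full")[height - 1:]
        acc += ac / (ac[0] if ac[0] else 1.0)  # normalise by lag-0 energy
    return acc / n_samples
```

Averaging the per-column autocorrelations, rather than autocorrelating the average column, keeps the periodicity peaks sharp when rotation shifts the staff lines from column to column.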

After that, we shall see whether the system requires any further modification to handle rotations. The ideal solution would be to use a polar FFT, an area of much recent research; I am not sure whether any good libraries exist for it yet. We'll cross that bridge when we come to it.

17 December 2005

Where to now?

Gatos et al. post-process their results with a series of dilation and erosion filters to remove speckles and improve connectivity. Their technique is valuable independent of binarisation and is a good direction for future development.
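
A minimal sketch of that style of morphological post-processing, using SciPy rather than Gamera (the exact filter sequence in the paper differs):

```python
import numpy as np
from scipy.ndimage import binary_closing, binary_opening

def despeckle_and_connect(binary, struct_size=3):
    """Post-process a binarised image in the spirit of Gatos et al.:
    closing (dilation then erosion) bridges small gaps in strokes,
    and opening (erosion then dilation) removes isolated speckles.
    """
    structure = np.ones((struct_size, struct_size), dtype=bool)
    connected = binary_closing(binary, structure=structure)   # fill gaps
    cleaned = binary_opening(connected, structure=structure)  # kill specks
    return cleaned
```

The ordering matters: closing first protects thin strokes from being eaten by the subsequent opening.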

Thresholding à la Gatos et al.

My toolkit for Gamera now has nine new functions:

- one that estimates the greyscale background of a greyscale image
- one that thresholds according to the paper by Gatos et al. listed below
- one that computes the mean of the pixel values in an image
- one that computes the variance of the pixel values in an image
- one that computes the local mean of the pixel values surrounding each pixel of an image
- one that thresholds according to Niblack's adaptive method
- one that thresholds according to Sauvola's adaptive method
- one that computes the local variance of the pixel values surrounding each pixel of an image
- one that de-noises the image with an adaptive Wiener filter

These functions allow for a complete implementation of the method outlined in Gatos et al. The method consists of four steps, which are illustrated here for the image displayed below.

Original image

The first step is to de-noise the image with a Wiener filter. The standard kernel is 5×5, but due to the level of debris in these images, it might be worth experimenting with larger kernels.
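
With SciPy, this de-noising step might look like the following (the window size is the main parameter to tune):

```python
import numpy as np
from scipy.signal import wiener

def denoise(grey, size=5):
    """Adaptive Wiener filter: smooths aggressively in flat regions
    while preserving edges where the local variance is high. Gatos
    et al. use a 5x5 window; noisier scans may call for larger ones.
    """
    return wiener(grey.astype(float), mysize=(size, size))
```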

De-noised image after applying a 5×5 Wiener filter

Following the Wiener filter, the method requires a preliminary estimation of the foreground and background regions. Gatos et al. use Niblack’s method.

Binarised image using Niblack’s method

This estimation is combined with the original image to estimate what the background (i.e., the paper, stains and all) looks like in greyscale. For my images, foreground regions seem to bleed into the background regions, so it might be worth experimenting further with these parameters. If there were a place to alter the model more radically, it would also be here: no adjustment of parameters can make up for the fact that any foreground pixel missed in the preliminary estimation bleeds into the background estimation under the model of Gatos et al. For their documents, Niblack thresholding seems to have produced a superset of the foreground pixels, but that has not been my experience with microfilm scans of early music.
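
A naive sketch of this background-surface step (the function name and window size are my own, and the paper interpolates with a slightly different weighting), which also makes the weakness concrete: any foreground pixel the preliminary mask missed is copied straight into the background estimate.

```python
import numpy as np

def estimate_background(grey, fg_mask, window=15):
    """Background surface in the spirit of Gatos et al.: keep the
    original value where the preliminary mask says background, and
    interpolate from surrounding background pixels where it says
    foreground.
    """
    h, w = grey.shape
    half = window // 2
    bg = grey.astype(float).copy()
    ys, xs = np.nonzero(fg_mask)
    for y, x in zip(ys, xs):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        x0, x1 = max(0, x - half), min(w, x + half + 1)
        patch = grey[y0:y1, x0:x1]
        keep = ~fg_mask[y0:y1, x0:x1]
        # fall back to the global background mean if the window
        # contains no background pixels at all
        bg[y, x] = patch[keep].mean() if keep.any() else grey[~fg_mask].mean()
    return bg
```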

Estimated background

Once the background has been estimated, the threshold varies adaptively according to a sigmoid function of the estimated background value. Here again, mistakes in the background estimate reduce the overall performance of the method. Nonetheless, even without any adjustment for our domain, the Gatos method appears to outperform other standard algorithms, such as Sauvola's improvement on Niblack thresholding, Otsu's global thresholding algorithm, and Bernsen's adaptive thresholding algorithm. Bernsen thresholding would be the best alternative, but close examination shows how it fails to account for the darkening of the background where the binding of this manuscript cast shadows.
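
A simplified sketch of this final step (the paper's exact distance function and parameter values differ; q and p here are placeholders of my own): a pixel is foreground when it is sufficiently darker than the estimated background, with "sufficiently" scaled by a logistic function of the background value, so that darker regions such as binding shadows tolerate a smaller contrast.

```python
import numpy as np

def gatos_binarise(grey, background, q=0.6, p=0.8):
    """Sigmoid-modulated adaptive thresholding against an estimated
    background surface. Not the published formula; a structural
    sketch only.
    """
    delta = float(np.mean(background - grey))  # avg fg/bg distance
    b = float(np.mean(background))             # avg background level
    # logistic weight in [p, 1]: near 1 on bright paper,
    # nearer p in shadowed regions
    w = p + (1.0 - p) / (1.0 + np.exp(-4.0 * (background - b) / max(b, 1e-9)))
    return (background - grey) > q * delta * w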

Binarised image using Gatos’s method

Binarised image using Sauvola’s method

Binarised image using Otsu’s method

Binarised image using Bernsen’s method

21 November 2005

Comparison: Otsu's method

For comparison, below is the result of Otsu's global thresholding method on the same image as the samples below:

Otsu's method
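
For reference, Otsu's global method is simple enough to sketch directly: choose the single threshold that maximises the between-class variance of the grey-level histogram.

```python
import numpy as np

def otsu_threshold(grey):
    """Return the grey level that maximises between-class variance.
    A from-scratch sketch, not the Gamera implementation.
    """
    hist, _ = np.histogram(grey, bins=256, range=(0, 256))
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(256):
        w0 += hist[t]           # weight of the dark class
        if w0 == 0:
            continue
        w1 = total - w0         # weight of the light class
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Its weakness for these images is visible above: a single global threshold cannot follow the uneven illumination of a microfilm scan.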

Another paper

Here is another paper for the EndNote database:

  1. Yanowitz, S.D. and A.M. Bruckstein. 1989. A new method for image segmentation. Computer Vision, Graphics, and Image Processing 46(1).
It details a gradient-based algorithm for cleaning up binarisations. It is not part of the Gatos process, but it might be useful to add it to our collection in Gamera once that system is finished.

Sample images: adaptive thresholding

Niblack's and Sauvola's methods for adaptive binarisation are two classics of the field and have just been implemented for Gamera. Here is a sample of each algorithm at work (using global bounds for very light and very dark regions):
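
The two thresholding rules themselves are compact: Niblack sets T = m + k·s over a local window, and Sauvola rescales the deviation term by its dynamic range R, which suppresses Niblack's noise in flat background regions. A sketch (window size and parameter values are common defaults, not necessarily the ones used for the samples below):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_stats(grey, size=15):
    """Local mean and standard deviation over a size x size window."""
    mean = uniform_filter(grey, size)
    mean_sq = uniform_filter(grey * grey, size)
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))
    return mean, std

def niblack(grey, size=15, k=-0.2):
    """Niblack: T = m + k*s, with k negative for dark-on-light text."""
    m, s = local_stats(grey.astype(float), size)
    return grey < m + k * s

def sauvola(grey, size=15, k=0.5, R=128.0):
    """Sauvola: T = m * (1 + k*(s/R - 1)), where R is the assumed
    dynamic range of the standard deviation."""
    m, s = local_stats(grey.astype(float), size)
    return grey < m * (1.0 + k * (s / R - 1.0))
```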

Original image

Binarisation from Niblack's method

Binarisation from Sauvola's method

The next step is to implement smarter pre- and post-processing around these algorithms à la the Gatos paper.

20 November 2005

Niblack and Sauvola thresholding are now working in Gamera! The results are OK but not great, as would be expected. Samples will be available soon.

24 October 2005

New binarisation article

This new article on adaptive binarisation is a beauty:

  1. Gatos, Basilios, Ioannis Pratikakis, and Stavros J. Perantonis. 2004. An adaptive binarization technique for low quality historical documents. Lecture Notes in Computer Science 3163: 102–113.
I found it while looking for an implementation of the Niblack algorithm because our library does not own the book. I have added the article to the EndNote database.

I think that a next step in our project could be to implement the Niblack and Sauvola adaptive binarisation algorithms, both of which are classic and simple. After that, we could consider adding the other algorithms in the Badekas paper and the extra techniques in the Gatos paper.

Performance-based metrics are important, and I think that our best bet at this point is to feed the output of these algorithms into the existing kNN classification framework in Gamera. The Gatos paper used an existing OCR system to evaluate its results against those of previous algorithms.

21 October 2005

The GEMM page is up! Check out http://coltrane.music.mcgill.ca/GEMM/.

08 October 2005

The Shih-Pu skeletonisation algorithm works! Moreover, it is quite fast.

The primary problems were stack overflows due to uncontrolled recursion. The paper's description of the algorithm is, as has been noted in our group e-mail exchanges, somewhat underspecified, which was the root of the error.

The other problem is that the skeletons simply aren't connected. So far as I can tell, this is due to a logical error in the proof given in the paper. They are, however, reasonable approximations of the shapes of the input.

Two examples are below. The algorithm returns the skeleton with more information than the one-bit, thresholded versions presented here. The second image was converted to greyscale and thresholded with otsu_threshold prior to skeletonisation. (DjVu thresholding proved too grainy.)