23 February, 2014

A simple image segmentation example in MATLAB

Sometimes we need to identify pixels belonging to different objects. See the following image:

There are three objects in the image: a jumping man, the blue sky and the white snow. Suppose that we want to segment the jumping man, that is, mark all the pixels belonging to him. This is the basic goal of every image segmentation task.

If we look at the image, we can see that the easiest way to segment the man is to use the color information. The sky is definitely blue, although it has a considerable vertical gradient from dark to light blue. The snow is white, with some gray shadows on it. The man has several different colors, but none of them is blue. So, if we identify and remove the blue pixels, two big objects remain: the upper one is the jumping man.

If you are not familiar with handling RGB images, please have a look at the section "Usual indexing into 3d matrices" of this matrix indexing tutorial.

Open the image and visualize the three color channels. In addition, we store the channels in three separate variables:

img = imread('jump.jpg');       % read image ("img" avoids shadowing the built-in image function)

% get image dimensions: an RGB image has three planes
[height, width, planes] = size(img);

% reshaping puts the RGB planes next to each other, generating
% a two-dimensional grayscale image
rgb = reshape(img, height, width * planes);

imagesc(rgb);                   % visualize RGB planes
colorbar                        % display colorbar

r = img(:, :, 1);               % red channel
g = img(:, :, 2);               % green channel
b = img(:, :, 3);               % blue channel

The result is:

In the blue channel the sky is brighter and the man is darker, so our first idea could be to cut the blue channel at a given threshold. All pixels below the threshold may belong to the objects:

threshold = 100;                % threshold value
imagesc(b < threshold);         % display the binarized image

Setting the threshold at 100, 110 and 120 gives us the following three results respectively:

At low thresholds we lose several pixels of the man. As the threshold increases, more and more sky pixels fall below it, so the sky and the man cannot be separated any more. The problem is that a pixel with a high blue component is not necessarily blue. You can see some colors and their components below:

$$ \color{#ff0000}{(255, 0, 0)} \\ \color{#00ff00}{(0, 255, 0)} \\ \color{#0000ff}{(0, 0, 255)} \\ \color{#7f7fff}{(127, 127, 255)} \\ \color{#ff00ff}{(255, 0, 255)} \\ \color{#00ffff}{(0, 255, 255)} \\ $$

The last two colors have a high blue component, but they are not actually blue, since one of the other components is high, too.

Here comes the idea: we do not actually want the pixels with a high blue component, we want the blue pixels. If you look carefully at the colors above, you can see that a pixel is really blue if its blue component is high and the others are low. Here is a simple equation for the blueness of a pixel:

$$b = B - \max(R, G)$$

See the result for the colors above:

$$ \begin{array}{rcr} \color{#ff0000}{(255, 0, 0)} & \rightarrow & -255 \\ \color{#00ff00}{(0, 255, 0)} & \rightarrow & -255 \\ \color{#0000ff}{(0, 0, 255)} & \rightarrow & 255 \\ \color{#7f7fff}{(127, 127, 255)} & \rightarrow & 128 \\ \color{#ff00ff}{(255, 0, 255)} & \rightarrow & 0 \\ \color{#00ffff}{(0, 255, 255)} & \rightarrow & 0 \\ \end{array} $$
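The values in the table can be reproduced with a few lines of MATLAB. This is only a sketch: the variable names (`colors`, `highBlue`, `blueness`) and the cut-off of 200 are chosen for illustration and do not come from the original code.

```matlab
% Six sample colors, one RGB triple per row (values from the list above)
colors = uint8([255   0   0;    % red
                  0 255   0;    % green
                  0   0 255;    % blue
                127 127 255;    % light blue
                255   0 255;    % magenta
                  0 255 255]);  % cyan

% Convert to double first: uint8 arithmetic saturates at 0,
% so negative blueness values would otherwise be clipped
R = double(colors(:, 1));
G = double(colors(:, 2));
B = double(colors(:, 3));

% Thresholding the blue component alone (200 is an arbitrary,
% illustrative cut-off) also flags magenta and cyan
highBlue = B > 200;

% Blueness as defined above: B - max(R, G), evaluated row by row
blueness = B - max(R, G);

disp(highBlue');
disp(blueness');
```

The conversion to `double` matters: subtracting `uint8` values directly would clip every negative result to zero.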

This approach seems much better. Let's turn it into MATLAB code:

% apply the blueness calculation
blueness = double(b) - max(double(r), double(g));
imagesc(blueness);              % visualize the blueness map
colorbar                        % display colorbar

The result below separates the sky from the rest much more clearly. We only have to choose a suitable threshold to segment the blue pixels. Here 45 is chosen:

% apply thresholding to segment the foreground
mask = blueness < 45;


There are still some unwanted objects in the image, for example the snow and some noise. To remove them, we calculate a so-called label image, in which all pixels belonging to the same object have the same value. We know a priori that the pixel at (200, 200) belongs to the jumping man, so we read the label of that pixel: all pixels with the same label belong to the man.

% create a label image, where all pixels having the same value
% belong to the same object; for example (the label numbers are
% illustrative, the actual numbering depends on the scan order):
% 1 1 0 1 1 0      1 1 0 2 2 0
% 0 1 0 0 0 0      0 1 0 0 0 0
% 0 0 0 1 1 0  ->  0 0 0 3 3 0
% 0 0 1 1 1 0      0 0 3 3 3 0
% 1 0 0 0 1 0      4 0 0 0 3 0
labels = bwlabel(mask);

% get the label at point (200, 200)
id = labels(200, 200);

% get the mask containing only the desired object
man = (labels == id);

% save the image in PPM (portable pixel map) format
imwrite(man, 'man.ppm');
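To see how the label lookup behaves, the toy matrix from the comment above can be run through the same steps. This is only a sketch: the variable names (`toy`, `toyLabels`, `toyId`, `toyObject`) and the seed position are chosen for illustration, and `bwlabel` requires the Image Processing Toolbox.

```matlab
% Toy binary image taken from the comment above
toy = logical([1 1 0 1 1 0;
               0 1 0 0 0 0;
               0 0 0 1 1 0;
               0 0 1 1 1 0;
               1 0 0 0 1 0]);

toyLabels = bwlabel(toy);       % 8-connected components by default

% Seed pixel: indexing is (row, column), so this is row 4, column 4
seedRow = 4;
seedCol = 4;
toyId = toyLabels(seedRow, seedCol);

if toyId == 0
    % label 0 means the seed fell on the background
    error('Seed pixel does not belong to any object.');
end

% Keep only the object that contains the seed pixel
toyObject = (toyLabels == toyId);
disp(nnz(toyObject));           % number of pixels in the selected object
```

The check for label 0 is worth keeping in real code as well: if the seed pixel happens to lie on the background, comparing against 0 would select the whole background instead of an object.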

If we want to use the resulting mask, for example, for designing a logo, we can convert it to a vector graphics format with a great tool called potrace. The input of potrace is a bitmap, the output is a vector graphics file.

potrace man.ppm -i -c -s -o man.svg

Option -i inverts the image, -c produces plain-text output instead of compressed output, -s selects the SVG backend, and -o sets the output file name. The original PPM image is jagged, but the final, vectorized object looks smooth and great:


The technique above is similar to chroma keying, which is widely used for removing a single-colored background. Because we know the background color, we can easily remove it, cut out the foreground, and use a different image as the background.
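The background replacement step can be sketched on a tiny synthetic image. The arrays `fg` and `bg` and the names `keep`, `keep3` and `composite` are assumptions made for this illustration; only the blueness measure and the threshold of 45 are taken from the code above.

```matlab
% Synthetic 2x2 foreground: left column "subject", right column blue "screen"
fg = zeros(2, 2, 3, 'uint8');
fg(:, 1, 1) = 200;              % reddish subject pixels
fg(:, 2, 3) = 255;              % pure blue background pixels

% Replacement background: uniform gray
bg = 128 * ones(2, 2, 3, 'uint8');

% Same blueness measure and threshold as above
r = double(fg(:, :, 1));
g = double(fg(:, :, 2));
b = double(fg(:, :, 3));
keep = (b - max(r, g)) < 45;    % true where the pixel is NOT blue

% Replicate the 2-D mask over the three planes and copy the
% kept foreground pixels onto the new background
keep3 = repmat(keep, [1 1 3]);
composite = bg;
composite(keep3) = fg(keep3);

disp(squeeze(composite(1, 1, :))');   % subject pixel, kept from fg
disp(squeeze(composite(1, 2, :))');   % blue pixel, replaced by bg
```

A hard mask like this leaves jagged edges on real photos; it is meant only to show the principle.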

