# Vectorization example: select pixels of a given area

During signal processing, it is often needed to select a subset of the input data: in this post a similar task is solved demonstrating vectorization of MATLAB code. Two variants of the vectorized algorithm is analyzed: surprisingly one of them is slower than the straightforward way using *for* loops.

## The problem

The input data is ordered into columns. For each column we calculate a so-called border point: pixels below this border belong to the area we want to select, while the others not. The goal is to select all points of the area.

### A demonstration for better understanding

In matrix **M** for each column we search the first point where the signal begins to increase, and store the appropriate row-indices:

M = [17 5 2 6 7; 5 6 413 2; 6 5 2 532; 021 0 7 3; 0 3 0 0 8 4] Y = [1 4 0 2 3 3]

The border points are marked by bold: their row-indices are stored in vector **Y**. To select the values in the segmented area, we shall use the following **A** logical matrix:

A = [1 1 0 1 1 1; 0 1 0 1 1 1; 0 1 0 0 1 1; 0 1 0 0 0 0; 0 0 0 0 0 0]

Write a function to generate the indexing matrix based on the border points using the following skeleton:

% generate matrix of logical indices for the area % segmented by the values in Y function A = generateAreaMatrix(rowsM, Y) % dimensions of the output A = logical(zeros(rowsM, length(Y))); end

## The straightforward solution

The straightforward solution is to loop through each column, and set the elements from the first point to the border to one:

function A = generateAreaMatrix_1(rowsM, Y) % create a matrix of zeros A = logical(zeros(rowsM, length(Y))); % loop through the columns for i = 1 : length(Y) % set the appropriate points of the ith column to one A(1 : Y(i), i) = 1; end end

A sample usage is:

>>A = generateAreaMatrix_1(5, [1 4 0 2 3 3])A = 1 1 0 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0

## A vectorized solution using *repmat* function

To understand the basic idea behind this solution is, please have a look at the following code sample:

rowsM = 5; % number of rows of the input y = 4 % index of a border-point indices = 1 : rowsM % a map containing row indices areaMap = indices <= y % points under border point

The output is:

y = 4 indices = 1 2 3 4 5 areaMap = 1 1 1 1 0

By creating a map of indices and comparing it to the index of the border point – which is always somewhere between zero and the number of the rows –, all indices below the border will turn to one. In the example above the border is the fourth point, so the first four elements of the five belongs to the area.

We can use the repmat function to turn this idea into a working code. We have to create a map containing the row indices and a map of the borders:

function A = generateAreaMatrix_2(rowsM, Y) r = (1 : rowsM)'; % vector of indices R = repmat(r, [1 length(Y)]) % matrix of indices Y = repmat(Y, [rowsM, 1]) % matrix of borders A = R <= Y; % do the comparison end

The output of this function with some debugging information:

>>A = generateAreaMatrix_1(5, [1 4 0 2 3 3])R = 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 Y = 1 4 0 2 3 3 1 4 0 2 3 3 1 4 0 2 3 3 1 4 0 2 3 3 1 4 0 2 3 3 A = 1 1 0 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0

Matrix **R** holds the row indices, while matrix **Y** contains the indices of border points. The result of the comparison equals to the desired output.

## Another vectorized solution using *bsxfun* function

Function bsxfun is a complicated one being explained in a future post. Basically it takes two inputs and applies an element-by-element operation given as a third parameter on them. An important thing is the following, cited from the MATLAB reference:

The corresponding dimensions of A and B must be equal to each other or equal to one. Whenever a dimension of A or B is singleton (equal to one), bsxfun virtually replicates the array along that dimension to match the other array.

It means, if we feed the following vectors into bsxfun, it repeats them so, as we did manually in the previous solution.

R = [1; 2; 3; 4; 5] Y = [1 4 0 2 3 3]

Column vector **R** will be repeated to match the number of columns in **Y**, while row vector **Y** will be repeated to match the row count of **R**. So the main part of our function turns to a single line:

function A = generateAreaMatrix_3(rowsM, Y) % @le stands for the operation that the second % parameter is less than or equal to the third A = bsxfun(@le, (1 : rowsM)', Y); end

The approach works like a charm:

>>A = generateAreaMatrix_1(5, [1 4 0 2 3 3])A = 1 1 0 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0

## Results of time measurement

The algorithms were run multiple times on a random input vector and the average run-time was calculated, as the code sample shows below:

rowsM = 255; Y = randi([1 rowsM], [1 2000]); runTime = 0; N = 100; for run = 1 : N tic; A = generateAreaMatrix(rowsM, Y) runTime = runTime + toc; end runTime = runTime / N

The measurement was done on different border lengths. The rows of the table contain the measured average time in milliseconds, while in the columns the length of the border is given:

1000 2000 5000 10000 25000Method #11.2 3.0 6.8 13.9 34.4Method #22.3 6.7 13.0 25.9 62.9Method #30.1 1.1 1.2 1.7 3.8

What is surprising, that the vectorized method using the repmat function is about twice slower, than the naive, non-vectorized way using *for* loops. It is because that method needs lots of memory and deals with large matrices which results in slower speed.

The method using bsxfun is far the fastest. Have a look again at the citation from the MATLAB documentation:

Whenever a dimension of A or B is singleton (equal to one), bsxfun **virtually** replicates the array along that dimension to match the other array.

In case of this solution repeating of the vectors is done virtually: we have to deal only with one large matrix for the output. That makes this approach very fast.

As a conclusion we can say, that vectorization is a very efficient way to get faster code, but we have to do it carefully: a vectorized code itself not always means faster runtime, as the second method clearly demonstrates it.