- Can you show me a simple example of vectorization on a MATLAB Cody problem?
- Is the vectorized code faster than the code using for loops?
MATLAB Cody is a good starting point for anyone who wants to learn MATLAB. A single task can have multiple solutions, and it is worth following how they evolve and how efficient they are.
The task analyzed here is Cody problem #17:
Given an input vector x, find all elements of x less than 0 or greater than 10 and replace them with NaN.
An example input and output pair:
x = [ 5 17 -20 99 3.4 2 8 -6 ]
y = [ 5 NaN NaN NaN 3.4 2 8 NaN ]
The straightforward solution is:
function x = clean_data(x)
    for i = 1 : length(x)
        if x(i) < 0 || x(i) > 10
            x(i) = NaN;
        end
    end
end
To measure the run time of this code, we first generate a random input vector with 10000 elements and then time the call with tic and toc:
x = floor(rand(1, 10000) * 20 - 5);
tic
clean_data(x);
toc
The output is:
Elapsed time is 0.09075 seconds.
Now let us turn to another approach that uses vectorization. First, have a look at the following demonstration code:
% an example input for demonstration
x = [-1 5 11 8 20 2]

gt10 = x > 10   % logical indices of elements greater than 10
lt0 = x < 0     % logical indices of elements less than 0

% combining logical indices by using the OR operator
indices = gt10 | lt0

% demonstration of selecting values using logical indexing
values = x(indices)

% replace the values selected by indices with NaN
x(indices) = NaN
The output is:
x =
    -1     5    11     8    20     2
gt10 =
     0     0     1     0     1     0
lt0 =
     1     0     0     0     0     0
indices =
     1     0     1     0     1     0
values =
    -1    11    20
x =
   NaN     5   NaN     8   NaN     2
The most important thing to know is that comparing a vector with a constant yields a logical array with the same dimensions as the input: it contains 1 (true) where the comparison holds and 0 (false) elsewhere.
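To make this concrete, here is a minimal sketch that inspects the result of such a comparison. The variable name mask is only an illustrative choice and does not appear in the original code.

```matlab
x = [-1 5 11 8 20 2];

% comparing a vector with a scalar compares every element with that scalar
mask = x > 10;

disp(class(mask))                      % logical
disp(isequal(size(mask), size(x)))     % 1, same dimensions as the input
disp(nnz(mask))                        % 2, number of true entries (11 and 20)
```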
The steps in this piece of code are the following:
- The variable gt10 contains true where the value is greater than 10; the variable lt0 identifies the elements less than 0.
- The indices variable is then created with the element-wise OR operator (|): the resulting vector contains true where the value is outside the range 0 to 10.
- The values variable demonstrates that logical arrays can be used to index an array: only the elements where the logical index is true are selected. Indexing x with indices therefore reads exactly the out-of-range values.
- In the last line, logical indexing is used to set the elements that are outside the range 0 to 10 to NaN.
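One detail in the steps above is easy to miss: the loop version uses the short-circuit operator ||, which works on scalar conditions, while the vectorized steps need the element-wise operator |. The sketch below contrasts the two. The names outOfRange, inRange, v and isOut are illustrative assumptions, not part of the original code.

```matlab
x = [-1 5 11 8 20 2];

% element-wise OR: combines two logical vectors position by position
outOfRange = (x < 0) | (x > 10);

% short-circuit OR: meant for scalar conditions, as inside the for loop
v = x(3);
isOut = v < 0 || v > 10;

% the same mask can be built as the negation of the in-range condition
inRange = (x >= 0) & (x <= 10);

disp(outOfRange)
disp(isOut)
disp(isequal(outOfRange, ~inRange))
```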
Having analyzed the steps above, we can now write the following solution:
function x = clean_data(x)
    x(x < 0 | x > 10) = NaN;
end
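As a quick sanity check, the one-line approach can be tried on the example pair from the problem statement. The statement is inlined here so the snippet runs on its own, and the name expected is an assumption for illustration. Because NaN is never equal to NaN, the comparison uses isequaln rather than isequal.

```matlab
x = [5 17 -20 99 3.4 2 8 -6];
expected = [5 NaN NaN NaN 3.4 2 8 NaN];

% same statement as in the function body, applied to a copy of x
y = x;
y(y < 0 | y > 10) = NaN;

% isequaln treats NaN values as equal to each other
assert(isequaln(y, expected));

% the comparisons are strict, so the boundary values 0 and 10 are kept
b = [0 10];
b(b < 0 | b > 10) = NaN;
assert(isequal(b, [0 10]));
```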
The code is now much simpler, and the for loop has been eliminated. After measuring the run time, the result is:
Elapsed time is 0.000185013 seconds.
The vectorized code is much faster: in this measurement it is roughly 490 times faster (0.09075 seconds versus 0.000185013 seconds). This approach is worth studying and using in daily work.
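A single tic/toc reading can be noisy, and the first call of a function may include one-time overhead, so the exact ratio will vary between machines and MATLAB versions. The following is a minimal sketch of a repeated measurement, assuming 20 runs and the median as a summary. Both choices, and the names nRuns, tLoop and tVec, are illustrative rather than taken from the original measurement. The two approaches are inlined so the snippet is self-contained.

```matlab
x = floor(rand(1, 10000) * 20 - 5);
nRuns = 20;                  % arbitrary number of repetitions
tLoop = zeros(1, nRuns);
tVec = zeros(1, nRuns);

for k = 1 : nRuns
    % loop-based approach
    y1 = x;
    tic
    for i = 1 : length(y1)
        if y1(i) < 0 || y1(i) > 10
            y1(i) = NaN;
        end
    end
    tLoop(k) = toc;

    % vectorized approach
    y2 = x;
    tic
    y2(y2 < 0 | y2 > 10) = NaN;
    tVec(k) = toc;
end

% both approaches must produce the same result (NaN-aware comparison)
assert(isequaln(y1, y2));

fprintf('loop:       %.6f s (median of %d runs)\n', median(tLoop), nRuns);
fprintf('vectorized: %.6f s (median of %d runs)\n', median(tVec), nRuns);
```

The measured gap may be smaller or larger than the figures quoted above, since the run time of loops depends heavily on the MATLAB release and on how the code is invoked.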