# [matlab] performance for-loops vs. vectorization vs. bsxfun

From time to time I explain my students certain concepts. To archive those and as an extended memory, I share them here. We also recently had some discussion on vectorization in our research group. e.g. in python and matlab. With the second link claiming for-loops in matlab are performing much better than before.

## Goal

Show that for-loops are still quite slow in matlab. Compare bsxfun against vectorized arithmetic expansion in matlab against bsxfun

## The contenders

• good old for-loop: Easy to understand, can be found everywhere, slow
• arithmetic expansion: medium difficulty, should be general used, fast
• bsxfun: somewhat difficult to understand, I use it regularily, fast (often)

## Comparisons

While demonstrating this to my student, I noticed that subsetting an array has interesting effects on the performance differences. The same is true for different array sizes. Therefore, I decided to systematically compare those.

I subtract one row from either a subset (first 50 rows, dashed line) or all rows of an [n x m] matrix with n= [100, 1000, 10 000] and m = [10, 100, 1000, 10 000]. Mean + SE

## Three take home messages:

• for loop is very slow
• vectorization is fastest for small first dimension, then equally fast as bsxfun
• bsxfun is fastest if one needs to subset a medium sized array (n x m >100 x 1000), but see update below!

## Update:

Prompted by Anne Urai, I redid the analysis with multiplication & devision. The pattern is the same. I did notice that allocating new matrices before doing the arithmetic expansion (vectorization) results in the same behaviour as bsxfun (but more lines of code)

A = data(ix,:);
B = data(1,:);
x = A./B;

## matlab code

tAll = [];
for dim1 = [100 1000 10000]
for dim2 = [10 100 1000 10000]
tStart = tic;
for subset = [0 1]
if subset
ix = 1:50;
else
ix = 1:dim1;
end
for run = 1:10
data = rand(dim1,dim2);

% for-loop
x = data;
tic
for k= 1:size(data,2)
x(ix,k) = data(ix,k)-data(1,k);
end
t = toc;
tAll = [tAll; table(dim1,dim2,subset,{'for-loop'},t)];
%vectorized
tic
x = data(ix,:)-data(1,:);
t = toc;
tAll = [tAll; table(dim1,dim2,subset,{'vectorization'},t)];
% bsxfun

tic
x= bsxfun(@minus,data(ix,:),data(1,:));
t = toc;
tAll = [tAll; table(dim1,dim2,subset,{'bsxfun'},t)];
end
end
fprintf('finished dim1=%i,dim2=%i - took me %.2fs\n',dim1,dim2,toc(tStart))
end
end

% Plotting using the awesome GRAMM-toolbox
% https://github.com/piermorel/gramm
figure
g = gramm('x',log10(tAll.dim2),'y',log10(tAll.t),'color',tAll.Var4,'linestyle',tAll.subset);
g.facet_grid([],tAll.dim1)
g.stat_summary()
g.set_names('x','log10(second dimension [n x *M*])','y','log10(time) [log10(s)]','column','first dimension [ *N* x m]','linestyle','subset 1:50?')
g.draw()

Categorized: Blog

Tagged:

1. · 22. November 2017 Reply

In R2016b and later, Matlab automatically applies arithmetic expansion so that bsxfun is no longer necessary: https://blogs.mathworks.com/loren/2016/10/24/matlab-arithmetic-expands-in-r2016b/ would be interesting to see whether the performance difference between bsxfun and vectorisation.

• · 23. November 2017 Reply

I think
x = data(ix,:)-data(1,:);
uses already the arithmetic expansion, I clarified this in the article. All calculations were performed in R2016b and will fail in earlier version. I repeated the analysis with multiplication and division and the general shape of the results are identical. I.e. bsxfun is faster if a subset is selected, without subset-indexing, the performance seems identical.

Prompted by your comment I now tried this:
A = data(ix,:);
B = data(1,:);
x = A./B;

which shows that arithmetic expansion performs with same speed as bsxfun also for indexed matrices, except for small arrays where arithmetic expansion is faster.