18
votes

I've got MATLAB code inserting n-dimensional points (n >1) into a matrix (myPointMatrix) and am having thoughts about how to insert the first point.

Right now the program checks the size of myPointMatrix before inserting a point. If it is 1x1, myPointMatrix is set equal to the current point. Otherwise the current point is appended. This if-statement is only true once, but is evaluated each time I insert a point, which is very very often.

Removing the if and trying to append to myPointMatrix makes MATLAB understandably complain about matrix dimensions not being consistent. Removing both the if-statement and the inialization of myPointMatrix = 0 causes MATLAB to find myPointMatrix undefined. Also understandable.

How do I initialize myPointMatrix so that I can remove the if-statement? Or is there some other smart solution?

myPointMatrix = 0;
for x=0:limit
    for y=0:limit
        for z=0:limit
            tempPoint = [x y z];
            if (length(myPointMatrix) == 1)
                myPointMatrix = tempPoint;
            else
                myPointMatrix = [myPointMatrix; tempPoint];
            end
        end
    end
end
6

6 Answers

29
votes

There are several ways to append a matrix or vector to any matrix, empty or not. Much depends upon the size of the matrix, and how often you will do the append. (Note that sparse matrices are an entirely different animal. They need to be deal with separately.)

The simple scheme would use concatenation. For example, I'll create a random array. While I know that one call to rand would be the proper solution here, I'm doing it only for comparison purposes.

n = 10000;
tic
A = [];
for i = 1:n
  Ai = rand(1,3);
  A = [A;Ai];
end
toc

Elapsed time is 9.537194 seconds.

See that the time required was reasonably high, far more than had I just called rand directly.

tic,rand(n,3);toc
Elapsed time is 0.008036 seconds.

Other ways to append are similar in time. For example, you can append by indexing too.

A = [];
A(end+1,:) = rand(1,3);
A
A =
      0.91338      0.63236      0.09754

This will be similar in terms of time to appending via concatenation. An interesting fact to understand is that appending new rows to an array is subtly different than appending new columns. It takes slightly more time to append a row than a column. This is because of the way elements are stored in MATLAB. Appending a new row means the elements must actually be shuffled in memory.

A = zeros(10000,3);
B = zeros(3,10000);

tic,for i = 1:100,A(end+1,:) = rand(1,3);end,toc
Elapsed time is 0.124814 seconds.

tic,for i = 1:100,B(:,end+1) = rand(3,1);end,toc
Elapsed time is 0.116209 seconds.

The problem with any append operation at all is that MATLAB must reallocate the memory required for A, and do so EVERY time the matrix grows in size. Since the size of A grows linearly, the overall time required grows quadratically with n. So were we to double the size of n, the dynamically grown A will take four times as long to build. This quadratic behavior is why people tell you to pre-allocate your MATLAB arrays when they will be grown dynamically. In fact, if you look at the mlint flags in the editor, MATLAB warns you when it sees this happening.

A better solution, if you know the final size of A, is to pre-allocate A to its final size. Then just index in.

tic
A = zeros(n,3);
for i = 1:n
  A(i,:) = rand(1,3);
end
toc

Elapsed time is 0.156826 seconds.

While this is much better than the dynamically grown array, it is still FAR worse than a vectorized use of rand. So wherever possible, use the vectorized form of functions like this.

The problem is, sometimes you simply do not know how many elements you will end up with. There are still several tricks one can use to avoid the nasty quadratic growth.

One trick is to make a guess at the final size of A. Now, use indexing to insert new values into A, but watch carefully for when the new entries will spill over the bounds of A. When this is just about to happen, DOUBLE the size of A, appending one big block of zeros to the end. Now return to indexing new elements into A. Keep a separate count of how many elements have been "appended". At the very end of this process, delete the unused elements. This avoids much of the nasty quadratic behavior, since only a few append steps will ever be done. (Remember, you are doubling the size of A when you must do an append.)

A second trick is to use pointers. While MATLAB does not really offer much capability in the way of pointers, a cell array is a step in that direction.

tic
C = {};
for i = 1:n
  C{end+1} = rand(1,3);
end
A = cat(1,C{:});
toc

Elapsed time is 3.042742 seconds.

This took less time to accomplish than the grown array. Why? We were only building an array of pointers to the cells. A nice thing about this is if each append step has a variable number of rows, it still works nicely.

A problem with the cell array, is it is not terribly efficient when there are MILLIONS of elements to append. It is still a quadratic operation after all, because we are growing the array of pointers at each step.

A solution to that problem is to use an amalgam of the two styles shown above. Thus, define each cell of the cell array to be moderately large in size. Now use indexing to stuff new rows of A into the cell. When the current cell must be grown larger by the next append step, just add a new cell to the cell array.

Some years ago, this discussion arose on the MATLAB newsgroup, and several solutions along these lines were proposed. I posted the solutions growdata & growdata2 as files on the MATLAB Central File Exchange. Growdata2 used function handles to solve the problem:

tic
Ahandle = growdata2;
for i = 1:n
  Ahandle(rand(1,3))
end
% unpack the object into a normal array
A = Ahandle();
toc

Elapsed time is 1.572798 seconds.

At the time, it was a somewhat faster approach to use persistent variables.

tic
growdata
for i = 1:n
  growdata(rand(1,3))
end
A = growdata;
toc

Elapsed time is 2.048584 seconds.

Since then the implementation of function handles has clearly improved in MATLAB, so the function handle is now faster.

A virtue of these schemes is they will not have a quadratic performance penalty, while allowing millions of append steps.

Oh well, this is surely more information than was originally requested when the question was asked. Perhaps someone will get something out of it though.

14
votes

Use myPointMatrix = []; to initialize the matrix.

The bigger myPointMatrix is, the slower appending will be. It gets slower and slower, since for each time you append a point matlab allocates a new matrix of the new size and copies the information from your old matrix + your new point into the new matrix.

It is then better to initialize MyPointMatrix with its final size, and inserting the points into given positions in the matrix.

3
votes

Your best option is to pre-alocate the matrix and use a loop variable. This should be significantly faster.

limit = 9;
myPointMatrix = nan((limit+1)^3,3);

loopVar = 1;
for x=0:limit
    for y=0:limit
        for z=0:limit
            myPointMatrix(loopVar,:) = [x y z];
            loopVar = loopVar + 1;
        end
    end
end
0
votes

I believe the solution you are looking for is to initialize the myPointMatrix to a matrix with 0 lines and 3 columns, i.e.

myPointMatrix = zeros(0, 3);

Then the first assignment

myPointMatrix = [myPointMatrix; tempPoint];

will work correctly, as well as the subsequent ones. An equivalent way to write the assignment is

myPointMatrix(end+1,:) = tempPoint;

However keep in mind than growing a matrix like that is not efficient and, as AnnaR says, initializing myPointMatrix with ifs final size, if known, is a better solution.

0
votes

This is what you need

myPointMatrix=[];
for x=0:limit
for y=0:limit
for x=0:limit
  myPointMatrix(:,end+1)=[x y z];
end
end
end

but only in the case that you do some nonlinear operation with [ x y z ], before you assign it. If not then you can write the above lines as follows:

myPointMatrix=[];
myPointMatrix(1,:)=kron([1:limit],ones(1,limit^2));
myPointMatrix(2,:)=kron([1:limit^2],ones(1,limit));
myPointMatrix(3,:)=kron(ones(1,limit^2),[1:limit]);

The above is fully vectorised, although one might want to edit kron.m and replace some find with logical ... but you can do that yourself I suppose... :D

0
votes
%appending to matlab array "f":

lfg=[697 770 852 941];
hfg=[1209 1336 1477];
f=[];
for i=1:4,
    for j=1:3,
        %f = [ f [lfg(i);hfg(j)] ];
        append( f , [lfg(i);hfg(j)] );
    end
end
f