4
votes

I have a relatively large matrix (800'000 x 1'000) that contains NaNs at the end of some columns. I need to remove them and shift the remaining cells up: when a NaN is removed, the next value should move up, so that the values of the next column fill the cells right after the last non-NaN value of the previous column. I can't figure out how to do that. It is important that the number of rows stays the same as in the initial matrix (which I fix); the number of columns will obviously change.

Here is an example on a smaller matrix A1 4x5:

A1 = [
     1     5     8     9    11
     2     6   NaN    10    12
     3     7   NaN   NaN    13
     4   NaN   NaN   NaN  NaN  ]

I need A1 to become:

A2 = [
     1     5     9    13
     2     6    10    NaN
     3     7    11    NaN
     4     8    12    NaN   ]

In this example A1(1,3)=8 moved to A2(4,2)=8, A1(1,4)=9 moved to A2(1,3)=9, A1(2,4)=10 moved to A2(2,3)=10, and so on. The number of rows is still 4 but the number of columns becomes 4. The NaN cells in the last column are only there to avoid a 'matrix dimension mismatch' error; I do not need them, so afterwards (or at the same time) I should drop the last column of the matrix if it still contains NaNs. Finally, my matrix should become:

A3 = [
     1     5     9 
     2     6    10  
     3     7    11  
     4     8    12  ]

I tried A1(~isnan(A1)), but this puts the values into a single column vector, whereas I need a matrix with a predetermined number of rows, or at least a cell array that holds each column of A3 in its own cell.

Is there a way to go from A1 to A3?

3 Answers

2
votes

What you have to do is first filter out the NaNs, then reshape the remaining data. Try this:

reshape(A1(isfinite(A1)),4,[])

You might need to tweak this a bit, but I think it'll do what you want in a single step.

Note that reshape will raise an error when the number of remaining values is not a multiple of 4 (your example leaves 13 values), so you might need to trim the leftover values first, something like this:

A2 = A1(isfinite(A1));                           % keep finite values, in column-major order
A3 = reshape(A2(1:(4*floor(numel(A2)/4))),4,[]); % trim to a multiple of 4, then reshape
0
votes

Here's a very straightforward approach - there's almost certainly a more efficient way of doing it, given the amount of copying going on here.

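vals = A1(~isnan(A1));          % non-NaN values, in column-major order
A2 = NaN(size(A1));             % NaN-filled matrix of the original size
A2(1:length(vals)) = vals;      % fill from the top-left, column by column
A3 = A2(:,~any(isnan(A2),1));   % keep only the columns without any NaN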
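
As a quick check, here is a minimal sketch that runs this approach on the 4x5 example from the question. The matrix literal is copied from the question; the expected results in the comments were worked out by hand. Note that the intermediate A2 keeps the original 4x5 size here, so it has one extra all-NaN column compared with the A2 shown in the question.

```matlab
% Example matrix from the question
A1 = [1 5   8   9  11;
      2 6 NaN  10  12;
      3 7 NaN NaN  13;
      4 NaN NaN NaN NaN];

vals = A1(~isnan(A1));          % 13 values: 1, 2, ..., 13 (column-major order)
A2 = NaN(size(A1));             % 4x5 matrix of NaN
A2(1:numel(vals)) = vals;       % values packed column by column
A3 = A2(:, ~any(isnan(A2), 1)); % drop every column that still holds a NaN

% A2 is [1 5 9 13 NaN; 2 6 10 NaN NaN; 3 7 11 NaN NaN; 4 8 12 NaN NaN]
% A3 is [1 5 9; 2 6 10; 3 7 11; 4 8 12]
disp(A3)
```

The leftover value 13 is discarded together with its partially filled column, which matches the A3 requested in the question.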
0
votes
n = size(A1,1); %// number of rows
A1 = A1(~isnan(A1)); %// linearize to a column and remove NaN's
A1 = A1(1:floor(numel(A1)/n)*n); %// trim last values if needed
A1 = reshape(A1,n,[]); %// put into desired shape
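
The question also mentions a cell array with one column per cell as an acceptable result. A possible sketch, assuming the same steps as above followed by num2cell along dimension 1; the variable names v, A3 and C are only illustrative:

```matlab
% Example matrix from the question
A1 = [1 5   8   9  11;
      2 6 NaN  10  12;
      3 7 NaN NaN  13;
      4 NaN NaN NaN NaN];

n = size(A1, 1);                 % number of rows to preserve
v = A1(~isnan(A1));              % non-NaN values as a column vector
v = v(1:floor(numel(v)/n)*n);    % trim so the length is a multiple of n
A3 = reshape(v, n, []);          % n-by-k matrix without NaN

C = num2cell(A3, 1);             % 1-by-k cell array, one n-by-1 column per cell
```

With the example matrix, C has three cells and C{2} is the column [5; 6; 7; 8].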