I really wish the documentation was more clear on the use of accumarray for N-D size matrices in Matlab. Regardless, I'm totally confused here and looking for some advice. Here's the challenge:
I have a 3D matrix of data.
- ROWS are individual rivers
- COLUMNS are dates of observations
- PAGES are the time intervals of data collection
For this example, let's assume each observation is volume of water flowing through the meter in a 5 minute interval. However, I now wish to resample the data to intervals of N minutes (obviously multiples of 5). Let's choose NMINS = 15.
So, what I want to do is find the sum or mean of the volume of water over NMINS minute intervals. That is, the ROWS and COLUMNS will not change, but the dimensions and values for the PAGES will be decimated/aggregated.
I can get the grouping values for the pages. That is, the values I need to group by. If it was a single river for a single day, no problem. But, I have hundreds of days and dozens of rivers.
I've gotten this far:
CREATE TEST TIME VECTOR
NMINS = 15; % Bucket by every 15 mins or 20, etc.
N5MINS = 5 * 12 * 24 * 2; % Keep small - Two days of 5 min datenums
dnums = datenum(2016,3,20,0,0:5:N5MINS,0);
% Trim dnums to start at random time for edge case and keep only mins
mins = rem(dnums(25:end-30),1)'; % Create column vector
CREATE RANDOM MATRIX FOR TESTING
rng(123); % Set seed for reproducibility
X = randi(100,12,9,length(mins)); % Test matrix
FIND TIMES IN TERMS OF MINUTES
[~,~,~,H,M] = datevec( mins );
H = 60 .* (H - H(1));
NOW FIND ALL TIMES THAT CORRESPOND TO OUR BUCKET
idxMIN = mod( M+H, NMINS )==0;
idxNewP = idxMIN; % This is used to align the new river matrix
[R,C,P] = size(X); % We'll drop P and use newP
newP = sum(idxNewP); % Number of PAGES in final matrix (new)
% Final newX will have dimensions [R,C,newP]
RESET THE GROUPING INDICES
% Must shift one all as minute intervals represent data UP to that point
% for the actual grouping of data. Test if first bucket is a
% match and set accordingly
if idxMIN(1)
idxMIN = [1; idxMIN(1:end-1)];
subs = cumsum(idxMIN);
else
idxMIN = [0; idxMIN(1:end-1)];
subs = cumsum(idxMIN) + 1 ;
end
ADDITION: The group size will not be consistent and I cannot make that assumption (sadly). Consider the following after running the above.
tsttbl = table();
tsttbl.dnumstr = datestr(mins(1:5));
tsttbl.Mins = M(1:5);
tsttbl.subs = subs(1:5);
tsttbl
Output shows first group is a size of 1:
tsttbl =
dnumstr Mins subs
________ ____ ____
2:00 AM 0 1
2:05 AM 5 2
2:10 AM 10 2
2:15 AM 15 2
2:20 AM 20 3
Last, ultimately, I'll need to pass custom functions. The above is a toy example to illustrate the problem quickly. My apologies for not being more clear.
END ADDITION
And this is where I stumble...
How do I set the subs values to apply along each page to use accumarray? I'm totally confused by the documentation. Or is this actually the wrong approach? For what it's worth I'm using Matlab 2015b.
Honestly, any help would be greatly appreciated.
ALTERNATE SOLUTION This hit me on the way home. Meshgrid is my friend...
Once the cells above have been run (note I changed the size of the matrix X), we create the indices for the entire matrix where the "indices" for the pages (i.e., times) are given by the values in "subs". To do this, I use meshgrid.
[days,rivers,pages] = meshgrid(1:C,1:R,subs);
grpvals = [rivers(:) days(:) pages(:)];
tst = accumarray(grpvals,X(:),[R C newP],@sum);
Probably not the most memory efficient as I have to create essentially the days, rivers, and pages matrices, then wind up creating a new grpvals array of those. But, it has the advantage that now I can use accumarray and pass anonymous functions, @std, etc.
Hope this helps others!
Huge thanks to Luis.