I have a matrix in MATLAB of 50572x4 doubles. The last column has datenum format dates, increasing values from 7.3025e+05 to 7.3139e+05. The question is:

How can I split this matrix into sub-matrices, each covering an interval of 30 days?

If I'm not being clear enough: the difference between the first element in the 4th column and the last element in the 4th column is 7.3139e5 − 7.3025e5 = 1.1376e3, or 1137.6. I would like to partition this into 30-day segments and get a set of matrices whose 4th column each spans a range of 30. I'm not quite sure how to go about doing this. I'm quite new to MATLAB, but the dataset I'm working with has only this representation, necessitating such an action.

maybe if you could break your problem down into a simpler example, we could help you understand the principle, and then you can scale it up to solve your specific problem - Kyle Weller
The format is datenum, right? - Werner

2 Answers

0
votes

Note that a unit interval between datenum timestamps represents 1 day, so your data in fact covers a time period of 1137.6 days. The straightforward approach is to compare each timestamp with the interval edges in order to determine which 30-day interval it belongs to:

t = A(:, end) - min(A(:, end));             %// Normalize timestamps to start from 0
idx = sum(bsxfun(@lt, t, 30:30:max(t)), 1); %// Number of timestamps before each 30-day edge
rows = diff([0, idx, numel(t)]);            %// Number of rows in each interval

Here A is your data matrix, whose last column is assumed to contain the timestamps in increasing order, and rows stores the number of rows in each 30-day interval. Finally, you can employ cell arrays to split the original data matrix:

C = mat2cell(A, rows, size(A, 2));       %// Split matrix into intervals
C = C(~cellfun('isempty', C));           %// Remove empty matrices
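
As a small worked example, the sketch below applies the same idea to made-up data. The matrix A and its timestamps are assumptions for illustration only, and the row counts are computed with floor and accumarray rather than bsxfun, which should give the same partition when the timestamps are sorted:

```matlab
% Synthetic data (illustrative assumption): 3 value columns plus a sorted
% datenum column covering 100 days in 12.5-day steps.
t0 = datenum(2000, 1, 1);
stamps = t0 + (0:12.5:100)';          % 9 timestamps: 0, 12.5, ..., 100 days
A = [rand(numel(stamps), 3), stamps];

t = A(:, end) - min(A(:, end));       % days elapsed since the first sample
bin = floor(t / 30) + 1;              % 1 for [0,30), 2 for [30,60), ...
rows = accumarray(bin, 1)';           % number of rows in each 30-day bin

C = mat2cell(A, rows, size(A, 2));    % one cell per bin
C = C(~cellfun('isempty', C));        % drop bins with no samples

% Time span (in days) covered by the last column of each piece
spans = cellfun(@(M) M(end, end) - M(1, end), C);
disp(rows);
```

Here the nine rows fall into four bins, and every piece spans less than 30 days in its last column.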

Hope it helps!

0
votes

Well, all you need is to find the edge times and the matrix indexes in between them. If your numbers are in datenum format, one unit is the same as one day, which means we can step in 30-unit increments from the start until we get as close as we can to the end, as follows:

times = originalMatrix(:,4);
startTime = times(1);
endTime = times(end);

% Upper edges of the 30-day intervals (the first interval ends at startTime+30)
edgeTimes = startTime+30:30:endTime;

% And then loop through the edges, collecting the samples that complete each cycle:

nEdges = numel(edgeTimes);
totalMeasures = size(originalMatrix,1);

subMatrix = cell(1,nEdges);

prevEdgeIdx = 0;

for curEdgeIdx = 1:nEdges
  curEdge = edgeTimes(curEdgeIdx);
  nearIdx = getNearestIdx(times,curEdge);
  if isempty(nearIdx)
    error('For some reason the edge was not inbound.');
  end
  % getNearestIdx only estimates the position, so walk to the last sample <= curEdge
  while nearIdx>=1 && times(nearIdx)>curEdge
    nearIdx = nearIdx-1;
  end
  while nearIdx<totalMeasures && times(nearIdx+1)<=curEdge
    nearIdx = nearIdx+1;
  end
  subMatrix{curEdgeIdx} = originalMatrix(prevEdgeIdx+1:nearIdx,:);
  prevEdgeIdx = nearIdx;
end

% Now we check for the remaining days after the last edge, which do not complete a 30-day cycle:

if prevEdgeIdx<totalMeasures
  subMatrix{end+1} = originalMatrix(prevEdgeIdx+1:end,:);
end
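
If the timestamps are unevenly spaced, a simpler (if slower) way to locate each boundary is to search directly for the last sample at or before each edge. The following is only a minimal sketch of that variant, assuming sorted timestamps; the data and the name lastIdx are made up for illustration:

```matlab
% Hypothetical data: the 4th column holds sorted, unevenly spaced datenum values
times = datenum(2001, 1, 1) + [0 1 2 29 30 31 45 61 95]';
originalMatrix = [zeros(numel(times), 3), times];

% Upper edges of the 30-day intervals: 30, 60 and 90 days after the start
edgeTimes = times(1)+30:30:times(end);

% Index of the last sample at or before each edge (exact, no estimation)
lastIdx = zeros(1, numel(edgeTimes));
for k = 1:numel(edgeTimes)
  lastIdx(k) = find(times <= edgeTimes(k), 1, 'last');
end

% Convert boundary indices to row counts and split in one call
rows = diff([0, lastIdx, numel(times)]);
subMatrix = mat2cell(originalMatrix, rows, size(originalMatrix, 2));
disp(rows);
```

With this data the row counts come out as 5, 2, 1 and 1; the last piece holds the leftover samples that do not complete a 30-day cycle.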

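The function getNearestIdx was discussed here. It gives you the nearest point to the input value without checking all possible points: it estimates the index by linear interpolation between the first and last values, so the result is exact only for sorted, evenly spaced samples and may need adjusting otherwise.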

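function vIdx = getNearestIdx(values,point)
% Estimate the index of the element of VALUES nearest to POINT by linear
% interpolation between the first and last elements (assumes sorted,
% roughly evenly spaced values). Returns [] if the estimate is out of range.

if isempty(values)
  vIdx = [];
  return
end

if values(end) == values(1)  % constant or single-element input: avoid 0/0
  vIdx = 1;
  return
end

vIdx = 1+round((point-values(1))*(numel(values)-1)...
  /(values(end)-values(1)));
if vIdx < 1 || vIdx > numel(values)
  vIdx = [];
end
end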
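
To illustrate when the interpolated index can be trusted, the sketch below reproduces the same formula as an anonymous function (estimateIdx is a hypothetical name used only here) and compares it with an exhaustive search on made-up vectors:

```matlab
% Same interpolation formula as getNearestIdx, written inline for the demo
estimateIdx = @(values, point) 1 + round((point - values(1)) ...
  * (numel(values) - 1) / (values(end) - values(1)));

even   = (0:10:100)';            % evenly spaced samples
uneven = [0 1 2 3 4 50 100]';    % samples clustered near the start

iEven   = estimateIdx(even, 30);   % even(iEven) is exactly 30
iUneven = estimateIdx(uneven, 4);  % the estimate lands on the first sample

% Exhaustive search for the true nearest sample, for comparison
[~, exactIdx] = min(abs(uneven - 4));

fprintf('even: %d, uneven estimate: %d, uneven exact: %d\n', ...
  iEven, iUneven, exactIdx);
```

On the evenly spaced vector the estimate is exact, while on the clustered one it is several positions off, which is why the index should be checked against the actual timestamps before it is used as a boundary.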

Note: This is pseudocode and may contain errors. Please try to adapt it to your problem.