I have a 3-dimensional array called 'simulatedReturnsEVT3'. In that array, I would like to replace all values that are higher than 'MaxAcceptableVal' or lower than 'MinAcceptableVal'. Values beyond either of these two thresholds should be replaced by a random number drawn from the 3-dimensional array 'data2'. To draw that random number, I use the MATLAB function 'datasample'.

I have written the below code, which replaces the values that are beyond either of the thresholds with a random number sampled from 'data2'. However, it seems (when plotting the data in a histogram) that the replacement happens with the same value along dimension 'j'. This is not what I want to do. For every threshold exceedance, I want a new random number to be drawn for replacement from 'data2'.

nIndices = 19
nTrials  = 10000

% data2                has dimensions 782 x 19 x 10000
% simulatedReturnsEVT3 has dimensions 312 x 19 x 10000
% MaxAcceptableVal     has dimensions   1 x 19
% MinAcceptableVal     has dimensions   1 x 19

% Cut off Outliers
for i=1:nIndices
    for j=1:nTrials
        sliceEVT = simulatedReturnsEVT3(:,i,j);
        sliceEVT(sliceEVT < MinAcceptableVal(i))=datasample (data2(:,i,j), 1,1,'Replace',false);
        sliceEVT(sliceEVT > MaxAcceptableVal(i))=datasample (data2(:,i,j), 1,1,'Replace',false);
        simulatedReturnsEVT3(:,i,j) = sliceEVT;
    end
end

The same problem can be illustrated on a smaller scale by creating the following matrices.

% Set Maximum Acceptable Levels for Positive and Negative Returns
MaxAcceptableVal = [0.5   0.3]
MinAcceptableVal = [-0.5 -0.3]

simulatedReturnsEVT3 = [0.6 0.3; 0.3 0.3; 0.3 0.3; 0.3 0.4]
simulatedReturnsEVT3 = repmat(simulatedReturnsEVT3,[1 1 2])
data2                = [0.25 0.15; 0.25 0.15; 0.2 0.1]    
data2                = repmat(data2,[1 1 2])               

% Cut off Outliers
for i=1:2 
    for j=1:2 
        sliceEVT = simulatedReturnsEVT3(:,i,j);
        sliceEVT(sliceEVT < MinAcceptableVal(i))=datasample (data2(:,i,j), 1,1,'Replace',false);
        sliceEVT(sliceEVT > MaxAcceptableVal(i))=datasample (data2(:,i,j), 1,1,'Replace',false);
        simulatedReturnsEVT3(:,i,j) = sliceEVT;
    end
end

Can anybody help?

1 Answer

If I've understood the problem, it seems it is related to the usage of datasample.

In your code you use:

datasample (data2(:,i,j), 1,1,'Replace',false);

In this call, the first "1" is the number of samples to draw, so only a single value is returned; the second "1" is the dimension along which to sample.

If more than one value has to be replaced in the simulatedReturnsEVT3 array, all of them will be replaced by that same single number, because MATLAB expands a scalar on the right-hand side of an indexed assignment to every selected element.
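The effect can be reproduced in a minimal sketch. The names x, pool and replaced and the numbers used here are purely illustrative and do not come from the question:

```matlab
% Illustrative values only; x, pool and replaced are hypothetical names.
x    = [0.6; 0.3; 0.7; 0.9];
pool = [0.10; 0.15; 0.20; 0.25];

mask = x > 0.5;                                      % three exceedances
x(mask) = datasample(pool, 1, 1, 'Replace', false);  % one single draw

% The scalar on the right-hand side is expanded to all selected
% elements, so the three replaced entries hold the same number.
replaced = x(mask);
```

Whatever value is drawn, all three replaced entries end up identical, which matches the spike seen in the histogram.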

Again, if I've understood the problem, you should call datasample with the number "n" of values needed to replace the out-of-bounds values in simulatedReturnsEVT3:

datasample (data2(:,i,j), n,1,'Replace',false)

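To test this solution I've modified the definition of MaxAcceptableVal so that more values in simulatedReturnsEVT3 are out of bounds. I've also added a fourth row to data2, because sampling without replacement ('Replace',false) requires n to be no larger than the number of values available to draw from: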

MaxAcceptableVal = [0.5   0.2]

These are the values of simulatedReturnsEVT3 before the replacement:

val(:,:,1) =

    0.6000    0.3000
    0.3000    0.3000
    0.3000    0.3000
    0.3000    0.4000

val(:,:,2) =

    0.6000    0.3000
    0.3000    0.3000
    0.3000    0.3000
    0.3000    0.4000

These are the values after the replacement (one possible outcome, since the draws are random):

val(:,:,1) =

    0.2500    0.1500
    0.3000    0.1000
    0.3000    0.1500
    0.3000    0.1000

val(:,:,2) =

    0.2000    0.1000
    0.3000    0.1500
    0.3000    0.1500
    0.3000    0.1000

This is the updated code:

% Set Maximum Acceptable Levels for Positive and Negative Returns
% MaxAcceptableVal = [0.5   0.3]
MaxAcceptableVal = [0.5   0.2]
MinAcceptableVal = [-0.5 -0.3]

simulatedReturnsEVT3 = [0.6 0.3; 0.3 0.3; 0.3 0.3; 0.3 0.4]
simulatedReturnsEVT3 = repmat(simulatedReturnsEVT3,[1 1 2])
data2                = [0.2 0.1; 0.25 0.15; 0.25 0.15; 0.2 0.1]    
data2                = repmat(data2,[1 1 2])               

% Cut off Outliers
for i=1:2
    for j=1:2
        sliceEVT = simulatedReturnsEVT3(:,i,j);

        % Identify the indices of the values below the lower bound
        idx = find(sliceEVT < MinAcceptableVal(i));
        % Count how many values have to be replaced
        n = numel(idx);
        % Draw n values from "data2" and assign them
        sliceEVT(idx) = datasample(data2(:,i,j), n, 1, 'Replace', false);

        % Identify the indices of the values above the upper bound
        idx = find(sliceEVT > MaxAcceptableVal(i));
        % Count how many values have to be replaced
        n = numel(idx);
        % Draw n values from "data2" and assign them
        sliceEVT(idx) = datasample(data2(:,i,j), n, 1, 'Replace', false);

        simulatedReturnsEVT3(:,i,j) = sliceEVT;
    end
end
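
As a possible simplification, both thresholds could be handled with a single logical mask, so that datasample is called only once per slice. The following is a sketch on the same toy data as above, not a tested drop-in for the full-size arrays; the names mask, n, col1 and col2 are introduced here for illustration, and the guard on n is only there to skip slices without exceedances:

```matlab
% Toy data as in the example above (data2 has 4 rows so that up to
% 4 values can be drawn without replacement).
MaxAcceptableVal = [0.5 0.2];
MinAcceptableVal = [-0.5 -0.3];
simulatedReturnsEVT3 = repmat([0.6 0.3; 0.3 0.3; 0.3 0.3; 0.3 0.4], [1 1 2]);
data2                = repmat([0.2 0.1; 0.25 0.15; 0.25 0.15; 0.2 0.1], [1 1 2]);

for i = 1:size(simulatedReturnsEVT3, 2)
    for j = 1:size(simulatedReturnsEVT3, 3)
        sliceEVT = simulatedReturnsEVT3(:, i, j);
        % One mask covering both kinds of exceedance
        mask = sliceEVT < MinAcceptableVal(i) | sliceEVT > MaxAcceptableVal(i);
        n = nnz(mask);
        if n > 0
            % One draw per exceedance, taken from the matching slice of data2
            sliceEVT(mask) = datasample(data2(:, i, j), n, 1, 'Replace', false);
        end
        simulatedReturnsEVT3(:, i, j) = sliceEVT;
    end
end

% Columns of the result, flattened for inspection
col1 = simulatedReturnsEVT3(:, 1, :);
col2 = simulatedReturnsEVT3(:, 2, :);
```

With 'Replace',false each row of data2 is used at most once per slice. If repeated draws of the same row are acceptable, 'Replace',true removes the requirement that n stay below the number of rows in data2.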

Hope this helps.