1
votes

I have the following code:

colBIN = {0.050, 0.055, 0.060, 0.065, 0.070, 0.075, 0.080, 0.085, 0.090, 0.095,0.1};

for i = 1 : length(colBIN)-1
    colBIN{i,2} = find(cols(:,1) <= cell2mat(colBIN(i+1,1)) & cols(:,1) > cell2mat(colBIN(i,1)));
end

rowBIN = {0.045, 0.046, 0.047, 0.048, 0.049, 0.050, 0.051, 0.052};

for i = 1 : length(rowBIN)-1
    rowBIN{i,2} = find(rows(:,1) <= cell2mat(rowBIN(i+1,1)) & rows(:,1) > cell2mat(rowBIN(i,1))); 
end

binCombos = cell(length(rowBIN)-1,length(colBIN)-1);

for m = 1 : length(rowBIN)-1
    for n = 1 : length(colBIN)-1
        binCombos{n,m} = intersect( rowBIN{m,2}(:,1),colBIN{n,2}(:,1));
    end
end


binRows = size(binCombos,1);
binCols = size(binCombos,2)-1;

j = j + 1;
for n = 1 : binRows; 
    for m = 1 : binCols;
       thisBin = binCombos{n,m}(:,:); 
       if isempty(thisBin)==0

       %polyfit
       quadmod = polyfit(x_vrbl(thisBin), y_vrbl(thisBin), 2);
       interval = 0.0:0.001:1;
       quadmodcurve = polyval(quadmod,interval); 
       [r2 rmse] = rsquare(y_vrbl(thisBin), quadmodcurve); 
       plot(x_vrbl(thisBin), y_vrbl(thisBin), '*', interval, quadmodcurve);
       xlabel('x_vrbl');
       ylabel('y_vrbl');
       axis([0,1,0,1]);
       header = ['R^2 =' num2str(r2),'coeffs:',num2str(quadmod)];
       title(header);
       saveas(gcf, sprintf('plot_%d.pdf', j));

       %residuals
       res = y_vrbl(thisBin) - quadmodcurve;
       plot(x_vrbl(thisBin),res,'+');
       header2 = ['residuals'];
       title(header2);
       saveas(gcf, sprintf('residuals_%d.pdf', j));

       end
       j = j + 1;
   end
end

Explanation/Problem:

binCombos is a 2 dimensional cell array and each cell has a non-uniform number of data points. I am fitting a quadratic curve to the data for each unique cell, and trying (unsuccessfully) to output the R^2 value and also plot residuals.

I think the problem is related to the fact that the 'interval' required for the polyval function does not match the array size for y_vrbl(thisBin) when trying to find rsquare, and likewise for calculating the residuals. For example, if I set interval = x_vrbl(thisBin) then the residuals "work" but the polyfit is all messed up.

2
As far as I can tell without being able to run this code, it should be correct the way you do it originally. What exactly is the problem? Do you get an error message?A. Donda
On second look: What does rsquare do? Can you post your implementation?A. Donda
This is the routine I used for r^2: mathworks.com/matlabcentral/fileexchange/…acmyers

2 Answers

0
votes

My guess is that this should work:

quadmodcurve = polyval(quadmod,y_vrbl(thisBin)); 
[r2 rmse] = rsquare(y_vrbl(thisBin), quadmodcurve);
interval = 0.0:0.001:1;
quadmodcurve = polyval(quadmod,interval); 

In order to determine the quality of fit, you have to evaluate the polynomial only at the x-values of the sample. In order to plot the full polynomial graph, you need to evaluate it at more and regularly spaced x-values.

0
votes

I managed to get your code running with the data at http://dropproxy.com/f/4B6 and with the rsquare function from the file exchange after correcting some errors:

d = importdata('sample_data.xlsx');
y_vrbl = d.data(:, 1);
x_vrbl = d.data(:, 2);
rows = d.data(:, 3);
cols = d.data(:, 4);

cb = {0.050, 0.055, 0.060, 0.065, 0.070, 0.075, 0.080, 0.085, 0.090, 0.095,0.1};

for i = 1 : length(cb)-1
    colBIN{i,2} = find(cols(:,1) <= cell2mat(cb(i+1)) & cols(:,1) > cell2mat(cb(i)));
end

rb = {0.045, 0.046, 0.047, 0.048, 0.049, 0.050, 0.051, 0.052};

for i = 1 : length(rb)-1
    rowBIN{i,2} = find(rows(:,1) <= cell2mat(rb(i+1)) & rows(:,1) > cell2mat(rb(i)));
end

binCombos = cell(length(rowBIN)-1,length(colBIN)-1);

for m = 1 : length(rowBIN)-1
    for n = 1 : length(colBIN)-1
        binCombos{n,m} = intersect( rowBIN{m,2}(:,1),colBIN{n,2}(:,1));
    end
end


binRows = size(binCombos,1);
binCols = size(binCombos,2)-1;

j = 1;
for n = 1 : binRows;
    for m = 1 : binCols;
        thisBin = binCombos{n,m}(:,:);
        if ~isempty(thisBin)

            % polyfit
            quadmod = polyfit(x_vrbl(thisBin), y_vrbl(thisBin), 2);

            % compute residuals and R²
            quadmodcurve = polyval(quadmod,y_vrbl(thisBin));
            [r2, rmse] = rsquare(y_vrbl(thisBin), quadmodcurve);
            res = y_vrbl(thisBin) - quadmodcurve;

            % plot fit
            interval = 0.0:0.001:1;
            quadmodcurve = polyval(quadmod,interval);
            plot(x_vrbl(thisBin), y_vrbl(thisBin), '*', interval, quadmodcurve);
            xlabel('x_vrbl');
            ylabel('y_vrbl');
            axis([0,1,0,1]);
            header = ['R^2 =' num2str(r2),'coeffs:',num2str(quadmod)];
            title(header);
            saveas(gcf, sprintf('plot_%d.pdf', j));

            % plot residuals
            plot(x_vrbl(thisBin),res,'+');
            header2 = ['residuals'];
            title(header2);
            saveas(gcf, sprintf('residuals_%d.pdf', j));

        end
        j = j + 1;
    end
end

The fits look good to me, except that in most cases a linear function might be enough and the quadratic term is not necessary.

Regarding your remaining question: I'm not an expert on using R² for nonlinear fits (see note 2 on Coefficient of determination), but the implementation you are using seems a bit fishy to me. The reason you get an output of 0 most of the time is the max function on line 65 of rsquare.m which prevents negative values from being returned. Since your polynomial fit does contain a constant term, calling the function as

[r2, rmse] = rsquare(y_vrbl(thisBin), quadmodcurve, false);

seems to be more appropriate, and results in R² > 0.9 in most cases.

My recommendation: Check whether R² is the correct measure of goodness of fit in your case, and check whether that function implements it correctly. Functions coming with Matlab can be trusted out of the box, but there is no quality guarantee for posts on the Matlab File Exchange.