7
votes

I have the file data.txt with two columns and N rows, something like this:

0.009943796 0.4667975
0.009795735 0.46777886
0.009623984 0.46897832
0.009564759 0.46941447
0.009546991 0.4703958
0.009428543 0.47224948
0.009375241 0.47475737
0.009298249 0.4767201
[...]

Each pair of values in the file gives the coordinates (x,y) of one point. When plotted, these points form a curve. I would like to calculate the area under this curve (AUC).

So I load the data:

data = load("data.txt");
X = data(:,1);
Y = data(:,2);

So, X contains all the x coordinates of the points, and Y all the y coordinates.

How can I calculate the area under the curve (AUC)?

6
It depends. Is the trapezoidal rule good enough for you? - Oliver Charlesworth
@Robert: that looks like it's the area under the curve of a function (MATLAB has a whole bunch of quadxxxx() functions). OP is looking for numerical integration of data. - Jason S

6 Answers

4
votes

The easiest way is the trapezoidal rule function trapz.

If your data is known to be smooth, you could try Simpson's rule, but MATLAB has nothing built in for integrating numerical data that way. (And I'm not sure how to apply it to x/y data where x doesn't increase in equal steps.)

4
votes

Just add AUC = trapz(X,Y) to your program and you will get the area under the curve.
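
One thing to watch with the sample in the question: the x values there are decreasing, so trapz(X,Y) comes out negative. A minimal sketch, using only the eight rows shown in the question (the rest of the file is not known), and assuming you want the area as a positive number, is to sort by x first. The names rawArea, Xs and Ys are just for illustration.

```matlab
% First eight rows of data.txt as shown in the question (x is decreasing).
data = [0.009943796 0.4667975
        0.009795735 0.46777886
        0.009623984 0.46897832
        0.009564759 0.46941447
        0.009546991 0.4703958
        0.009428543 0.47224948
        0.009375241 0.47475737
        0.009298249 0.4767201];

X = data(:,1);
Y = data(:,2);

% Trapezoidal rule on the data as loaded: negative, because every
% step X(k+1) - X(k) is negative.
rawArea = trapz(X, Y);

% Sort the points by increasing x so the area has the usual sign.
[Xs, idx] = sort(X);
Ys = Y(idx);
AUC = trapz(Xs, Ys);
```

If the points are already known to be in strictly decreasing order, abs(trapz(X,Y)) gives the same number without sorting.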

1
votes

You can do something like this:

AUC = sum((Y(1:end-1)+Y(2:end))/2.*...
  (X(2:end)-X(1:end-1)));
1
votes

Source: Link

An example in MATLAB to help you get your answer ...

x=[3 10 15 20 25 30];
y=[27 14.5 9.4 6.7 5.3 4.5];
trapz(x,y)   % area is 299.75

If y contains negative values, you can clip them to zero first:

y=max(y,0)
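
To see what the clipping changes, here is a small sketch with made-up values (not from the question). Without clipping, trapz returns the signed area, so parts below the x-axis are subtracted. Note that max(y,0) only clips at the sample points, so where the curve crosses zero between two samples the result is an approximation.

```matlab
% Made-up sample points that dip below the x-axis.
x = [0 1 2 3 4];
y = [2 1 -1 -2 1];

% Signed area: regions below the x-axis count as negative.
signedArea = trapz(x, y);

% Clipped area: negative samples are replaced by 0 before integrating.
clippedArea = trapz(x, max(y, 0));
```

Here signedArea is -0.5 and clippedArea is 2.5.
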
1
votes

[~,~,~,AUC] = perfcurve(labels,scores,posclass);

% posclass might be 1

http://www.mathworks.com/matlabcentral/newsreader/view_thread/252131

0
votes

There are some alternatives to trapz for anyone willing to do some coding themselves. This link shows an implementation of Simpson's rule, with Python code included. There is also a File Exchange submission on Simpson's rule.
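
As a rough idea of what such code looks like, here is a minimal sketch of the composite Simpson's rule. It is not taken from the linked pages. It assumes the x values are evenly spaced and that the number of points is odd, which is not the case for the data in the question, and it uses sin(x) on [0, pi] as a stand-in test curve whose exact area is 2.

```matlab
% Composite Simpson's rule for samples on an evenly spaced grid.
% Assumption: constant spacing h and an odd number of points n.
x = linspace(0, pi, 101);   % stand-in data, not from the question
y = sin(x);

h = x(2) - x(1);
n = numel(x);
assert(mod(n, 2) == 1, 'Simpson''s rule needs an odd number of points');

% Weights: 1 at the ends, 4 at even indices, 2 at interior odd indices.
simpsonArea = h/3 * (y(1) + 4*sum(y(2:2:n-1)) + 2*sum(y(3:2:n-2)) + y(n));

% Trapezoidal rule on the same samples, for comparison.
trapArea = trapz(x, y);
```

On smooth data like this, simpsonArea is noticeably closer to the exact value than trapArea. For unevenly spaced or noisy data, trapz is the safer choice.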