
My current problem: I used the caret package to generate classification models, and I wanted to validate them with a specific metric (ROC AUC). The AUC metric is available when training a model on the training set (internal validation) but NOT when predicting (external validation).

1. Internal validation :

Fit <- train(X, Y$levels, method = "svmRadial", trControl = fitControl, tuneLength = 20, metric = "ROC")

Results :

  sigma     C   ROC  Sens  Spec  ROCSD  SensSD  SpecSD
 0.0068  2.00  0.83  0.82  0.57  0.149   0.166   0.270
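(fitControl is not shown above; for metric = "ROC" to work in train(), it must compute class probabilities and use a ROC-based summary function, so it was presumably built roughly like this. This is a guess at its contents, not the actual code used.)

library(caret)

# Hypothetical reconstruction of fitControl: metric = "ROC" requires
# classProbs = TRUE and a summary function that computes ROC,
# such as twoClassSummary. The resampling settings are illustrative.
fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 5,
                           classProbs = TRUE,
                           summaryFunction = twoClassSummary)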

2. External Validation :

In order to obtain an external-validation AUC, I tried to predict on my training set and compute this metric directly with pROC.

predictions <- as.vector(predict(Fit$finalModel, newdata = X))
data <- data.frame(pred = as.numeric(predictions), obs = as.numeric(Y$levels))
pROC::roc(data$pred, data$obs)

Results : Area under the curve: 0.9057
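(Side note: with caret, the ROC on a data set is usually computed from predicted class probabilities rather than from hard class labels converted to numeric. A minimal sketch, assuming classProbs = TRUE in fitControl and taking the first level of Y$levels as the event class, as caret's twoClassSummary does:)

library(pROC)

# Predict class probabilities with the train object itself.
probs <- predict(Fit, newdata = X, type = "prob")

# pROC::roc(response, predictor): observed classes first, then the
# predicted probability of the event class.
event <- levels(Y$levels)[1]
roc_obj <- roc(Y$levels, probs[[event]])
auc(roc_obj)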

3. Conclusion :

Results: AUC(internal validation) != AUC(external validation), even though I used the same data (the training set) to compute my external-validation ROC. At best, I expected to obtain a maximum value of 0.83. In any case, it seems very odd to me that AUC(internal validation) < AUC(external validation).

I have no idea how to solve this enigma. All assistance is welcome.

I would suggest creating some tabular summaries of the data, getting a couple of plots that illustrate your problem, and then posting this question not here but on the Stack Exchange site called "Cross Validated". Those people specialize in this sort of problem. Although a lot of them hang out here too, you have not given us much to go on. - Mike Wise
How different exactly? The AUC on the training set will be higher than on the test set "with high probability", but that doesn't mean the reverse cannot happen. Now, if it's really different, then something funky may be going on. - mbiron
Difference = 0.08 (~10%). I tried pROC, ROCR and puSummary (the function presented here: github.com/benmack/oneClass/blob/master/R/puSummary.R). All give me the same result (AUC = 0.9057). - B.Gees
If you tried to predict your training set, then you should expect unrealistically high AUC values. The internal validation is probably the average of some n-fold cross-validation train/test iterations, with the AUC always computed on the portion of the data that the step did not train on, i.e. the test subset for that step. - Mike Wise
Yes, it looks to me like you are getting the expected results. Try holding 10-20 percent of your data out of the training and predict on that. I bet you get close to your "internal validation" result (not actually familiar with that term). - Mike Wise

1 Answer


So your results are to be expected. In general, the "internally validated" AUC is computed on test cases that were kept separate from the training cases, whereas in your "external validation" you are testing on the very cases you trained on (which is cheating, of course). So the internally validated AUC should be expected to be smaller than your "externally validated" AUC. I think the following diagram should make that clear:

[diagram illustrating cross-validation: each fold's AUC is computed on held-out data, not on the data used for training]
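A minimal sketch of the held-out evaluation suggested in the comments (X, Y$levels, fitControl and the tuning settings are taken from the question; the 80/20 split and the seed are illustrative):

library(caret)
library(pROC)

# Keep ~20% of the data out of training entirely.
# X is assumed to be a data frame or matrix of predictors.
set.seed(42)
inTrain <- createDataPartition(Y$levels, p = 0.8, list = FALSE)

fitHold <- train(X[inTrain, ], Y$levels[inTrain],
                 method = "svmRadial",
                 trControl = fitControl,
                 tuneLength = 20,
                 metric = "ROC")

# Evaluate on the held-out 20% only. This AUC should come out close to
# the cross-validated ("internal validation") figure, not the inflated
# value obtained by predicting the training set itself.
probsHold <- predict(fitHold, newdata = X[-inTrain, ], type = "prob")
roc(Y$levels[-inTrain], probsHold[[levels(Y$levels)[1]]])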