
I'm really trying to understand why two pieces of code don't produce identical models. To create the first neural network (NN1), I used cross-validation in the train function of the caret package (code below) to find the best parameters. Page 2 of the package's vignette says that train will "Fit the final model to all the training data using the optimal parameter set".

So in the code below I expect NN1's final model to be fit on the full training set with the best parameters found by tuning (size=1 and decay=0 for the example data below).

My plan was to use the parameters from this step to create a model to put into production using the combined training and test data (a sketch of that step appears after the NN2 code below). Before creating this production model, I wanted to make sure I was using the output of the train function properly.

So I created a second model (NN2) with the train function but without tuning. Instead I supplied the parameters selected by NN1 directly via NN1$bestTune. With the same data, the same parameters, and the same seed, I expected identical models, but they are not. Why aren't these models identical?

# Create some data
library(caret)
set.seed(2)
xy<-data.frame(Response=factor(sample(c("Y","N"),534,replace = TRUE,prob=c(0.5,0.5))),
               GradeGroup=factor(sample(c("G1","G2","G3"),534,replace=TRUE,prob=c(0.4,0.3,0.3))),
               Sibling=sample(c(TRUE,FALSE),534,replace=TRUE,prob=c(0.3,0.7)),
               Dist=rnorm(534))

xyTrain <- xy[1:360,]
xyTest <- xy[361:534,]

# Create NN1 using cross-validation
tc <- trainControl(method="cv", number = 10, savePredictions = TRUE, classProbs = TRUE)
set.seed(2)
NN1 <- train(Response~.,data=xyTrain,
             method="nnet",
             trControl=tc,
             trace=FALSE,
             metric="Accuracy")

# Create NN2 using parameters from NN1
fitControl <- trainControl(method="none", classProbs = TRUE)
set.seed(2)
NN2 <- train(Response~.,data=xyTrain,
             method="nnet",
             trControl=fitControl,
             trace=FALSE,
             tuneGrid=data.frame(size=NN1$bestTune[[1]],decay=NN1$bestTune[[2]]),
             metric="Accuracy")

Here are the results comparing NN1 and NN2:

> # Parameters of NN1
> NN1$bestTune
  size decay
1    1     0
> 
> # Code to show results of NN1 and NN2 differ
> testFitted <- data.frame(fitNN1=NN1$finalModel$fitted.values,
+                          fitNN2=NN2$finalModel$fitted.values)
> 
> testPred<-data.frame(predNN1=predict(NN1,xyTest,type="prob")$Y,
+                      predNN2=predict(NN2,xyTest,type="prob")$Y)
> # Fitted values are different
> head(testFitted)
      fitNN1    fitNN2
X1 0.4824096 0.4834579
X2 0.4673498 0.4705441
X3 0.4509407 0.4498603
X4 0.4510129 0.4498710
X5 0.4690963 0.4753655
X6 0.4509160 0.4498539
> # Predictions on test set are different
> head(testPred)
    predNN1   predNN2
1 0.4763952 0.4784981
2 0.4509160 0.4498539
3 0.5281298 0.5276355
4 0.4512930 0.4498993
5 0.4741959 0.4804776
6 0.4509335 0.4498589
> 
> # Accuracy of predictions are different
> sum(predict(NN1,xyTest,type="raw")==xyTest$Response)/nrow(xyTest)
[1] 0.4655172
> sum(predict(NN2,xyTest,type="raw")==xyTest$Response)/nrow(xyTest)
[1] 0.4597701
> 
> # Summary of models
> summary(NN1)
a 4-1-1 network with 7 weights
options were - entropy fitting 
 b->h1 i1->h1 i2->h1 i3->h1 i4->h1 
 -8.38   6.58   5.51  -9.50   1.06 
 b->o h1->o 
-0.20  1.39 
> summary(NN2)
a 4-1-1 network with 7 weights
options were - entropy fitting 
 b->h1 i1->h1 i2->h1 i3->h1 i4->h1 
 10.94  -8.27  -7.36   8.50  -0.76 
 b->o h1->o 
 3.15 -3.35 

1 Answer


I believe this comes down to the random seed. During cross-validation you fit many models starting from that initial seed (set.seed(2)), so by the time caret refits the final model on the full training set, the RNG is in a different state than it is when you call set.seed(2) and fit that same model yourself. Because nnet generates its starting weights randomly on each call, the two fits begin from different random weights and therefore end up as different (though similar) models, even with identical data and tuning parameters.
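
A quick way to see this (a minimal sketch, not from the original code; fit_a, fit_b and junk are just illustrative names): call nnet directly twice with the same data and parameters, but advance the RNG before the second call, much as the cross-validation resampling does before the final refit.

library(nnet)

set.seed(2)
fit_a <- nnet(Response~., data=xyTrain, size=1, decay=0, trace=FALSE)

set.seed(2)
junk <- runif(100)   # advance the RNG state, as the resampling loop would
fit_b <- nnet(Response~., data=xyTrain, size=1, decay=0, trace=FALSE)

all.equal(fit_a$wts, fit_b$wts)   # weights differ, so the fits differ

If you need the final refit to be reproducible, trainControl also has a seeds argument; as far as I know, its last element sets the seed used for the final model fit.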