I want to build a neural network classifier using the caret package. I have specified a tuning grid with some hyperparameters that I want to test to find the best accuracy.

After I run the model, the train function always defaults to the standard decay and size values. Is this a bug in caret, or is there an issue with my code?

Code:

nnet_grid <- expand.grid(.decay = c(0.5, 0.1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7), .size = c(3, 5, 10, 20))

features.nn <- train(label ~ .,
                      method     = "nnet",
                      trControl  = trControl,
                      data       = features,
                      tunegrid = nnet_grid,
                      verbose = FALSE)

Output:

No pre-processing
Resampling: Cross-Validated (5 fold) 
Summary of sample sizes: 1680, 1680, 1680, 1680, 1680 
Resampling results across tuning parameters:

  size  decay  Accuracy    Kappa 
  1     0e+00  0.10904762  0.0645
  1     1e-04  0.10142857  0.0565
  1     1e-01  0.14380952  0.1010
  3     0e+00  0.09571429  0.0505
  3     1e-04  0.05523810  0.0080
  3     1e-01  0.19190476  0.1515
  5     0e+00  0.13000000  0.0865
  5     1e-04  0.14761905  0.1050
  5     1e-01  0.31809524  0.2840

Accuracy was used to select the optimal model using the largest value.
The final values used for the model were size = 5 and decay = 0.1.

1 Answer

You provided the wrong argument name: it should be tuneGrid = instead of tunegrid =. Because tunegrid is not an argument of train(), caret passes it on to nnet and selects its own default grid.
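
As a quick sanity check, caret's modelLookup() (a standard caret function) tells you which tuning parameters a method accepts, and therefore what your tuneGrid columns must be named:

caret::modelLookup("nnet")   # lists size and decay as the tunable parameters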

With the default grid you see above, caret chooses the model with the highest accuracy; from the results you posted, that is size = 5 and decay = 0.1, with an accuracy of 0.318.
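
You can also confirm after the fact which grid was actually searched, since the fitted train object stores it (these are standard fields on a caret train object):

features.nn$results    # one row per (size, decay) combination actually tested
features.nn$bestTune   # the combination selected by the accuracy rule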

To run it with the grid you defined, here is an example:

library(caret)

data <- MASS::Pima.tr

nnet_grid <- expand.grid(
  decay = c(0.5, 1e-2, 1e-3),
  size  = c(3, 5, 10, 20)
)

set.seed(123)
nn <- train(type ~ .,
            method    = "nnet",
            trControl = trainControl(method = "cv", number = 10),
            data      = data,
            tuneGrid  = nnet_grid,
            verbose   = FALSE)

Here you can see that a different parameter combination was chosen, but if you look at the accuracy results, the differences across the grid are quite small:

Neural Network 

200 samples
  7 predictor
  2 classes: 'No', 'Yes' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 179, 180, 180, 181, 180, 180, ... 
Resampling results across tuning parameters:

  decay  size  Accuracy   Kappa    
  0.001   3    0.7211153  0.3138427
  0.001   5    0.6253008  0.1391728
  0.001  10    0.6948747  0.2848068
  0.001  20    0.6546366  0.2369800
  0.010   3    0.7103509  0.3215962
  0.010   5    0.6861153  0.2861830
  0.010  10    0.6596115  0.2438720
  0.010  20    0.6448496  0.1722412
  0.500   3    0.6403258  0.1484703
  0.500   5    0.6603258  0.1854491
  0.500  10    0.6603509  0.1896705
  0.500  20    0.6400877  0.1642272

Accuracy was used to select the optimal model using the largest value.
The final values used for the model were size = 3 and decay = 0.001.

I'm not sure whether you scaled your data, but neural networks usually need it; you can let caret do the centering and scaling via preProcess:

nn <- train(type ~ .,
            method     = "nnet",
            trControl  = trainControl(method = "cv", number = 10),
            data       = data,
            tuneGrid   = nnet_grid,
            preProcess = c("center", "scale"),
            verbose    = FALSE)

Neural Network 

200 samples
  7 predictor
  2 classes: 'No', 'Yes' 

Pre-processing: centered (7), scaled (7) 
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 180, 180, 180, 179, 180, 180, ... 
Resampling results across tuning parameters:

  decay  size  Accuracy   Kappa    
  0.001   3    0.7158772  0.3699193
  0.001   5    0.6653759  0.2586270
  0.001  10    0.6458772  0.2193141
  0.001  20    0.6606140  0.2648904
  0.010   3    0.6945865  0.3465460
  0.010   5    0.6706140  0.2479049
  0.010  10    0.6651128  0.2433722
  0.010  20    0.6858521  0.2918013
  0.500   3    0.7403759  0.4060926
  0.500   5    0.7453759  0.4154149
  0.500  10    0.7553759  0.4345907
  0.500  20    0.7553759  0.4275870
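
If you would rather compare the tuning profiles visually than read the printed tables, caret provides plot and ggplot methods for train objects:

plot(nn)     # lattice plot of accuracy against decay, one line per size
ggplot(nn)   # ggplot2 version of the same tuning profile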