I'm trying to implement some functions to compare five different machine learning models for predicting values in a regression problem.
My intention is to build a suite of functions that trains the different models and collects them into a single set of results. The models I selected, for instance, are: Lasso, Random Forest, SVM, Linear Model and Neural Network. To tune the models I intend to follow Max Kuhn's reference: https://topepo.github.io/caret/available-models.html. However, since each model requires different tuning parameters, I'm not sure how to set them:
First I set up the grid for tuning the 'nnet' model. Here I selected different numbers of nodes in the hidden layer and different values of the decay coefficient:
my.grid <- expand.grid(size=seq(from = 1, to = 10, by = 1), decay = seq(from = 0.1, to = 0.5, by = 0.1))
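To check what this grid contains, here is a quick sketch using only base R. The `dim`, `names` and `head` calls are just for inspection and are not part of my pipeline:

```r
# Rebuild the same grid: 10 values of size crossed with 5 values of decay
my.grid <- expand.grid(size = seq(from = 1, to = 10, by = 1),
                       decay = seq(from = 0.1, to = 0.5, by = 0.1))

dim(my.grid)      # 50 rows (10 x 5 combinations) and 2 columns
names(my.grid)    # the column names are "size" and "decay"
head(my.grid, 3)  # first few candidate combinations
```

So the grid only has the columns `size` and `decay`, which are the two parameters I wanted to tune for 'nnet'.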
Then I build the function that trains each of the five models using 6-fold cross-validation repeated 5 times:
library(caret)

my_list_model <- function(model) {
  # 6-fold cross-validation, repeated 5 times
  set.seed(1)
  train.control <- trainControl(method = "repeatedcv",
                                number = 6,
                                repeats = 5,
                                returnResamp = "all",
                                savePredictions = "all")
  # The tuning configuration of the machine learning models:
  set.seed(1)
  fit_m <- train(ST1 ~ .,
                 data = train, # my original data frame, not shown in this code
                 method = model,
                 metric = "RMSE",
                 preProcess = "scale",
                 trControl = train.control,
                 linout = 1, # linear activation function for the output
                 trace = FALSE,
                 maxit = 1000,
                 tuneGrid = my.grid) # here is how I pass the 'nnet' tuning parameters
  return(fit_m)
}
Lastly, I execute the five models:
model_list <- lapply(list(
  Lass = "lasso",
  RF = "rf",
  SVM = "svmLinear",
  OLS = "lm",
  NN = "nnet"),
  my_list_model)
However, when I run this, it shows:
Error: The tuning parameter grid should not have columns fraction
From what I understand, I have not specified the tuning parameters correctly. If I drop the 'nnet' model and replace it with, for example, an XGBoost model in the penultimate line, it seems to work and the results are calculated. That is, the problem seems to be with the 'nnet' tuning parameters.
So my real question is: how do I configure these different parameters for each model, especially for the 'nnet' model? In addition, since I did not have to set the parameters for lasso, random forest, svmLinear and the linear model, how were they tuned by the caret package?
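For reference, caret's `modelLookup()` reports the tuning parameters that each method declares, which may help relate the `fraction` column named in the error to a specific model. A minimal sketch, assuming caret is installed (the names `method_names` and `params` are only for illustration):

```r
library(caret)

# The five method strings I pass to train()
method_names <- c(Lass = "lasso", RF = "rf", SVM = "svmLinear",
                  OLS = "lm", NN = "nnet")

# For each method, collect the names of the tuning parameters caret declares
params <- lapply(method_names, function(m) {
  as.character(modelLookup(m)$parameter)
})

# Named list: one character vector of parameter names per method
print(params)
```

If I read this correctly, a `tuneGrid` passed to `train()` has to use exactly the parameter names listed for that particular method, so a single grid with `size` and `decay` would probably not fit all five models.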