
I’m working in R and exploring the use of caret for variable selection and weighting using several methods. Here I’m comparing forward stepwise regression and least angle regression (LARS), tuning each over a grid of parameters. In the code below, I’ve arbitrarily chosen a dependent variable (y) and a subset of predictors (x’s) and have run them through the training algorithms on a 70% subset of the data, using repeated 10-fold cross-validation. What I’m struggling with is finding a command that returns the final model parameters (e.g., intercept, beta weights) derived from the train function. I’m not readily seeing them when I call object$finalModel. Is there a way to recover these in R for the methods listed (forward stepwise regression and LARS)? I feel like this would have to exist....

Thanks!

library(caret)
library(AppliedPredictiveModeling)
data(abalone)
str(abalone)

set.seed(18)
# draw a random 70% of row indices for the training set
inTrain <- sample(1:nrow(abalone), size = round(nrow(abalone)*.7), replace = FALSE)

train_df <- abalone[inTrain,]
test_df <- abalone[-inTrain,]

# predicting Diameter using several of the dataset's variables
train_df_x <- train_df[, 4:8]
test_df_x <- test_df[, 4:8]
y_train <- train_df[, 3]
y_test <- test_df[, 3]

set.seed(18)
fold.ids <- createMultiFolds(y_train, k = 10, times = 3)
fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 3,
                           returnResamp = "final",
                           index = fold.ids,
                           summaryFunction = defaultSummary,
                           selectionFunction = "oneSE")

### Forward regression ###
library(leaps)
forwardLmGrid <- expand.grid(.nvmax = seq(2, 5))
set.seed(18)
F_OLS_fit <- train(train_df_x, y_train, method = "leapForward",
                   trControl = fitControl, metric = "RMSE",
                   tuneGrid = forwardLmGrid)

### LARS ###
library(lars)
larGrid <- expand.grid(.fraction = seq(.01, .99, length = 50))
Lar_fit <- train(train_df_x, y_train, method = "lars",
                 trControl = fitControl, metric = "RMSE",
                 tuneGrid = larGrid)

1 Answer


I'll show you how I do it with an example:

library(data.table)
n <- 1000
x1 <- runif(n,min=-10,max=10)
x2 <- runif(n,min=-10,max=10)
x3 <- runif(n,min=-10,max=10)
x4 <- runif(n,min=-10,max=10)
x5 <- runif(n,min=-10,max=10)
y1 <- 30 + x1 + 4*x2 + x3
synthetic <- data.table(x1=x1,x2=x2,x3=x3,x4=x4,x5=x5,y=y1)
library(caret)
library(lars)
ctrl <- trainControl(method = "cv", savePredictions = TRUE, number = 3)
# tuning grid for the L1 fraction, from 0 to 1
fractionGrid <- expand.grid(fraction = seq(0, 1, 1/(ncol(synthetic) - 1)))
cvresult <- train(y~.,
                  data=synthetic,
                  method = "lars",
                  trControl = ctrl,
                  metric="RMSE",
                  tuneGrid=fractionGrid,
                  use.Gram=FALSE)
# evaluate the coefficient path exactly at the winning fraction
coeffs <- predict(cvresult$finalModel, type = "coefficients",
                  s = cvresult$bestTune$fraction, mode = "fraction")
winnermodelcoeffs <- coeffs$coefficients
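
Note that predict() on a lars fit only returns the slope coefficients. lars centers y and the predictors internally, so the intercept isn't part of the path; if you need it, it can be backed out from the stored means. A minimal sketch, assuming the final model carries lars's usual mu (mean of y) and meanx (column means of x) components:

# back out the intercept: lars fits on centered data, so
# intercept = mean(y) - sum(colMeans(x) * beta)
beta <- coeffs$coefficients
intercept <- cvresult$finalModel$mu - sum(cvresult$finalModel$meanx * beta)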
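
For the forward stepwise fit in your question, leaps gives regsubsets objects a coef() method that takes the subset size as id, so the coefficients of the selected model can be pulled straight from the train object (a sketch, using the F_OLS_fit object from your question):

# coef() on a regsubsets object returns the intercept and betas
# for the subset of the requested size; bestTune holds caret's pick
coef(F_OLS_fit$finalModel, id = F_OLS_fit$bestTune$nvmax)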