I really like tidymodels, but I'm unclear how to fit a tidymodels workflow within a nested group_by. As an example, the tidyr documentation shows a simple nest of mtcars by cylinder and then fits a separate linear regression model to each cylinder group. I'm trying to fit hundreds of separate models (likely random forests), one per group such as cylinder, but using the tidymodels workflow (data split, recipe, predict).
Here's what the tidyr page outlines as a simple nest/fit linear regression:
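library(tidyverse)
# nest the data so each cylinder group gets its own data frame
mtcars_nested <- mtcars %>%
  group_by(cyl) %>%
  nest()
# fit one linear model per cylinder group
mtcars_nested <- mtcars_nested %>%
  mutate(model = map(data, function(df) lm(mpg ~ wt, data = df)))
mtcars_nested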
Is there a way to do something like the code below, but once per group defined by a group_by or by a nested column? The predictions and/or accuracy metrics for each group would then need to be combined and stored in one data frame, if possible. I tried passing a nested data frame to the data split, and it didn't work. I suspect this is a purrr map question, but I'm unclear whether tidymodels already supports it:
library(tidymodels)
library(tidyverse)
# add dataset
mtcars <- mtcars
# create data splits
split <- initial_split(mtcars)
mtcars_train <- training(split)
mtcars_test <- testing(split)
# create recipe
mtcars_recipe <-
  recipe(mpg ~ ., data = mtcars_train) %>%
  step_normalize(all_predictors())
# define model
lm_mod <-
  linear_reg(mode = "regression") %>%
  set_engine("lm")
# create workflow that combines recipe & model
mtcars_workflow <-
  workflow() %>%
  add_model(lm_mod) %>%
  add_recipe(mtcars_recipe)
# fit workflow on train data
mtcars_fit <-
  fit(mtcars_workflow, data = mtcars_train)
# predict on test data
predictions <-
  predict(mtcars_fit, mtcars_test)
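To make the goal concrete, here is a rough sketch of what I imagine a purrr-based version might look like. It is only my guess at the shape of a solution, not something I know tidymodels recommends. The helper `fit_one_group()` and the names `results` and `group_rmse` are hypothetical, and I've assumed a reduced formula (`mpg ~ wt + hp`) because the per-cylinder groups in mtcars are too small for `mpg ~ .`. I kept `linear_reg()` for simplicity; presumably `rand_forest()` with a suitable engine could be swapped in.

```r
library(tidymodels)
library(tidyverse)

set.seed(123)  # make the random splits reproducible

# Hypothetical helper: runs split -> recipe -> workflow -> fit -> predict
# on the data for a single group and returns predictions next to the truth.
fit_one_group <- function(df) {
  split <- initial_split(df)
  train <- training(split)
  test <- testing(split)

  # Assumption: a small formula, since each cylinder group has few rows
  rec <- recipe(mpg ~ wt + hp, data = train) %>%
    step_normalize(all_predictors())

  mod <- linear_reg() %>%
    set_engine("lm")

  wf_fit <- workflow() %>%
    add_recipe(rec) %>%
    add_model(mod) %>%
    fit(data = train)

  # Keep the observed mpg beside .pred so metrics can be computed later
  predict(wf_fit, test) %>%
    bind_cols(test %>% select(mpg))
}

# Nest by cylinder, apply the helper to each group, and combine
# all predictions into one data frame keyed by cyl.
results <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(preds = map(data, fit_one_group)) %>%
  select(cyl, preds) %>%
  unnest(preds) %>%
  ungroup()

# One RMSE per cylinder group
group_rmse <- results %>%
  group_by(cyl) %>%
  rmse(truth = mpg, estimate = .pred)
```

With this shape, `results` would hold every group's test-set predictions in one data frame, and `group_rmse` would hold one accuracy row per cylinder, which is the combined output I'm after.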
I'd appreciate any help, advice, or direction.