I have a data frame that holds the lower and upper bounds of a few parameters for each category of fruit. It looks something like this:

+----------+-----------+-------+-------+
| Category | Parameter | Upper | Lower |
+----------+-----------+-------+-------+
| Apple    | alpha     | 10    | 20    |
+----------+-----------+-------+-------+
| Apple    | beta      | 20    | 30    |
+----------+-----------+-------+-------+
| Orange   | alpha     | 10    | 20    |
+----------+-----------+-------+-------+
| Orange   | beta      | 30    | 40    |
+----------+-----------+-------+-------+
| Orange   | gamma     | 50    | 60    |
+----------+-----------+-------+-------+
| Pear     | alpha     | 10    | 30    |
+----------+-----------+-------+-------+
| Pear     | beta      | 20    | 40    |
+----------+-----------+-------+-------+
| Pear     | gamma     | 20    | 30    |
+----------+-----------+-------+-------+
| Banana   | alpha     | 40    | 50    |
+----------+-----------+-------+-------+
| Banana   | beta      | 20    | 40    |
+----------+-----------+-------+-------+

I have written a function that takes this data frame, the fruit name, and the desired length of each sequence:

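library(dplyr)  # filter() and %>%
library(purrr)  # map2(), set_names(), cross_df()

param_grid <- function(df, fruit, length) {
  # Keep only the rows for the requested fruit
  df_fruit <- df %>%
    filter(Category == fruit)

  # Build one sequence per parameter, then take every combination of them
  map2(df_fruit$Upper, df_fruit$Lower, seq, length.out = length) %>%
    set_names(df_fruit$Parameter) %>%
    cross_df()
}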

Output

param_grid(df, "Apple", length=100)

# A tibble: 10,000 x 2
   alpha  beta
   <dbl> <dbl>
 1  10      20
 2  10.1    20
 3  10.2    20
 4  10.3    20
 5  10.4    20
 6  10.5    20
 7  10.6    20
 8  10.7    20
 9  10.8    20
10  10.9    20
# … with 9,990 more rows

Output

param_grid(df, "Orange", length=100)

# A tibble: 1,000,000 x 3
   alpha  beta gamma
   <dbl> <dbl> <dbl>
 1  10      30    50
 2  10.1    30    50
 3  10.2    30    50
 4  10.3    30    50
 5  10.4    30    50
 6  10.5    30    50
 7  10.6    30    50
 8  10.7    30    50
 9  10.8    30    50
10  10.9    30    50
# … with 999,990 more rows

Output

param_grid(df, "Pear", length=100)

# A tibble: 1,000,000 x 3
   alpha  beta gamma
   <dbl> <dbl> <dbl>
 1  10      20    20
 2  10.2    20    20
 3  10.4    20    20
 4  10.6    20    20
 5  10.8    20    20
 6  11.0    20    20
 7  11.2    20    20
 8  11.4    20    20
 9  11.6    20    20
10  11.8    20    20
# … with 999,990 more rows

Now I would like to write a for loop that applies this function to several fruits:

names <- c("Apple","Orange","Pear")

for (i in names) {
  results <- param_grid(df = df, fruit = i, length = 100)
  print(head(results))
}

This runs fine, but it only prints the 3 data frames one after another:

    alpha beta
1 20.00000   30
2 19.89899   30
3 19.79798   30
4 19.69697   30
5 19.59596   30
6 19.49495   30
     alpha beta gamma
1 20.00000   40    60
2 19.89899   40    60
3 19.79798   40    60
4 19.69697   40    60
5 19.59596   40    60
6 19.49495   40    60
     alpha beta gamma
1 30.00000   40    30
2 29.79798   40    30
3 29.59596   40    30
4 29.39394   40    30
5 29.19192   40    30
6 28.98990   40    30

Is there a way to edit this for loop so that I end up with 3 separate data frames for Apple, Orange, and Pear, respectively? Alternatively, they could be 3 data frames that are each callable / subsettable within one big nested object (e.g. DF[["Apple"]], DF[["Orange"]], ...).

Thanks so much for your help!

1 Answer

The for loop only prints each result, and results is overwritten on every iteration, so nothing is kept. Instead, we can store each data frame in a list:

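# Pre-allocate a list with one named slot per fruit
lst1 <- vector('list', length(names))
names(lst1) <- names
for (i in names) {
  # param_grid() takes the data frame as df, not data
  results <- param_grid(df = df, fruit = i, length = 100)
  lst1[[i]] <- results
}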

Then, check the structure of the list created

str(lst1)

We can extract the individual datasets with $ or [[

lst1[[1]]
lst1[[2]]
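
Because the list is named, the elements can also be pulled out by fruit name, which is closer to the DF[["Apple"]] form asked for in the question. A minimal sketch, where lst_demo is a hypothetical hand-built stand-in for lst1 (not the real output of param_grid):

```r
# Hypothetical stand-in for lst1: a named list of two small data frames.
lst_demo <- list(
  Apple  = data.frame(alpha = c(10, 20), beta = c(20, 30)),
  Orange = data.frame(alpha = c(10, 20), beta = c(30, 40), gamma = c(50, 60))
)

by_position <- lst_demo[[1]]        # first element, by position
by_name     <- lst_demo[["Apple"]]  # same element, looked up by name
by_dollar   <- lst_demo$Apple       # shorthand for the lookup by name

# [[ also accepts a variable holding the name; $ does not
fruit <- "Orange"
orange_cols <- ncol(lst_demo[[fruit]])
```

Lookup by name is usually safer than lookup by position, since it keeps working if the order of the fruits changes.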

If we want to create separate objects in the global environment, each named after an element of the 'names' vector:

list2env(lst1, .GlobalEnv)

However, it is usually better to keep the data frames in the list and work with them there.
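
One reason to keep everything in a list is that the same operation can then be applied to every fruit at once. The sketch below is only an illustration under some assumptions: df_demo is a cut-down copy of the table in the question, and param_grid_base is a hypothetical base-R stand-in for param_grid (it uses Map and expand.grid instead of purrr), so it returns a plain data frame rather than a tibble. It also builds the list with lapply instead of a for loop.

```r
# Cut-down copy of the question's table (two fruits only).
df_demo <- data.frame(
  Category  = c("Apple", "Apple", "Orange", "Orange", "Orange"),
  Parameter = c("alpha", "beta", "alpha", "beta", "gamma"),
  Upper     = c(10, 20, 10, 30, 50),
  Lower     = c(20, 30, 20, 40, 60)
)

# Hypothetical base-R stand-in for param_grid():
# one sequence per parameter, then all combinations of the sequences.
param_grid_base <- function(df, fruit, length) {
  df_fruit <- df[df$Category == fruit, ]
  seqs <- Map(seq, df_fruit$Upper, df_fruit$Lower, length.out = length)
  names(seqs) <- df_fruit$Parameter
  expand.grid(seqs)
}

# Build the named list in one step, without an explicit loop.
fruits <- c("Apple", "Orange")
grids <- lapply(fruits, function(f) param_grid_base(df_demo, fruit = f, length = 5))
names(grids) <- fruits

# Apply the same operation to every element of the list.
row_counts <- sapply(grids, nrow)  # 25 rows for Apple, 125 for Orange
```

With separate global objects created by list2env, the last step would have to be repeated by hand for each fruit.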