In tidy R, how do I parallelize a grouped summarize (or mutate) function call?
A transform of the iris dataset illustrates my problem.
I created a simple function that takes two numeric vectors as arguments and returns a list containing a one-row, two-column tibble.
library(tidyverse)

geoMaxMean <- function(pLen, pWid) {
  list(
    tibble(maxLen = max(pLen),
           geoMean = sqrt(max(pLen) * max(pWid))))
}
Applying this to iris
gIris <- iris %>%
  as_tibble() %>%
  group_by(Species) %>%
  summarise(Cols2 = geoMaxMean(Petal.Length, Petal.Width)) %>%
  unnest(Cols2)
Gives the intended result:

Species    maxLen geoMean
setosa     1.9    1.067708
versicolor 5.1    3.029851
virginica  6.9    4.153312
How do I parallelize the geoMaxMean call? I've tried to rework the call with lapply or foreach, but I haven't been able to figure it out.
I'm running R 3.4.4 on RStudio Pro.
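
For reference, the shape of lapply-style rework being asked about might look roughly like the sketch below. It is an assumption-laden illustration, not a confirmed solution: it assumes the base `parallel` package and its `mclapply()`, which forks and so only runs on more than one core on Unix-like systems. The names `irisBySpecies`, `nCores`, `perGroup` and `gIrisPar` are hypothetical, and for a dataset this small the forking overhead will likely outweigh any gain.

```r
library(tidyverse)
library(parallel)  # ships with R; provides mclapply()

geoMaxMean <- function(pLen, pWid) {
  list(
    tibble(maxLen = max(pLen),
           geoMean = sqrt(max(pLen) * max(pWid))))
}

# One data frame per group, named by species
irisBySpecies <- split(iris, iris$Species)

# Hypothetical core count; mclapply() cannot use more than 1 core on Windows
nCores <- if (.Platform$OS.type == "windows") 1L else 2L

# Apply the function to each group in a separate worker;
# [[1]] unwraps the tibble from the list that geoMaxMean() returns
perGroup <- mclapply(
  irisBySpecies,
  function(d) geoMaxMean(d$Petal.Length, d$Petal.Width)[[1]],
  mc.cores = nCores
)

# Recombine, restoring the group label from the list names
gIrisPar <- bind_rows(perGroup, .id = "Species")
```

If this works as intended, `gIrisPar` should hold the same three rows as `gIris` above, with `Species` as a character column rather than a factor.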