2
votes

Sorry if the title is a bit convoluted; I didn't know how else to explain this issue. I'm trying to use dplyr's mutate() to compute each row's share of its group total. The new variable appears numeric and is summarized correctly by summary(), but calling mean() or sd() on it returns NA with the following warning:

Warning message:
In mean.default(., group_pct) :
  argument is not numeric or logical: returning NA 

Here are some examples of what is going on.

library(dplyr)
data(mtcars)

mtcars %>% 
  group_by(cyl) %>% 
  mutate(group_pct = hp / sum(hp)) %>% 
  summary()


Note: group_pct is calculating correctly when called via summary()...

data(mtcars)

mtcars %>% 
  group_by(cyl) %>% 
  mutate(group_pct = hp / sum(hp)) %>% 
  mean(group_pct)


...but when I call for the mean here, it cannot complete the function. Even when I use ungroup() and/or na.rm = TRUE, the function still doesn't work. I don't understand what the issue is here.


EDIT: For clarification, I'm hoping to do something like this...

mtcars %>% 
  group_by(cyl) %>% 
  mutate(group_pct = hp / sum(hp)) %>% 
  paste0('Words: ', mean(group_pct))

Hoping for this final result:

Words: 0.09375

...which I don't think I can use summarize() for, hence my non-inclusion of it from the start. Apologies for any inconveniences.

4
From my personal experience, expected output is most helpful. Do you just want a character vector that is the length of the number of rows of mtcars with each element of the form "Words: #.###"? - zack
My bad, edited again for expected output. I'm hoping for one phrase and one numerical result which is the average of the rowwise group proportions. - medavis6

4 Answers

4
votes

Per OP's Clarification:

mtcars %>% 
  group_by(cyl) %>% 
  mutate(group_pct = hp / sum(hp)) %>%
  pull(group_pct) %>%
  mean() %>%
  paste0("Words: ", .)

[1] "Words: 0.09375"

3
votes

You want the base R function with().

mtcars %>% 
  group_by(cyl) %>% 
  mutate(group_pct = hp / sum(hp)) %>%
  with(paste0('Words: ', mean(group_pct)))

[1] "Words: 0.09375"

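The issue with your original attempt is that the pipe passes the whole data frame to mean() as its first argument, and a data frame is not numeric, hence the warning. group_pct is a column of that data frame, not an object defined in the global environment, so it cannot be looked up by name there.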

with() tells R to evaluate the paste0() call using the data frame passed by the pipe as the environment. That way it finds group_pct among the columns and returns your expected result.
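As a rough illustration of that lookup, the sketch below stores the mutated data in an intermediate object and calls with() on it directly. The names df and result are only chosen for this example and do not appear in the question.

```r
library(dplyr)

# Intermediate object holding the grouped, mutated data (name is illustrative).
df <- mtcars %>%
  group_by(cyl) %>%
  mutate(group_pct = hp / sum(hp))

# group_pct exists only as a column of df, so it has to be looked up there.
"group_pct" %in% names(df)

# with() evaluates the expression with df's columns available as variables.
result <- with(df, paste0("Words: ", mean(group_pct)))
result
```

Piping into with(), as in the answer, does the same thing because the pipe supplies the data frame as the first argument of with().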

1
votes

The problem is with how you're piping into the mean function. Look at your error message:

Warning message:
In mean.default(., group_pct)

You're trying to get the mean of the group_pct column of the piped data frame, but mean() is actually receiving the entire piped data frame as its first argument (the . represents the output of the pipe), plus a second argument, group_pct, which may or may not exist as an object.
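To make that concrete, here is a minimal sketch comparing the piped call with the call it is equivalent to. The object names df, piped, direct and col_mean are made up for the example, and suppressWarnings() is used only to keep the output quiet.

```r
library(dplyr)

# Intermediate object holding the grouped, mutated data (name is illustrative).
df <- mtcars %>%
  group_by(cyl) %>%
  mutate(group_pct = hp / sum(hp))

# The pipe inserts its left-hand side as the first argument, so these two
# calls are equivalent: both hand mean() the whole data frame.
piped  <- suppressWarnings(df %>% mean(group_pct))
direct <- suppressWarnings(mean(df, group_pct))

is.na(piped)
is.na(direct)

# Handing mean() the column itself gives it a numeric vector to average.
col_mean <- mean(df$group_pct)
col_mean
```

Both of the first two calls return NA because a data frame is neither numeric nor logical, which matches the warning in the question.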

Take a look at this answer: https://stackoverflow.com/a/38475455/8366499

If you want to subset the piped data.frame inside the call to mean, wrap the call in curly braces {} so the pipe treats it as an expression and does not insert the data frame as the first argument of mean. Then you can subset the . object as desired:

mtcars %>% 
    group_by(cyl) %>% 
    mutate(group_pct = hp / sum(hp)) %>% 
    {mean(.$group_pct)} %>%
    paste0('Words: ', .)

[1] "Words: 0.09375"

0
votes

library(tidyverse)
library(purrr)

mtcars %>% 
  mutate(group_pct = hp / sum(hp)) %>% 
  summarise_all(mean) %>%
  select(group_pct) %>%
  map(function(x) paste0(" Word ", x))

and the result is:

"Word 0.03125"