I have a data frame df and would like to add a new column containing the minimum of a second column within each group. Prior posts do not cover doing this while preserving the original rows and columns of the data frame.
Suppose this sample input:
a <- c(1,1,1,1,2,2,2,2)
b <- c(NA,1,2,2,3,5,6,NA)
df <- data.frame(a,b)
df
a b
1 NA
1 1
1 2
1 2
2 3
2 5
2 6
2 NA
What I want to achieve is this output:
a b Min_b
1 NA 1
1 1 1
1 2 1
1 2 1
2 3 3
2 5 3
2 6 3
2 NA 3
Here are my attempts with corresponding output:
df %>% group_by(a) %>% mutate(Min_b = min(b, na.rm = TRUE))
a b Min_b
1 NA 1
1 1 1
1 2 1
1 2 1
2 3 1
2 5 1
2 6 1
2 NA 1
The above gives me the overall minimum of column b rather than the minimum of b within each group of column a, which is what I want.
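One possible explanation, which is an assumption and not something confirmed in this post, is that another package loaded after dplyr (plyr is a common one) masks mutate(), so the grouping is ignored. A minimal sketch that rules this out by calling the dplyr functions with an explicit namespace prefix:

```r
library(dplyr)

df <- data.frame(a = c(1, 1, 1, 1, 2, 2, 2, 2),
                 b = c(NA, 1, 2, 2, 3, 5, 6, NA))

# Namespace-qualified calls always resolve to dplyr, even if a package
# loaded later (plyr is the assumption here) masks mutate().
result <- df %>%
  dplyr::group_by(a) %>%
  dplyr::mutate(Min_b = min(b, na.rm = TRUE)) %>%
  dplyr::ungroup()

result$Min_b
```

If this returns 1 for group 1 and 3 for group 2, the original code was fine and the difference lies in which mutate() was being called.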
df %>% group_by(a) %>% top_n(-1, wt = b)
a b
1 1
2 3
The above works for finding the right values but does not seem to work within mutate, as follows:
df %>% group_by(a) %>% mutate(Min_of_b = top_n(-1, wt = b))
Error in is_scalar_integerish(n) : argument "n" is missing, with no default
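The error seems consistent with how top_n() is defined: its first argument is a data frame, so inside mutate() the -1 is matched to that argument and n is left unset. If the goal is to compute one value per group and then attach it to every row, a sketch along these lines may help; the names mins and joined are hypothetical and used only for illustration:

```r
library(dplyr)

df <- data.frame(a = c(1, 1, 1, 1, 2, 2, 2, 2),
                 b = c(NA, 1, 2, 2, 3, 5, 6, NA))

# One row per group, holding that group's minimum of b (NAs ignored).
mins <- df %>%
  dplyr::group_by(a) %>%
  dplyr::summarise(Min_of_b = min(b, na.rm = TRUE))

# Join the per-group minimum back so every original row is kept.
joined <- dplyr::left_join(df, mins, by = "a")
```

This keeps all eight rows of df and adds Min_of_b as a third column.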
Thank you for any suggestions on alternative ways to do this!
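As one alternative that does not depend on dplyr at all, base R's ave() can compute a per-group value and return it at the original length. This is a minimal sketch using the sample data above, not necessarily the best fit for larger data:

```r
df <- data.frame(a = c(1, 1, 1, 1, 2, 2, 2, 2),
                 b = c(NA, 1, 2, 2, 3, 5, 6, NA))

# ave() applies FUN to b within each level of a and returns a vector
# as long as b, so it can be assigned directly as a new column.
df$Min_b <- ave(df$b, df$a, FUN = function(x) min(x, na.rm = TRUE))

df
```

Note that a group consisting only of NA values would yield Inf with a warning, because min() with na.rm = TRUE then has nothing to compare.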
dput(head(df)). Additionally, it is not clear to me what your expected output should look like. – r2evans
df %>% group_by(id) %>% mutate(new_column = min(second_column)) instead. – AntoniosK
df %>% group_by(a) %>% mutate(Min_b = min(b, na.rm = TRUE)) works for me.... – A5C1D2H2I1M1N2O1R2T1