I used the following code to split the dataframe below (df1) into multiple dataframes/tibbles, one per metric, so that I can work out the percentile rank of value within each metric.

df1:

name  group  metric    value
A     A      distance  10569
B     A      distance  12939
C     A      distance  11532
A     A      psv-99    29.30
B     A      psv-99    30.89
C     A      psv-99    28.90

library(dplyr)

split <- lapply(unique(df1$metric), function(x) {
  # keep only the group A rows for the current metric
  df1 %>% filter(group == "A" & metric == x)
})

This gives me a list of tibbles. I now want to add a column to each tibble holding the percentile rank of the value column, which I can do for a single tibble using the following code:

df2 <- split[[1]] %>% mutate(percentile = percent_rank(value))

I could do this for each metric and then combine the results with bind_rows(), but that seems very messy. Could anyone suggest a better way of doing this?

3 Answers

1
votes

No need to split the data here. You can use group_by to do the calculation for each metric separately.

library(dplyr)

df1 %>%
  filter(group == "A") %>%
  group_by(metric) %>%
  mutate(percentile = percent_rank(value))
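
To make the grouped approach concrete, here is a minimal sketch that rebuilds the sample data from the question and applies the same pipeline. The data frame construction and the name `result` are illustrative assumptions, not part of the original answer.

```r
library(dplyr)

# Sample data copied from the question
df1 <- data.frame(
  name   = rep(c("A", "B", "C"), times = 2),
  group  = "A",
  metric = rep(c("distance", "psv-99"), each = 3),
  value  = c(10569, 12939, 11532, 29.30, 30.89, 28.90)
)

result <- df1 %>%
  filter(group == "A") %>%
  group_by(metric) %>%                          # one group per metric
  mutate(percentile = percent_rank(value)) %>%  # rank within each metric
  ungroup()                                     # drop grouping for later steps

result$percentile
# distance rows: 0.0 1.0 0.5; psv-99 rows: 0.5 1.0 0.0
```

percent_rank() rescales the minimum rank to the range 0 to 1, so within each metric the smallest value gets 0 and the largest gets 1.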
1
votes

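We can use base R's subset() and ave() for the filtering and grouping (percent_rank() itself still comes from dplyr):

library(dplyr)  # only needed for percent_rank()

df_a <- subset(df1, group == "A")
df_a$percentile <- with(df_a, ave(value, metric, FUN = percent_rank))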
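
Note that percent_rank() is a dplyr function, so the snippet above is not purely base R. If a version with no package dependency is wanted, one possible sketch follows. The helper `pct_rank` is a hypothetical name introduced here, written on the assumption that percent_rank() is the minimum rank minus one, divided by the number of non-missing values minus one.

```r
# Hypothetical helper (not from the original answer): a base R stand-in
# for dplyr::percent_rank(), assuming it is (min rank - 1) / (n - 1)
pct_rank <- function(x) {
  (rank(x, ties.method = "min", na.last = "keep") - 1) / (sum(!is.na(x)) - 1)
}

# Sample data copied from the question
df1 <- data.frame(
  name   = rep(c("A", "B", "C"), times = 2),
  group  = "A",
  metric = rep(c("distance", "psv-99"), each = 3),
  value  = c(10569, 12939, 11532, 29.30, 30.89, 28.90)
)

df_a <- subset(df1, group == "A")

# ave() applies pct_rank within each metric and returns the results
# in the original row order
df_a$percentile <- ave(df_a$value, df_a$metric, FUN = pct_rank)
```

With only one non-missing value in a group the denominator is zero, so the helper returns NaN for that group.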
0
votes

library(dplyr)
library(purrr)  # map()
library(tidyr)  # unnest()

df1 %>%
  group_nest(group, metric) %>%
  mutate(percentile = map(data, ~ percent_rank(.x$value))) %>%
  unnest(cols = c(data, percentile))