I have a dataframe with revenue numbers for different clients in different years. I want to calculate the total revenue for a client in a year. Then I want to calculate the mean of yearly revenue for the 3 highest grossing years.
I manage to do this with group_by and summarize. However in this way I "lose" all variables not used in grouping or calculation functions. I changed summarize to mutate but for one reason or another, this gives me different results.
Does anyone know why this happpens, or how to get same result in a different manner?
MASTERDATA%>%
group_by(Client,Year)%>%
summarise(Revenue=sum(Total, na.rm=TRUE))%>%
slice_max(n=3,order_by=Revenue,with_ties = FALSE )%>%
mutate(Revenue_top_3 = mean(Revenue))%>%
select(Client,Year,Revenue)%>%
ungroup()%>%
distinct()
Applying this to the mtcars dataset
mtcars%>%
group_by(gear,carb)%>%
mutate(Revenue=sum(mpg, na.rm=TRUE))%>%
slice_max(n=3,order_by=Revenue,with_ties = FALSE )%>%
mutate(Revenue_top_3 = mean(Revenue))%>%
select(gear,carb,Revenue)%>%
ungroup()%>%
distinct()