Apply one number for each value within range

Question

I have a column of values, each of which falls within a range of years (see below). I have computed the average for each range using aggregate(), but when I try to apply that average to each value, I get an error. For example, for every value within the 1900-1910 range, I want the average for that range to appear in that row, under my "Avg" column.

What I can get:

Range      Avg
1900-1910  15.33
1911-1920   6.67
....
1941-1950  22.00

Want:

Value Year   Range       Avg
12    1906   1900-1910   15.33
15    1909   1900-1910   15.33
7     1911   1911-1920    6.67
22    1950   1941-1950   22.00
4     1917   1911-1920    6.67
9     1917   1911-1920    6.67
19    1902   1900-1910   15.33

I am able to get the averages for each range, but I cannot figure out how to apply the Avg for the range to each specific value. The only thing I can think of is a bunch of nested ifelse() statements, but that seems too tedious. For example:

d$Avg <- ifelse(d$Range == "1900-1910", 15.33,
         ifelse(d$Range == "1911-1920", 6.67,
         ...etc))

Is there a way that I can speed this process up instead of using a bunch of nested ifelse statements?

chappers · Accepted Answer · 2015-12-01T04:24:13

The solution is to think of the aggregated data as a lookup table and then use merge to get the desired data set.

So if the aggregated data is lookupdf, then we can use merge like this:

final_df <- merge(d, lookupdf, by=c("Range"))
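As a minimal sketch of that idea on the seven rows from the question, assuming the data frame is called d and that the lookup table is built with aggregate() as the question describes (the column rename to Avg is my own step, not something the question shows):

```r
# The seven rows from the question
d <- data.frame(
  Value = c(12, 15, 7, 22, 4, 9, 19),
  Year  = c(1906, 1909, 1911, 1950, 1917, 1917, 1902),
  Range = c("1900-1910", "1900-1910", "1911-1920", "1941-1950",
            "1911-1920", "1911-1920", "1900-1910")
)

# Lookup table: one mean per Range (aggregate names the result column "Value")
lookupdf <- aggregate(Value ~ Range, data = d, FUN = mean)
names(lookupdf)[names(lookupdf) == "Value"] <- "Avg"

# Attach the matching average to every row of d
final_df <- merge(d, lookupdf, by = "Range")
```

Note that merge() sorts the result by the key column by default, so the rows of final_df come back grouped by Range rather than in their original order, and Range becomes the first column.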

Sample code to demonstrate this:

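# sample data: the years 1901-1920 repeated 20 times, with random values
d <- data.frame(Year = rep(1900 + 1:20, 20),
                Value = runif(400, 1, 20))

# label each row with the range its year falls in
d$Range <- ifelse(d$Year <= 1910, "1900-1910", "1911-1920")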

library(dplyr)
# generate the aggregation; should be same as what you have above.
lookupdf <- d %>% group_by(Range) %>% summarise(Avg=mean(Value))

# base R version
final_df <- merge(d, lookupdf, by=c("Range"))

Output:

> head(final_df[final_df$Year %in% c(1910, 1911),])
   Year     Value     Range      Avg
10 1910 18.643543 1900-1910 11.17740
11 1911  1.142544 1911-1920 10.18118
30 1910 11.187802 1900-1910 11.17740
31 1911  9.887889 1911-1920 10.18118
50 1910  5.316916 1900-1910 11.17740
51 1911 15.365103 1911-1920 10.18118
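
If the lookup table is not needed for anything else, base R's ave() may be a simpler alternative to the merge. This is a different technique from the one above, sketched here on the seven rows from the question. It computes the group mean and writes it back to each row in one step, keeping the original row order:

```r
# The seven rows from the question
d <- data.frame(
  Value = c(12, 15, 7, 22, 4, 9, 19),
  Year  = c(1906, 1909, 1911, 1950, 1917, 1917, 1902),
  Range = c("1900-1910", "1900-1910", "1911-1920", "1941-1950",
            "1911-1920", "1911-1920", "1900-1910")
)

# ave() returns a vector as long as Value, holding each row's group mean
d$Avg <- ave(d$Value, d$Range, FUN = mean)
```

The merge approach is still the better fit when the averages come from a separate table or when several summary columns need to be attached at once.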
