0
votes

I have a dataframe with a value column and the corresponding year. I want to create an additional column that should contain the ratio of value for years at an interval of 5 years going backward. EX. If the year is 2000, the 'newval' column should have the ratio for value for the year 2000 and 1995. My data frame looks like. There might be a missing year and no data in both the value and year column.

df2 = data.frame(code = c("AFG", "AGO", "ALB", "AND", "ARB", "ARE", "ARG", "ARM", "ASM", "ATG", "AUS", "AUT","AUT"),
            val = c(123, 42, 23, 5, 42, 4, 23, 25, 42, 23, NA, 5563,56), 
            Year = c(1990, 1991, 1992, 1993, 1991, 1995, 1996, 1997, 1991, 1992, 2000, 2001,2002))

The final dataset should look like this

 df2 = data.frame(code = c("AFG", "AGO", "ALB", "AND", "ARB", "ARE", "ARG", "ARM", "ASM", "ATG", "AUS", "AUT","AUT"),
             val= c(123, 42, 23, 5, 42, 4, 23, 25, 42, 23, NA, 5563,56),
             Year = c(1990, 1991, 1992, 1993, 1991, 1995, 1996, 1997, 1991, 1992, 2000, 2001,2002), newval=c(NA,NA,NA,NA,NA,0.032520325,0.547619048,1.086956522,NA,NA,NA,241.8695652,2.24))
2
@ShantanuSharma look at the tags. First tag is rSotos
This si a question for R . The question has R tagArihant
The year 1991 appears twice. The year 1996 once. How do you decide which ratio to take?Cettt
Each year has a unique value even though they might appear multiple time. ex the value for 1991 is 42 for both the instances.Arihant
isn't this a (small) adaptation on the solutions provided on your previous question?? stackoverflow.com/questions/55944339/…Wimpel

2 Answers

5
votes

In base R, we can use match

df2$new_val <- with(df2, val/val[match(Year - 5, Year)])

df2
#   code  val Year  new_val
#1   AFG  123 1990       NA
#2   AGO   42 1991       NA
#3   ALB   23 1992       NA
#4   AND    5 1993       NA
#5   ARB   42 1991       NA
#6   ARE    4 1995   0.0325
#7   ARG   23 1996   0.5476
#8   ARM   25 1997   1.0870
#9   ASM   42 1991       NA
#10  ATG   23 1992       NA
#11  AUS   NA 2000       NA
#12  AUT 5563 2001 241.8696
#13  AUT   56 2002   2.2400
1
votes

One possibility using the dplyr package would be the following:

df2 %>% mutate(Year = Year + 5) %>% select(-code) %>% distinct() %>% 
  left_join(df2, ., by = "Year", suffix = c("", "_old")) %>% 
  mutate(newval = val / val_old) %>% 
  select(-val_old)

   code  val Year       newval
1   AFG  123 1990           NA
2   AGO   42 1991           NA
3   ALB   23 1992           NA
4   AND    5 1993           NA
5   ARB   42 1991           NA
6   ARE    4 1995   0.03252033
7   ARG   23 1996   0.54761905
8   ARM   25 1997   1.08695652
9   ASM   42 1991           NA
10  ATG   23 1992           NA
11  AUS   NA 2000           NA
12  AUT 5563 2001 241.86956522
13  AUT   56 2002   2.24000000