I am trying to compute the overall growth rate of each NUTS2 region (column NUTS_CODE) for the years 2000-2006 (REF_YEAR).
My dataset looks like this:
  NUTS_CODE NUTS_LEVEL SCENARIO_ID REF_YEAR IND_VALUE NUTS_C
  <chr>          <dbl>       <dbl>    <dbl>     <dbl> <chr>
1 BE10               2           1     2000     49434 BE
2 BE21               2           1     2000     29019 BE
3 BE22               2           1     2000     20646 BE
4 BE23               2           1     2000     21155 BE
5 BE24               2           1     2000     24963 BE
6 BE25               2           1     2000     22912 BE
So I am trying to compute something like
(BE10(which(REF_YEAR == 2006)) - BE10(which(REF_YEAR == 2000))) / BE10(which(REF_YEAR == 2000))
(This is not my actual code; I just want to explain what I want to achieve.) This needs to be done for each and every NUTS_CODE.
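For a single region, the intended calculation can be sketched in base R roughly as follows. The vectors `ref_year` and `ind_value` are hypothetical stand-ins for one region's rows, and the 2006 value is made up purely for illustration.

```r
# Hypothetical toy values for one region (e.g. BE10).
# The 2000 value is taken from the sample above; the 2006 value is invented.
ref_year  <- c(2000, 2006)
ind_value <- c(49434, 55000)

# Pick the start and end values by year, then take the relative change.
start  <- ind_value[ref_year == 2000]
end    <- ind_value[ref_year == 2006]
growth <- (end - start) / start

growth
```

Multiplying `growth` by 100 would give the rate in percent.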
I have tried to achieve this with a for loop combined with dplyr's filter, but somehow it does not work.
library(dplyr)

data$growth <- NA
for (i in 1:nrow(data))
{
  if ((data %>% filter(NUTS_CODE == data$NUTS_CODE[i] &
                       SCENARIO_ID == data$SCENARIO_ID[i] &
                       REF_YEAR == (data$REF_YEAR[i] - 1)
       ) %>% nrow()
      ) == 0
  )
  {
    data$growth[i] <- 0
  } else {
    data$growth[i] <- (((data$IND_VALUE[i] -
      (data %>% filter(NUTS_CODE == data$NUTS_CODE[i] &
                       SCENARIO_ID == data$SCENARIO_ID[i] &
                       REF_YEAR == (data$REF_YEAR[i] == 2006)
       )
      )[, "IND_VALUE"]
    ) /
      (
        (data %>% filter(NUTS_CODE == data$NUTS_CODE[i] &
                         SCENARIO_ID == data$SCENARIO_ID[i] &
                         REF_YEAR == (data$REF_YEAR[i] == 2000)
         )
        )[, "IND_VALUE"]
      )
    )
    * 100)
  }
  print(paste("", i, sep = " "))
}
I do not get an error or a warning, but in data$growth I get a full column of numeric(0) instead of actual values.
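One thing that may be worth checking is the condition `REF_YEAR == (data$REF_YEAR[i] == 2006)`: the inner comparison produces a logical value, not a year, so the outer comparison may never match any row. The following is only a minimal sketch on hypothetical toy vectors (`ref_year`, `value` and `current` are made-up names), not a confirmed diagnosis of the loop above.

```r
# Hypothetical toy vectors standing in for REF_YEAR and IND_VALUE.
ref_year <- c(2000, 2001, 2006)
value    <- c(10, 11, 12)
current  <- 2006

# The inner comparison yields a logical (TRUE here), not a year.
inner <- (current == 2006)

# Comparing years with TRUE coerces TRUE to 1, so no year matches.
matches <- ref_year == inner
sum(matches)

# Selecting with an all-FALSE mask gives an empty vector,
# and arithmetic on an empty vector stays empty.
result <- 100 - value[matches]
length(result)
```

If that is what happens in the loop, comparing `REF_YEAR` directly against `2006` and `2000` would be the natural thing to try.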
Help is appreciated!
[...] data$ in the pipe, the data set is already known since its beginning. I also believe that the code can be made (much) simpler: there is no need for a for loop, and to get all growth rates at the same time, group_by/mutate seems to be more natural than filter. – Rui Barradas
[...] IND_VALUE for each NUTS_CODE? – Lisardo Erman
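The grouped approach suggested in the comments could look roughly like the sketch below. It uses `summarise` rather than `mutate`, since only one growth rate per region is wanted; that is a choice made here, not something the comment specifies. The toy data frame and its 2006 values are made up for illustration, and the sketch assumes every NUTS_CODE/SCENARIO_ID pair has exactly one row for 2000 and one for 2006.

```r
library(dplyr)

# Hypothetical toy data in the same shape as the question's data set.
# The 2006 values are invented for illustration only.
data <- tibble(
  NUTS_CODE   = c("BE10", "BE21", "BE10", "BE21"),
  SCENARIO_ID = c(1, 1, 1, 1),
  REF_YEAR    = c(2000, 2000, 2006, 2006),
  IND_VALUE   = c(49434, 29019, 55000, 32000)
)

# One row per region and scenario: percentage change from 2000 to 2006.
# Assumes exactly one 2000 row and one 2006 row in each group.
growth <- data %>%
  filter(REF_YEAR %in% c(2000, 2006)) %>%
  group_by(NUTS_CODE, SCENARIO_ID) %>%
  summarise(
    growth = (IND_VALUE[REF_YEAR == 2006] - IND_VALUE[REF_YEAR == 2000]) /
      IND_VALUE[REF_YEAR == 2000] * 100,
    .groups = "drop"
  )

growth
```

Regions that lack one of the two years would need separate handling, for example by filtering them out beforehand.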