0
votes

I'm trying to add a dummy variable to a panel data set with time, id and many other variables.

library(zoo)
geo = c("AT","AT","AT","BE","BE","BE","DE","DE","DE")
time = c("1990Q1","1990Q2","1990Q3","1990Q1","1990Q2","1990Q3","1990Q1","1990Q2","1990Q3")
Data <- as.data.frame(cbind(geo, time))
Data$time = as.yearqtr(Data$time)

In reality the data set has 20 countries and 97 quarters. I won't get around addressing 'geo' element by element, but being able to write a condition such as (time > 2004Q1) would be great.

I want a dummy for Austria and Germany starting in 1990 Q2. So I would like to arrive at:

    geo time     dummy
1   AT  1990 Q1  0
2   AT  1990 Q2  1
3   AT  1990 Q3  1
4   BE  1990 Q1  0
5   BE  1990 Q2  0
6   BE  1990 Q3  0
7   DE  1990 Q1  0
8   DE  1990 Q2  1
9   DE  1990 Q3  1

I can't get anywhere close. I'm thinking in Stata logic (generate a variable if this is that and this is something else), but the closest I have come in R is to create separate country dummies, cbind each one with the time variable, subset them on the time variable, extract the single dummies, add them together, and finally cbind the result with my original data. That cannot be anywhere close to the best solution (and it doesn't completely work), because it takes around 40 lines of code. This should be quite easy to do, no?

Any help would be great!

p.s.: My attempts go along these lines:

AT <- as.numeric(Data$geo == "AT")
DE <- as.numeric(Data$geo == "DE")

AT <- as.data.frame(cbind(Data$time, AT))
DE <- as.data.frame(cbind(Data$time, DE))

but I think I'm heading in the wrong direction, and I can't get the time dimension right.

2 Answers

2
votes

Since you are using the as.yearqtr function from the zoo library, the "time" column can be compared with the standard comparison operators. So basically you just want all the rows where time > "1990Q1" and "geo" is either "AT" or "DE". You can do that with

Data$dummy <- (Data$time > as.yearqtr("1990Q1") & Data$geo %in% c("AT", "DE")) + 0

Here the +0 turns the logical TRUE/FALSE values into 1/0.
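As a minimal sketch of why the +0 works, the snippet below rebuilds the question's sample data and compares three ways of coercing the logical condition to numbers. It assumes the zoo package is installed; the names cond, d_plus, d_int and d_num are only illustrative, and the quarters are written in the "1990 Q1" form.

```r
library(zoo)  # provides as.yearqtr()

# Rebuild the sample data from the question
Data <- data.frame(
  geo  = rep(c("AT", "BE", "DE"), each = 3),
  time = as.yearqtr(rep(c("1990 Q1", "1990 Q2", "1990 Q3"), times = 3))
)

# Logical condition: AT or DE, strictly after 1990 Q1
cond <- Data$time > as.yearqtr("1990 Q1") & Data$geo %in% c("AT", "DE")

# Three equivalent ways to turn TRUE/FALSE into 1/0
d_plus <- cond + 0          # arithmetic coerces logical to numeric
d_int  <- as.integer(cond)  # explicit integer coercion
d_num  <- as.numeric(cond)  # explicit numeric coercion
```

All three give the same 0/1 pattern; as.integer() or as.numeric() may read more clearly than +0 to someone coming from Stata.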

1
votes

You can use standard comparisons with yearqtr objects, so try:

Data$time >= "1990 Q2"
# [1] FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE
Data$geo %in% c("AT", "DE") & Data$time >= "1990 Q2"
# [1] FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE

Data$dummy <- as.numeric(Data$geo %in% c("AT", "DE") & Data$time >= "1990 Q2")
Data
#   geo    time dummy
# 1  AT 1990 Q1     0
# 2  AT 1990 Q2     1
# 3  AT 1990 Q3     1
# 4  BE 1990 Q1     0
# 5  BE 1990 Q2     0
# 6  BE 1990 Q3     0
# 7  DE 1990 Q1     0
# 8  DE 1990 Q2     1
# 9  DE 1990 Q3     1
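If the treated countries do not all share the same start quarter, the same comparison can be driven by a lookup vector instead of one line per country. The following is only a sketch under assumptions: the start quarters in start are made up for illustration (AT from 1990 Q2, DE from 1990 Q3), and row_start is a hypothetical helper name. The quarters are compared as plain numbers (1990 Q2 is stored as 1990.25) so that countries missing from the lookup can simply be NA.

```r
library(zoo)  # provides as.yearqtr()

Data <- data.frame(
  geo  = rep(c("AT", "BE", "DE"), each = 3),
  time = as.yearqtr(rep(c("1990 Q1", "1990 Q2", "1990 Q3"), times = 3))
)

# Hypothetical start quarter per treated country, stored as numbers
start <- setNames(
  as.numeric(as.yearqtr(c("1990 Q2", "1990 Q3"))),
  c("AT", "DE")
)

# Look up each row's start quarter by country; untreated countries give NA
row_start <- unname(start[as.character(Data$geo)])

# 1 from the country's start quarter onward, 0 otherwise (including NA rows)
Data$dummy <- as.integer(!is.na(row_start) & as.numeric(Data$time) >= row_start)
```

With 20 countries, only the start vector needs to grow; the comparison line stays the same.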