I am trying to estimate a fixed effects model in R using the plm package. My data look like the example below: observations are at the firm, city, year, quarter level, and for each firm and city I observe sales and income by year-quarter. My regression is income ~ sales, that is, income regressed on sales, while controlling for firm- and city-specific unobservables. I have 1000+ firms in my actual dataset.

fid = c(1,1,1,1,
    2,2,2,2,
    3,3,3,3,3,3,3,3,
    4,4,4,4,5,5,5,5,
    5,5,5,5)

cityid = c(101,101,101,101,
       102,102,102,102,102,102,102,102,103,103,103,103,
       103,103,103,103,
       104,104,104,104,
       104,104,104,104)

year = c(2000, 2000, 2000, 2000,2000,2000, 2000,2000,2001,2001,2001,2001,2002,2002,2002,2002,
     2001,2001,2001,2001,2001,2001,2001,2001,2002,2002,2002,2002)

qtr = c(1,2,3,4,1,2,3,4,1,2,3,
    4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)

# draw one sales value per row (28 rows) rather than recycling 7 draws
df = data.frame(fid, cityid, year, qtr, sales = sample(1:4, 28, replace = TRUE), income = 30:57)

I see that plm takes a panel indexed by individual and time; that is, each individual is observed over several time periods. How can I use the plm package to estimate: 1.) firm fixed effects, 2.) firm and city fixed effects, 3.) firm, city, and quarter fixed effects?

Could you explain how these differ? I am a little confused about the time component, and I am wondering whether I can use firm and city fixed effects together. With firm and city fixed effects, my panel would have each firm-city pair repeated 4 times (once per quarter), while each city may have multiple firms.

For 3.), can I combine firm and city in the plm index but explicitly control for quarter in the formula (for example with factor(qtr))?

I just want a clearer understanding of how to extend plm to estimate fixed effects beyond the time dimension. I have already looked at the vignette, but it is not entirely clear to me, so any information would be great.

2 Answers

2
votes

I think you are a bit confused here. The time unit in your dataset is the year-quarter (let's call it q_year, coded for example as 2000_1, 2000_2, etc.). So you would want to generate such a variable and use it to index the time dimension.

You could then specify the model as follows:

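library(plm)
# build the year-quarter time index, e.g. "2000_1"
df$q_year <- paste(df$year, df$qtr, sep = "_")
model <- plm(income ~ sales + as.factor(q_year), data = df,
             index = c("fid", "q_year"), model = "within")
summary(model)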

This model gives you time fixed effects (year-quarter) as well as firm fixed effects. Note that wherever 'city' does not vary over time within a firm (true for every firm in your example data except firm 3), it is absorbed by the firm fixed effect: the city location is then a fixed firm characteristic.
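For intuition about what a within model with firm and time effects estimates, here is a minimal base-R sketch. It does not use plm and does not use the question's data: the balanced toy panel, the true slope of 2, and the names `panel`, `demean2`, `b_lsdv` and `b_within` are all assumptions made for illustration. It compares a dummy-variable regression (one dummy per firm and per period) with a regression on two-way demeaned data.

```r
set.seed(42)

# Hypothetical balanced panel: 5 firms observed over 8 year-quarters
panel   <- expand.grid(q_year = 1:8, fid = 1:5)
firm_fx <- c(10, 20, 30, 40, 50)[panel$fid]   # unobserved firm effect
time_fx <- 0.5 * panel$q_year                 # unobserved period effect

# sales is correlated with the firm effect, so pooled OLS would be biased
panel$sales  <- rnorm(nrow(panel)) + 0.1 * firm_fx
panel$income <- 2 * panel$sales + firm_fx + time_fx + rnorm(nrow(panel), sd = 0.1)

# Least-squares dummy-variable (LSDV) regression
lsdv   <- lm(income ~ sales + factor(fid) + factor(q_year), data = panel)
b_lsdv <- unname(coef(lsdv)["sales"])

# Two-way within transformation; this simple formula assumes a balanced panel
demean2 <- function(x, i, t) x - ave(x, i) - ave(x, t) + mean(x)
y_w <- demean2(panel$income, panel$fid, panel$q_year)
x_w <- demean2(panel$sales,  panel$fid, panel$q_year)
b_within <- unname(coef(lm(y_w ~ x_w - 1)))

# Both estimates of the sales slope agree up to floating-point error
print(c(lsdv = b_lsdv, within = b_within))
```

In an unbalanced panel such as the one in the question, the simple demeaning formula no longer applies, but the dummy-variable regression still describes the same model.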

(Note: do you have data for some firms ranging over multiple years? In your example data only firms 3 and 5 do. If every firm were observed in a single year only, you would want to condense the data to a four-wave design and just take the quarters as the time dimension, because such a data structure effectively holds year constant for every firm.)
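Both structural questions raised in this answer (does city vary within a firm, and does any firm span more than one year?) can be checked directly on the posted vectors. A minimal base-R sketch: the vectors are rebuilt compactly with rep() and are intended to match the ones in the question, and the names `cities_per_firm` and `years_per_firm` are my own.

```r
# Identifier columns from the question, written compactly with rep()
fid    <- rep(c(1, 2, 3, 4, 5), times = c(4, 4, 8, 4, 8))
cityid <- rep(c(101, 102, 103, 104), times = c(4, 8, 8, 8))
year   <- rep(c(2000, 2001, 2002, 2001, 2002), times = c(8, 4, 4, 8, 4))

# Number of distinct cities and distinct years observed for each firm
cities_per_firm <- tapply(cityid, fid, function(x) length(unique(x)))
years_per_firm  <- tapply(year,   fid, function(x) length(unique(x)))

print(cities_per_firm)
print(years_per_firm)
```

On the posted data this reports two cities for firm 3 and two years for firms 3 and 5; every other firm has one of each. The same two lines scale to the full dataset and show how much within-firm variation in city there is to identify city effects from.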

0
votes

I would suggest using felm (from the lfe package) as an alternative to plm. You specify all variables you want as fixed effects after a | in the formula.

library(lfe)
model <- felm(income ~ sales | cityid + fid + qtr, data = df)

Note that city fixed effects are not needed when each firm is located in a single city. The reason is that firm fixed effects already hold constant everything that does not vary over time within a firm, including its geographic location. Mathematically speaking, the fixed effects transformation subtracts the firm-level mean from the data, so every firm's transformed data have mean zero. If you then form the city-level mean over all firms in a city, it is also zero, so subtracting that mean from the data doesn't do anything.
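The demeaning argument can be verified numerically. The following base-R sketch uses a small hypothetical dataset (`toy`, with four firms nested in two cities; none of it comes from the question) and checks two things: the city means of firm-demeaned data are zero, and a city dummy added next to firm dummies is perfectly collinear, so lm() reports it as NA.

```r
set.seed(7)

# Hypothetical data: firms 1 and 2 sit in city 101, firms 3 and 4 in city 102
toy <- data.frame(
  fid    = rep(1:4, each = 4),
  cityid = rep(c(101, 101, 102, 102), each = 4),
  sales  = rnorm(16)
)

# Firm fixed effects transformation: subtract each firm's own mean
toy$sales_within <- toy$sales - ave(toy$sales, toy$fid)

# City-level means of the firm-demeaned data (zero up to rounding)
city_means <- tapply(toy$sales_within, toy$cityid, mean)
print(city_means)

# With firm dummies in the model, the city dummy carries no extra information
fit <- lm(sales ~ factor(fid) + factor(cityid), data = toy)
n_aliased <- sum(is.na(coef(fit)))   # number of coefficients dropped as collinear
print(n_aliased)
```

If some firms do change city over time, the city dummies are no longer fully collinear with the firm dummies, and the city effects are identified from those movers only.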