3
votes

Using different sources, I wrote a little function that, after a linear regression model, creates a table with standard errors, t statistics and p-values, where the standard errors are clustered according to a group variable "cluster". The code is as follows:

cl1 <- function(modl,clust) {
 # modl is the fitted regression model (an lm object)
 # clust is the cluster variable (unquoted column name)
 # id is a within-cluster observation counter, created below
    library(plm)
    library(lmtest)
        #  Get Formula
    form <- formula(modl$call)
        # Get Data frame
    dat <- eval(modl$call$data)
    dat$row <- rownames(dat)
    dat$id <- ave(dat$row, dat[[deparse(substitute(clust))]], FUN =seq_along)       
    pdat <- pdata.frame(dat,
         index = c("id", deparse(substitute(clust))),
         drop.index = FALSE, row.names = TRUE)
    # # Regression
      reg <- plm(form, data=pdat, model="pooling")  
    # # Adjustments
     G <- length(unique(dat[, deparse(substitute(clust))]))
     N <- length(dat[,deparse(substitute(clust))])
    # # Resid degrees of freedom, adjusted
     dfa <- (G/(G-1))*(N-1)/reg$df.residual
     d.vcov <- dfa * vcovHC(reg, type="HC0", cluster="group", adjust=TRUE)
    table <- coeftest(reg, vcov=d.vcov)
    # #  Output: se, t-stat and p-val
     cl1out <- data.frame(table[, 2:4])
     names(cl1out) <- c("se", "tstat", "pval")
    # # Cluster VCE
     return(cl1out)

}

For a regression like reg1 <- lm (y ~ x1 + x2 , data= df), calling the function cl1(reg1, cluster) will work just fine.

However, if I use a model like reg2 <- lm(y ~ . , data=df), I will get the error message:

Error in terms.formula(object) : '.' in formula and no 'data' argument

After some tests, I am guessing that I can't use "." to signal "use all variables in the data frame" for {plm}. Is there a way I can do this with {plm}? Otherwise, any ideas on how I could improve my function in a way that does not use {plm} and that accepts all possible specifications of a linear model?

1
The way you have this set up, y~. will (try to...) include all the columns of pdat except y, but including row, id and clust. Are you sure this is what you want to do? – jlhoward
Oh, thanks! Trying to solve my "." bug I had not seen this one! – Doon_Bogan

1 Answer

6
votes

Indeed, you can't use the "." notation in a formula with the plm package.

library(plm)
data("Produc", package = "plm")
plm(gsp ~ .,data=Produc)
# Error in terms.formula(object) : '.' in formula and no 'data' argument

One idea is to expand the formula when it contains a ".". Here is a custom function that does the job (this is surely already done in other packages):

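expand_formula <-
  function(form="A ~.",varNames=c("A","B","C")){
  # does the formula string contain a "." placeholder?
  has_dot <- any(grepl('.',form,fixed=TRUE))
  if(has_dot){
    # variables already named in the formula (e.g. the response)
    ii <- intersect(as.character(as.formula(form)),
          varNames)
    # drop them by exact name; a regex match would also drop partial
    # matches (e.g. "pc" would remove "pcap") and, when ii is empty,
    # every variable
    varNames <- setdiff(varNames, ii)

    # replace "." with the remaining variables joined by "+"
    exp <- paste0(varNames,collapse='+')
    as.formula(gsub('.',exp,form,fixed=TRUE))
  }
  else as.formula(form)
}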

Now test it:

(eform = expand_formula("gsp ~ .",names(Produc)))
#    gsp ~ state + year + pcap + hwy + water + util + pc + emp + unemp

plm(eform,data=Produc)

# Model Formula: gsp ~ state + year + pcap + hwy + water + util + pc + emp + unemp
# <environment: 0x0000000014c3f3c0>
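
As a possible alternative to the string manipulation above, base R can expand the "." itself: terms() resolves it when given a data argument, and formula() on a fitted lm object is built from the model's terms, where the "." is already expanded. The following is only a minimal sketch on the built-in mtcars data (the subset df and the names f1, f2 and fit are just for illustration), not something tested against the cl1 function from the question:

```r
# Small built-in data set so the example runs without extra packages
df <- mtcars[, c("mpg", "wt", "hp")]

# Option 1: let terms() expand "." against the data, then convert back
f1 <- formula(terms(mpg ~ ., data = df))

# Option 2: fit with lm() first; formula() on the fit is built from its
# terms, so "." is already expanded
fit <- lm(mpg ~ ., data = df)
f2 <- formula(fit)

print(f1)
print(f2)
```

If this carries over, using formula(modl) instead of formula(modl$call) inside cl1 would hand plm an explicit formula. The expansion is then done against the original data frame, so the helper columns row and id added later are not pulled in, although the cluster variable still is if it is a column of that data frame.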