0
votes

I am a novice R user. I installed Zelig version 4.1-3 and Amelia II version 1.7. I am puzzled on how I can obtain the degrees of freedom, t-statistic, and f-values of combined multiply imputed data using R packages and functions.

First, I loaded Amelia and Zelig:

require(Amelia)
require(Zelig)

Then, I loaded the sample data that came with Amelia:

data(freetrade)

I created 5 imputations for this dataset using the amelia function.

a.out <- amelia(freetrade, m = 5, ts = "year", cs = "country")  

Then, to combine imputations, I used the zelig function:

z.out.imp <- zelig(tariff ~ polity + pop + gdp.pc + year + country, 
     data = a.out$imputations, model = "ls" )

However, I got coefficients that appeared to be coefficients of individual imputations, and not those of the combined set when I used this code:

summary(z.out.imp)

They were as follows:

Coefficients:
                           Value   Std. Error     t-stat      p-value
(Intercept)         2.766176e+03 6.670110e+02  4.1471215 0.0003572868
polity              1.645011e-01 3.078134e-01  0.5344183 0.5938286336
pop                -6.079963e-08 6.518429e-08 -0.9327345 0.3774275934
gdp.pc             -4.246794e-04 1.945866e-03 -0.2182470 0.8319093062
year               -1.335563e+00 3.519513e-01 -3.7947390 0.0009787456
countryIndonesia   -7.000319e+01 4.646330e+01 -1.5066343 0.1700377061
countryKorea       -8.643855e+01 4.671629e+01 -1.8502870 0.0926657863
countryMalaysia    -8.815182e+01 5.389486e+01 -1.6356256 0.1393312364
countryNepal       -8.215250e+01 5.475828e+01 -1.5002753 0.1702129176
countryPakistan    -4.349869e+01 5.149729e+01 -0.8446791 0.4238033944
countryPhilippines -8.088975e+01 5.320694e+01 -1.5202857 0.1673234716
countrySriLanka    -7.668840e+01 5.695485e+01 -1.3464771 0.2161986616
countryThailand    -7.400481e+01 5.186395e+01 -1.4269026 0.1903428838

The Amelia manual shows what some of the coefficients for the combined multiply imputed dataset should be although there is no explanation on how to obtain all of them using R (see page 46 of http://cran.r-project.org/web/packages/Amelia/vignettes/amelia.pdf)

Complete DF = 167
DF:   min   = 10.36
      avg   = 18.81
      max   = 37.62
F( 2, 10.4) = 15.50
Prob > F    = 0.0008


                    Value        Std. Error     t-stat       p-value
polity             -0.206        0.39            -0.53       0.61 
pop                -3.21 e-08    8.72e-09         3.68       0.004
gdp.pc             -0.0027       0.000644        -4.28       0.000
Intercept           32.7         2.66            12.29       0.000

Because the amelia function uses monte carlo simulations, we can expect small differences between runs. However, the huge difference in the intercept was a clue that the zelig function returned regression statistics for something else than the combined set.

The Amelia manual provides this code:

> b.out <-NULL
> se.out <-NULL
> for(i in 1:a.out$m){
+ ols.out <- lm(tariff ~ polity + pop + gdp.pc, data = a.out$imputations[[i]])
+ b.out <- rbind(b.out, ols.out$coef)
+ se.out <-rbind(se.out, coef(summary(ols.out))[,2])
+ }
> combined.results<-mi.meld(q=b.out, se = se.out)
> combined.results

I tried using it. The returned results are very close to the values and standard errors shown on page 46:

$q.mi
     (Intercept)     polity          pop       gdp.pc
[1,]    33.17325 -0.1499587 2.967196e-08 -0.002724229

$se.mi
     (Intercept)   polity          pop       gdp.pc
[1,]    2.116721 0.276968 6.061993e-09 0.0006596203

However, they do not include the t-statistic, degrees of freedom, or f-values.

Are there open-source packages or functions available in R so that I can obtain the degrees of freedom, t-statistic, and f-values without having to do manual calculations?

Thanks.

1

1 Answers

1
votes

Here is an annotated and edited transcript of my attempt to reproduce your problem:

> library(Zelig)

ZELIG (Versions 4.1-3, built: 2013-01-30)

> a.out <- amelia(freetrade, idvars="country", m = 5)
Error: could not find function "amelia"

The first problem I had is that you did not mention that we need to load the Amelia package. After correcting that I tried again to run the first line:

> library(Amelia)
## (Version 1.7, built: 2013-02-10)
> a.out <- amelia(freetrade, idvars="country", m = 5)
Error in amelia(freetrade, idvars = "country", m = 5) : 
  object 'freetrade' not found

This fails because you did not say how to get the freetrade data. Guessing here:

> data(freetrade)
> a.out <- amelia(freetrade, m = 5)
Amelia Error Code:  38 
 The following variable(s) are characters: 
     country
You may have wanted to set this as a ID variable to remove it
from the imputation model or as an ordinal or nominal
variable to be imputed.  Please set it as either and
try again. 

Your example does not work because the amelia function needs to be told what to do with character variables. So I modified your example in order to get one that would actually run:

> a.out <- amelia(freetrade, idvars="country", m = 5)
> z.out.imp <- zelig(tariff ~ polity + pop + gdp.pc + year + country, 
+      data = a.out$imputations, model = "ls")

Running summary on the result gives me the combined model statistics"

# This part works just fine.
> summary(z.out.imp)

  Model: ls
  Number of multiply imputed data sets: 5 

Combined results:

Call:
lm(formula = formula, weights = weights, model = F, data = data)

Coefficients:
                           Value   Std. Error     t-stat      p-value
(Intercept)         3.294715e+03 6.425487e+02  5.1275725 1.330807e-05
polity              2.761343e-01 3.354271e-01  0.8232319 4.145813e-01
pop                -6.443769e-08 5.526885e-08 -1.1658953 2.659143e-01
gdp.pc              4.549885e-04 1.354139e-03  0.3359984 7.382138e-01
year               -1.599422e+00 3.306932e-01 -4.8365739 2.649602e-05
countryIndonesia   -7.396526e+01 4.112206e+01 -1.7986760 1.009329e-01
countryKorea       -9.673542e+01 5.036909e+01 -1.9205317 8.713903e-02
countryMalaysia    -9.271187e+01 4.998836e+01 -1.8546690 9.424041e-02
countryNepal       -8.863525e+01 4.920061e+01 -1.8015072 9.990792e-02
countryPakistan    -4.789370e+01 4.362907e+01 -1.0977475 2.960914e-01
countryPhilippines -8.548672e+01 4.662372e+01 -1.8335456 9.533829e-02
countrySriLanka    -8.446560e+01 4.939918e+01 -1.7098586 1.170170e-01
countryThailand    -8.026702e+01 4.741244e+01 -1.6929529 1.213329e-01

For combined results from datasets i to j, use summary(x, subset = i:j).
For separate results, use print(summary(x), subset = i:j).

In short, the only thing in your example that works for me is the one thing you claim does not work for you. Please post the code and output showing exactly what you did and exactly what happened, because at the moment I don't have enough information to help solve the problem.