I mostly work with observational data, but I read a lot of experimental hard-science papers that report results as ANOVA tables, with letters indicating the significance of the differences between groups, along with the p-value of the F-statistic for the joint significance of what is essentially a factor-variable regression. Here is an example that I've pulled from Google image search.

I think that this might be a useful way to present summary statistics about groupwise differences (or lack thereof) in an observational dataset, before I go ahead and try to control for them in various ways. I'm not sure exactly what test the letters are typically representing (Tukey something?), but pairwise t-tests would suit my purposes fine.

My main question: how can I get such output from a factor-variable regression in R, and how can I seamlessly export it to LaTeX?

Here is some example data:

var = c(3500,600,12400,6400,1500,0,4400,400,900,2000,350,0,5800,0,12800,1200,350,800,2500,2000,0,3200,1100,0,0,0,0,0,1000,0,0,0,0,0,12400,6000,1700,3500,3000,1000,0,0,3500,5000,1000,3600,1600,3500,0,900,4200,0,0,0,0,1700,1250,500,950,500,600,1080,500,980,670,1200,600,550,4000,600,2800,650,0,3700,12500,0,0,0,1200,2700,0,NA,0,0,0,3700,2000,3500,0,0,0,3500,800,1400,0,500,7000,3500,0,0,0,0,2700,0,0,0,0,2000,5000,0,0,7000,0,4800,0,0,0,0,1800,0,2500,1600,4600,0,2000,5400,4500,3200,0,12200,0,3500,0,0,2800,3600,3000,0,3150,0,0,3750,2800,0,1000,1500,6000,3090,2800,600,0,0,1000,3800,3000,0,800,600,1200,0,240,1000,300,3600,0,1200,300,2700,NA,1300,1200,1400,4600,3200,750,300,750,1200,700,870,900,3200,1300,1500,1200,0,960,1800,8000,1200,NA,0,1080,1300,1080,900,700,5000,1500,3750,0,1400,900,1400,400,3900,0,1400,1600,960,1200,2600,420,3400,2500,500,4000,0,4250,570,600,4550,2000,0,0,4300,2000,0,0,0,0,NA,0,2060,2600,1600,1800,3000,900,0,0,3200,0,1500,3000,0,3700,6000,0,0,1250,1200,12800,0,1000,1100,0,950,2500,800,3000,3600,3600,1500,0,0,3600,800,0,1000,1600,1700,0,3500,3700,3000,350,700,3500,0,0,0,0,1500,0,400,0,0,0,0,0,0,0,500,0,0,0,0,5600,0,0,0)
factor = as.factor(c(5,2,5,5,5,3,4,5,5,5,3,1,1,1,5,3,6,6,6,5,5,5,3,5,3,3,3,3,4,3,3,3,4,3,5,5,3,5,3,3,3,3,5,3,3,3,3,3,5,5,5,5,5,3,3,5,3,5,5,3,5,5,4,3,5,5,5,5,5,5,4,5,3,5,4,4,3,4,3,5,3,3,5,5,5,3,5,5,4,3,3,5,5,4,3,3,5,3,3,4,3,3,3,3,5,5,3,5,5,3,3,5,4,3,3,3,4,4,5,3,1,5,5,1,5,5,5,3,3,4,5,5,5,3,3,4,5,4,5,3,5,5,5,3,3,3,3,3,3,3,3,3,3,3,4,3,3,3,3,3,3,3,4,5,4,6,4,3,5,5,3,5,3,3,4,3,5,5,5,3,5,3,3,5,5,5,3,4,3,3,3,5,3,5,3,5,5,3,5,3,5,5,5,5,5,3,5,3,5,3,4,5,5,5,6,5,5,5,5,4,5,3,5,3,3,5,4,3,5,3,4,5,3,5,3,5,3,1,5,1,5,3,5,5,5,3,6,3,5,3,5,2,5,5,5,1,5,5,6,5,4,5,4,3,3,3,5,3,3,3,3,5,3,3,3,3,3,3,5,5,5,4,4,4,5,5,3,5,4,5,5,4,3,3,3,4,3,5,5,4,3,3))

Run a simple regression on them and you get the following:

m = lm((var-mean(var,na.rm=TRUE))~factor-1)
summary(m)
Call:
lm(formula = (var - mean(var, na.rm = TRUE)) ~ factor - 1)

Residuals:
    Min      1Q  Median      3Q     Max 
-2040.5 -1240.2  -765.5   957.1 10932.8 

Coefficients:
        Estimate Std. Error t value Pr(>|t|)  
factor1   -82.42     800.42  -0.103   0.9181  
factor2  -732.42    1600.84  -0.458   0.6476  
factor3  -392.17     204.97  -1.913   0.0567 .
factor4   -65.19     377.32  -0.173   0.8629  
factor5   408.07     204.13   1.999   0.0465 *
factor6   303.30     855.68   0.354   0.7233  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2264 on 292 degrees of freedom
  (4 observations deleted due to missingness)
Multiple R-squared:  0.02677,   Adjusted R-squared:  0.006774 
F-statistic: 1.339 on 6 and 292 DF,  p-value: 0.2397

It looks fairly clear that factors 3 and 5 are different from zero (factor 3 only marginally, at p = 0.0567) and different from each other, but that factor 3 is not different from factor 2 and factor 5 is not different from factor 6 (at whatever p-value).
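Claims like these about pairwise differences can be checked directly rather than read off the coefficient table. The following is only a minimal sketch in base R, run on made-up data: the data frame `d` and its columns `y` and `g` are hypothetical stand-ins for `var` and `factor` above. It computes the joint F-test, unadjusted pairwise t-tests with a pooled standard deviation, and Tukey's HSD, which adjusts for multiple comparisons.

```r
# Hypothetical toy data standing in for `var` and `factor` above
set.seed(1)
d <- data.frame(
  g = factor(rep(c("1", "2", "3"), times = c(10, 12, 15))),
  y = c(rnorm(10, 0), rnorm(12, 0.2), rnorm(15, 3))
)

# Joint F-test for the factor (the p-value usually shown with an ANOVA table)
fit <- aov(y ~ g, data = d)
f_p <- summary(fit)[[1]][["Pr(>F)"]][1]

# Unadjusted pairwise t-tests using a pooled standard deviation
pt <- pairwise.t.test(d$y, d$g, p.adjust.method = "none")

# Tukey's HSD: pairwise differences with multiplicity-adjusted p-values
tk <- TukeyHSD(fit)

print(f_p)
print(pt$p.value)  # lower-triangular matrix of p-values
print(tk$g)        # rows are named by pair, e.g. "2-1"
```

Which of the two pairwise procedures the letters in a given paper represent is something the paper itself should state; the code only shows how each could be obtained.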

How can I get this into ANOVA-table output like in the example above? And is there a clean way to get this into LaTeX, ideally in a form that scales to many variables?
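One possible way to build such a table, sketched under assumptions: the toy data frame `d` below is hypothetical, and the letters come from `multcompLetters()` in the `multcompView` package applied to Tukey-adjusted p-values, which may or may not be the test used in any particular paper. The letter step is skipped if that package is not installed.

```r
# Hypothetical toy data standing in for `var` and `factor`
set.seed(1)
d <- data.frame(
  g = factor(rep(c("1", "2", "3"), times = c(10, 12, 15))),
  y = c(rnorm(10, 0), rnorm(12, 0.2), rnorm(15, 3))
)

fit <- aov(y ~ g, data = d)
tk  <- TukeyHSD(fit)$g  # matrix with rows named like "2-1"

# One row per group: size, mean, standard deviation
tab <- data.frame(
  group = levels(d$g),
  n     = as.vector(table(d$g)),
  mean  = as.vector(tapply(d$y, d$g, mean)),
  sd    = as.vector(tapply(d$y, d$g, sd))
)

# Letters: groups sharing a letter are not significantly different at 5%.
# Assumes multcompView is installed; level names must not contain "-".
if (requireNamespace("multcompView", quietly = TRUE)) {
  p   <- tk[, "p adj"]  # named vector, names like "2-1"
  let <- multcompView::multcompLetters(p, threshold = 0.05)$Letters
  tab$letters <- unname(let[as.character(tab$group)])
}

print(tab)
```

Because `tab` is an ordinary data frame, the same steps could be wrapped in a function and applied to several outcome variables, with the results stacked by `rbind()` before export.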

Looks like the letters are grouping the factors according to the results of some pairwise test. I'm not sure that's standard enough that it shouldn't be noted somewhere in the paper. -- Scortchi - Reinstate Monica
I'm pretty sure they're just t-tests, and anyway t-tests are fine for my purposes. What I really want is a function to make these for me given a dataset and a model. Going to try to get this moved over to SO. -- generic_user
Look into Sweave for one method of getting LaTeX results. Also see the latex function in the Hmisc package. -- Peter Flom

1 Answer

The following answers only the third question.

It looks like xtable does what you'd like to do - exporting R tables to $\LaTeX$ code.
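As a rough illustration of what that looks like in practice, here is a minimal sketch using made-up data (the data frame `d`, the caption text, and the file name mentioned in the comment are all hypothetical). It relies on the `xtable()` method for `lm` objects, which tabulates the coefficient table.

```r
library(xtable)  # install.packages("xtable") if it is not yet installed

# Hypothetical toy data
set.seed(1)
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 10)),
  y = rnorm(30)
)

# Factor regression without an intercept: one coefficient per group
m <- lm(y ~ g - 1, data = d)

# Convert the coefficient table to an xtable object
tab <- xtable(m, caption = "Group means from a factor regression", digits = 3)

# Build the LaTeX code as a string; use file = "table.tex" to write to disk
tex <- print(tab, type = "latex", print.results = FALSE)
cat(tex)
```

`xtable()` also accepts plain data frames and matrices, so a hand-built summary table can be exported the same way.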

There's a nice gallery as well.

I found both in a wiki post on Stack Overflow.