
I am trying to create many tables of cross-tabs in the style of tab (twoway) or tabout in Stata. However, I want to include response options for both variables, even if every cell for that variable in the twoway tabulation would be zero. For instance, if we alter the auto dataset slightly:

sysuse auto
tab rep78 foreign

* Altering data
replace foreign=. if foreign==0 & rep78==1 
tab rep78 foreign

In the altered data, rep78==1 still occurs, but never among observations that are non-missing for foreign, so the second tab drops that row entirely.

What I want is a table like:

Record    Car type
1978      Domestic   Foreign   Total
1                0         0       0
2                8         0       8
3               27         3      30
4                9         9      18
5                2         9      11
Total           46        21      67

Nick Cox's very nice tabcount works for this purpose, but:

  1. I don't want to have to specify all the possible response options to include.

  2. I want to be able to export the tables cleanly, as in tabout. I often have to create batches of tables where a lot of cells in the tables will have zero observations due to missingness, and I often have many, many possible response options.

Any ideas on a workaround that will allow me to still use tabout or something similar?

I understand your #1 as a human preference, but you need to explain how Stata could possibly be expected to know what is possible or impossible unless you tell it what is. Why not show 6, 7, 8, ... as possible? The only way I can see to make this programmable is to get Stata to look for value labels. - Nick Cox
Hi Nick. "Possible" is the wrong word choice. I mean that I want Stata to determine all the response options for var1 that are represented in the data, even if some of those responses are not represented among respondents who also answered var2. - Amberopolis
(Got cut off by the time limit for edits.) When I make crosstabs for someone else, it is confusing when some response options that they know exist are not represented in the table. They can assume that row would be full of zeroes, but I would rather represent it explicitly. - Amberopolis
Thanks for helpful further comments. - Nick Cox

1 Answer


As in my initial comments, there has to be some sense in which Stata is told what defines the rows and columns, or how to find that out. Nevertheless, this may help:

* ssc inst tabcount 

sysuse auto, clear 
tab rep78 foreign

levelsof rep78, local(rows) 
levelsof foreign, local(cols) 

replace foreign=. if foreign==0 & rep78==1 

tabcount rep78 foreign, v1(`rows') v2(`cols') zero 
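
For batches of tables, the two levelsof lookups and the tabcount call can be wrapped in a small program so that the levels never have to be typed by hand. This is only a sketch: the name tabcountall is made up for illustration and is not part of tabcount, and it assumes both variables are numeric and that tabcount is already installed.

```stata
* Hypothetical wrapper: the name tabcountall is invented for this sketch.
* Requires tabcount (ssc inst tabcount).
capture program drop tabcountall
program define tabcountall
    version 12
    syntax varlist(min=2 max=2 numeric)
    gettoken rowvar colvar : varlist
    * Look up levels over all observations, so a level of one variable is
    * kept even if it never occurs with a non-missing value of the other.
    quietly levelsof `rowvar', local(rows)
    quietly levelsof `colvar', local(cols)
    tabcount `rowvar' `colvar', v1(`rows') v2(`cols') zero
end

sysuse auto, clear
replace foreign = . if foreign == 0 & rep78 == 1
tabcountall rep78 foreign
```

Because the levels are read from the data in memory at the time of the call, the lookup still finds rep78==1 here even though every such car now has missing foreign.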


  1. tabcount must be installed before you do this. Use ssc inst tabcount to do that. The command above is commented out as a signal that it is not needed after the first use.

  2. You could write your own command or do-file based on looking-up values in the data before looking at the cross-combination desired.

  3. The paper here may be helpful as a broader discussion.
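
As one possible way to act on point 2 and also get an exportable table, the official contract command with its zero option can build the zero-filled counts as a dataset, which can then be reshaped and written out. This is a sketch rather than a replacement for tabout: the variable names n, domestic and total and the file name rep78_by_foreign.csv are arbitrary choices for illustration, and it assumes the column variable takes the values 0 and 1 as foreign does in the auto data.

```stata
sysuse auto, clear
replace foreign = . if foreign == 0 & rep78 == 1

* One observation per (rep78, foreign) combination. The zero option adds
* combinations of observed levels that never occur together, with n = 0.
contract rep78 foreign, zero freq(n)

* contract treats missing as a level, so drop those rows after filling in.
drop if missing(rep78, foreign)

* One row per rep78 level, one count column per foreign level (n0, n1).
reshape wide n, i(rep78) j(foreign)
rename (n0 n1) (domestic foreign)
generate total = domestic + foreign

* Write the table out; the file name is an arbitrary example.
export delimited using "rep78_by_foreign.csv", replace
```

The key step is that contract sees the rep78==1 cars (paired with missing foreign), so zero fills in their combinations with foreign 0 and 1 before the missing rows are dropped. Row and column totals would have to be added separately if they are wanted in the exported file.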