Multiple local in foreach command macro

Question

I have a dataset with multiple subgroups (variable economist) and dates (variable temps99).

I want to run tabsplit, which does not accept the by or bysort prefix. So I wrote a loop over a local macro to apply tabsplit to each subgroup in my data.

For example:

levelsof economist, local(liste)

foreach gars of local liste {
    display "`gars'"
    tabsplit SubjectCategory if economist=="`gars'", p(;) sort 
    return list
    replace nbcateco = r(r) if economist == "`gars'"
}

For each subgroup, Stata runs the tabsplit command and I use the variable nbcateco to store count results.

I did the same for the date so I can have the evolution of r(r) over time:

levelsof temps99, local(liste23)

foreach time of local liste23 {
    display "`time'"
    tabsplit SubjectCategory if temps99 == "`time'", p(;) sort
    return list
    replace nbcattime = r(r) if temps99 == "`time'"
}

Now I want to do this for each economist subgroup by date (temps99). I tried multiple combinations, but I am not very good with macros (yet?).

What I want is to be able to have my r(r) for each of my subgroups over time.

Nick Cox · Accepted Answer · 2017-10-29T07:59:57

This is an example of the XY problem, I think. See http://xyproblem.info/

tabsplit is a command in the package tab_chi from SSC. I have no negative feelings about it, as I wrote it, but it seems quite unnecessary here.

You want to count categories in a string variable: semi-colons are your separators. So count semi-colons and add 1.

local SC SubjectCategory
gen NCategory = 1 + length(`SC') - length(subinstr(`SC', ";", "", .))

Then (e.g.) table or tabstat will let you explore further by groups of interest.
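
As a minimal sketch of that follow-up step, here is the count on a small made-up dataset, summarized by group. The data values and the variable name mean_ncat are invented for illustration; only economist, temps99 and SubjectCategory come from the question, and temps99 is assumed to be a string variable, as the question's comparisons imply.

```stata
* Toy data (hypothetical values) with the question's variable names
clear
input str10 economist str4 temps99 str30 SubjectCategory
"Smith" "1999" "frog;toad;newt"
"Smith" "1999" "frog"
"Smith" "2000" "toad;newt"
"Jones" "1999" "newt"
"Jones" "2000" "frog;toad"
end

* Categories per observation = number of semi-colons + 1
local SC SubjectCategory
gen NCategory = 1 + length(`SC') - length(subinstr(`SC', ";", "", .))

* Summary of the per-observation count within each economist
tabstat NCategory, by(economist) statistics(n mean min max)

* Mean count within each economist-by-date cell, stored in a new variable
bysort economist temps99: egen mean_ncat = mean(NCategory)
list economist temps99 NCategory mean_ncat, sepby(economist temps99)
```

Note that this counts categories within each observation, so an empty SubjectCategory would be counted as 1; add an if !missing(SubjectCategory) qualifier to the gen line if that matters in your data.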

To see the counting idea, consider 3 categories with 2 semi-colons.

. display length("frog;toad;newt")
14

. display length(subinstr("frog;toad;newt", ";", "", .))
12

If we replace each semi-colon with an empty string, the change in length is the number of semi-colons deleted. Note that we don't have to change the variable to do this. Then add 1. See also this paper.

That said, a way to extend your approach might be

egen class = group(economist temps99), label 
su class, meanonly 
local nclass = r(max)
gen result = . 

forval i = 1/`nclass' {
    di "`: label (class) `i''" 
    tabsplit SubjectCategory if class == `i', p(;) sort
    return list
    replace result = r(r) if class == `i'
}
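
To check the group-and-loop pattern without installing tab_chi, here is a sketch on made-up data that uses count in place of tabsplit. The data values and the variable nobs are invented for illustration. The point to notice is that the number of groups is the maximum of the group variable, r(max), whereas r(N) is the number of observations.

```stata
* Toy data (hypothetical values) with the question's variable names
clear
input str10 economist str4 temps99 str30 SubjectCategory
"Smith" "1999" "frog;toad;newt"
"Smith" "1999" "frog"
"Smith" "2000" "toad;newt"
"Jones" "1999" "newt"
"Jones" "2000" "frog;toad"
end

* One integer code per economist-by-date combination, labelled with the values
egen class = group(economist temps99), label

* r(max) is the number of groups (4 here); r(N) is the number of observations (5 here)
summarize class, meanonly
local nclass = r(max)
gen nobs = .

forvalues i = 1/`nclass' {
    * Show the value label attached to group `i'
    display "`: label (class) `i''"
    * Stand-in for tabsplit: count observations in this group
    count if class == `i'
    replace nobs = r(N) if class == `i'
}

list economist temps99 class nobs, sepby(class)
```

Swapping the count line for the tabsplit call and r(N) for r(r) inside the loop gives the version above.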

Using statsby would be even better. See also this FAQ.
