
I am trying to change an income_change categorical variable from 5 groups to 3 groups.

Currently the variable looks like this:

tab income_change                  frequency
Decreased by more than 25% |        333
        Decreased by 1-25% |        331
           Stayed the same |        222
        Increased by 1-25% |         23
Increased by more than 25% |         12

And the variable is stored as:

         storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------------------------------------------------
income_change            int     %26.0g      Lchg

To create three groups from the five categories above, I ran the following, but I get the error message "type mismatch":

gen perc_change = income_change            
recode   perc_change ="Income Decreased"  if perc_change =="1"  | if perc_change =="2"
recode   perc_change ="Same Income"  if perc_change =="3"
recode   perc_change ="Income Increased"  if perc_change =="4" | if perc_change =="5"

The perc_change variable is stored as follows:


              storage   display    value
variable name   type    format     label      
--------------------------------------------------------------------------------------------------------------------------
perc_change     float   %9.0g 

Solved with the proposed solution below:

gen inc_change = income_change 
gen inc_perc_change = ""
replace inc_perc_change ="Income Decreased"  if inc_change == 1 | inc_change == 2
replace inc_perc_change ="Same Income"       if inc_change == 3
replace inc_perc_change ="Income Increased"  if inc_change == 4 | inc_change == 5
tab inc_perc_change 

Produced the graph I was looking for with this:

catplot  tn_cor22_str inc_perc_change, percent(tn_cor22_str)

3 Answers

2
votes

Alternatively, you could use:

gen perc_change = ""
replace perc_change ="Income Decreased"  if inrange(income_change, 1, 2)
replace perc_change ="Same Income"       if income_change == 3
replace perc_change ="Income Increased"  if inrange(income_change, 4, 5)
2
votes

It seems that income_change is a numeric variable with text labels. Could you try something like:

gen perc_change = ""
replace perc_change ="Income Decreased"  if income_change == 1 | income_change == 2
replace perc_change ="Same Income"       if income_change == 3
replace perc_change ="Income Increased"  if income_change == 4 | income_change == 5
tab perc_change 

If the above code does not work, it is likely that the values of income_change are not 1 to 5. You will need to change 1-5 to the relevant values of income_change in your data to set the right conditions.
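To find out which numeric values sit behind the text labels, something like the following may help. It is only a sketch: the toy data and the label text are assumptions that mimic the question, and the codes 1 to 5 are a guess at what Lchg actually maps.

```stata
* Toy data standing in for the real dataset (codes 1-5 are an assumption)
clear
input byte income_change
1
2
3
4
5
end
label define Lchg 1 "Decreased by more than 25%" 2 "Decreased by 1-25%" ///
    3 "Stayed the same" 4 "Increased by 1-25%" 5 "Increased by more than 25%"
label values income_change Lchg

* Show the numeric code attached to each label
label list Lchg

* Tabulate the raw numeric values instead of the labels
tabulate income_change, nolabel
```

On the real data only the last two commands are needed; whatever codes they report are the ones to use in the if conditions.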

1
votes

Although you got what you asked for, the resulting string variable is imperfect: in particular, it sorts alphabetically rather than in the order you want. Another possibility is a coarsened numeric variable with new value labels, created by, say:

gen change_class = 1 if inlist(income_change, 1, 2) 
replace change_class = 2 if income_change == 3 
replace change_class = 3 if inlist(income_change, 4, 5) 
label def change_class 1 Decreased 2 Same 3 Increased 
label val change_class change_class
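The same coarsening can probably also be done in one step with recode and its generate() option, which leaves the original variable untouched and attaches value labels as it goes. This is a sketch under the assumption that income_change is coded 1 to 5; the toy data and the name change3 are made up for illustration.

```stata
* Toy data standing in for the real dataset (codes 1-5 are an assumption)
clear
input byte income_change
1
2
3
4
5
end

* Collapse five codes into three and label the new values in one command;
* change3 is a hypothetical name for the new variable
recode income_change (1 2 = 1 "Income Decreased") (3 = 2 "Same Income") ///
    (4 5 = 3 "Income Increased"), generate(change3)

* Cross-check the mapping from old codes to new groups
tabulate income_change change3
```

Because change3 is numeric with labels, tabulations and graphs should keep the Decreased, Same, Increased order instead of sorting alphabetically.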