I am doing an analysis in Stata of the determinants of census tract unemployment rates. Some of the previous literature on my topic has used straight OLS regression, and I started with this type of analysis, but it seems to me after my own further reading that a Generalized Linear Model is better. This is especially because I am interested in presenting predicted values for the census tracts' unemployment rates based on my regression and I would like these to be appropriately bounded (between 0% and 100% inclusive). My unemployment rates include 0s for some census tracts so I would need to take this into account.
My questions are:
1. Is Stata's `fracreg logit` equivalent to the program's `glm` with a logit link and binomial family? (I have read about using the `glm` version in a few places, including here, but see that `fracreg` is a new-ish command which seems to serve the same purpose.) Can I specify an equivalent to the `robust` option when using `fracreg logit`?
2. If using `fracreg`, on what basis should I decide between a fractional probit (`fracreg probit`) and a fractional logit (`fracreg logit`) regression?
3. A simple (probably ignorant) question of interpretation: I see that the `fracreg` and `glm` regressions mentioned above don't report an R-squared value. Is there an equivalent measure I can calculate for these regressions? My OLS R-squared values have been reasonably high and this has been a point of reassurance for me, so I'd like to see how these models compare (though I know R-squared isn't everything!).
4. If using these models, are there any additional restrictions or assumptions (beyond those that make OLS BLUE) that I should keep in mind? With my OLS regressions I have taken the natural log of unemployment rates (this makes my residuals more normal, gives a higher R-squared, and allows a convenient interpretation). Could I do the same with the `fracreg` or `glm` regressions above?
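To make the comparison concrete, here is a rough sketch of the commands I have in mind. The variable names are placeholders rather than my actual data: `unemp_prop` is assumed to be the tract unemployment rate expressed as a proportion between 0 and 1 (not a percentage), and `x1` and `x2` stand in for my covariates.

```stata
* Hypothetical variable names: unemp_prop is the tract unemployment rate as a
* proportion in [0, 1]; x1 and x2 are stand-ins for the covariates.

* Fractional logit via fracreg (requires Stata 14 or later)
fracreg logit unemp_prop x1 x2

* GLM with binomial family and logit link, requesting robust standard errors
glm unemp_prop x1 x2, family(binomial) link(logit) vce(robust)

* Fractional probit alternative
fracreg probit unemp_prop x1 x2

* Predicted proportions from the most recently fitted model
predict unemp_hat
```

My understanding is that the predictions from these models should fall within the unit interval, which is the property I am after, but I am not sure whether the first two commands give identical results.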
It's been a while since I formally studied limited dependent variables so please excuse my ignorance on these issues.
I have cross-posted this question at Statalist here.