I compute predicted factor scores for many different measures as well as the measurement error for these factor scores, then delete the measures. I delete the measures because my data is quite large; I do not want all of the measures using RAM in the large data set where I run my analysis.
For my analysis, I regress on factors and other variables. I can correct the regression coefficients for measurement error by using the measurement error of these factor scores. However, I cannot figure out a convenient way to save the measurement error associated with each factor to a .dta file.
Why am I not running all of this in one Stata session, thus obviating the need to save the scalars/macros/matrices? I work on a server and on my PC. The server has a lot of memory and processing power, but is very inconvenient to use. So I often break my work into two stages. First, I clean the data and reduce the variable count (in this case, computing factor scores from a large set of measures). Cleaning the data itself often takes a huge amount of memory when I use multiple reshape commands. Then I save the cleaned data to a .dta file and work on it on my PC. The cleaned data is small enough to run on my PC, and doesn't require the sort of manipulations that use an excess of RAM.
I have considered a few approaches.
Create a variable for the measurement error of each factor. While this can work, I don't like it for several reasons: A) This is a profligate use of memory. I need only one scalar per factor variable, but I am creating _N cells for this variable. While I may be able to make the data set fit in memory by judiciously dropping variables or using other workarounds, I want a better solution. B) It just seems conceptually wrong.
Create a variable that contains all scalar values, and a second variable that contains the names for these scalars (i.e., the factors with which they are associated). I am having trouble making this work. How do I extract the value for each non-missing _n and put it into a Stata matrix or Mata matrix? Alternatively, how do I create a set of macros where the macro name comes from the variable containing the name and the macro value comes from the variable containing the scalar?
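To make this second approach concrete, here is a minimal sketch of both extractions. The variable names me_name and me_value, the factor names f1 and f2, and the numbers are made up for illustration, and the sketch assumes the non-missing (name, value) pairs sit in the first rows of the data set.

```stata
* Hypothetical layout: the first rows hold one (name, value) pair per factor;
* me_value is missing in all remaining rows.
clear
set obs 5
generate str32 me_name = ""
generate double me_value = .
replace me_name  = "f1" in 1
replace me_value = 0.12 in 1
replace me_name  = "f2" in 2
replace me_value = 0.30 in 2

* (a) One local macro per pair: name taken from me_name, contents from me_value.
*     Assumes the pairs occupy observations 1 through k.
quietly count if !missing(me_value)
local k = r(N)
forvalues i = 1/`k' {
    local nm = me_name[`i']
    local me_`nm' = me_value[`i']
}
display "me_f1 = `me_f1'   me_f2 = `me_f2'"

* (b) A Stata matrix with one row per factor, rows labelled from me_name.
mkmat me_value if !missing(me_value), matrix(ME) rownames(me_name)
matrix list ME
```

Local macros disappear when the do-file or program that defined them ends, so the matrix version may be more convenient if the values are needed across several do-files in the same session.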
Somehow save the scalars/macros/Mata matrix/Stata matrix directly and load them after opening the .dta file. Apparently, Stata does not save scalars, macros, Mata matrices, or Stata matrices to a .dta file. So the most convenient and obvious solution doesn't exist in Stata. I have seen other people recommend putting the scalars into a Mata matrix, then loading the Mata matrix into memory as data and saving it as a separate .dta file. Then I could open this file, read it back into a Mata matrix, and then load the data I want to work on. All this seems needlessly complicated and I am hoping there is a better way.
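For reference, the round trip described here might look roughly like the sketch below, written with a Stata matrix rather than a Mata matrix. The file names factor_me.dta and cleaned_data.dta, the factor names, and the values are assumptions for illustration only.

```stata
* Stage 1 (server): run after the cleaned data have been saved,
* because this step clears the data in memory (matrices are kept).
matrix ME = (0.12 \ 0.30)            // made-up measurement errors
matrix rownames ME = f1 f2
matrix colnames ME = me

clear
svmat double ME, names(col)          // creates variable me, one row per factor
generate str32 factor = ""
local rn : rownames ME
forvalues i = 1/`=rowsof(ME)' {
    replace factor = word("`rn'", `i') in `i'
}
save factor_me.dta, replace

* Stage 2 (PC): read the small file back into a matrix,
* then load the analysis data; the matrix stays in memory.
use factor_me.dta, clear
mkmat me, matrix(ME) rownames(factor)
matrix list ME
* use cleaned_data.dta, clear        // hypothetical analysis file
```

Because `use ..., clear` replaces only the data and leaves matrices alone, ME is still available after the analysis file is loaded. It may also be worth reading help char: dataset characteristics are stored inside the .dta file itself, which could avoid the second file, though that is a different technique from the one sketched here.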
I would love advice on a simpler method to save these scalars, or else a way to make one of the above approaches simpler.
This is very frustrating. While Stata is extremely powerful and easy to use for a variety of things, it also has these frustrating 'holes' where you can spend an entire day trying to make something work that you thought would be quite simple.
Stunningly, the easiest solution would be to simply copy the scalars to a spreadsheet and enter them manually. This is not an automated solution, but I am realizing it would take me only about a quarter of an hour, instead of the enormous amount of time it is taking me to automate this.
See help mkmat, on going from matrices to variables, and vice versa. – Roberto Ferrer