0
votes

I have a list of Stata datasets. In some of them the variable tor is absent, and I want to add it wherever it doesn't exist.

The datasets contain a variable called xclass where x could be anything (e.g. Aclass, lclass, etc.). I would like to rename those variables to dec.

I want to create a variable adjusted which is "yes" if the file name contains adjusted and "no" if not.

I guess it would look something like:

Loop through list of datasets and their variables {
        if variable contains pattern class 
                        rename to dec
        if no variable tor, then 
                        gen str tor = total
        if file name contains pattern adjusted
                        gen str adjusted = yes
        else gen str adjusted = no
}

But then in proper Stata language.

So I've got this now, but it's not working: it doesn't do anything.

cd "C:\Users\test"
local filelist: dir "." files "*.dta", respectcase

foreach filename of local myfilelist {


   ds *class
     local found `r(varlist)' 
     local nfound : word count `found' 
     if `nfound' == 1 { 
        rename `found' dec
     } 
     else if `nfound' > 1 { 
        di as err "warning: multiple *class variables in `filename'" 
     } 

     capture confirm var tor 
     if !_rc == 0 { 
        gen tor = "total"
     } 

     gen adjusted = cond(strpos("`filename'", "_adjusted_"), "yes", "no") 
}
Take a look at these answers to a similar question for a method for storing and reading in file names. - lmo
The local myfilelist is not defined, so the loop does nothing. Should be filelist. - Nick Cox

2 Answers

1
votes

This is not an answer; it is advice that won't fit into a comment.

What you are attempting is not elementary Stata. If indeed you are unfamiliar with Stata (not stata) you will find it challenging to automate this process. I'm sympathetic to you as a new user of Stata - it's a lot to absorb. And even worse if perhaps you are under pressure to produce some output quickly. Nevertheless, I'd like to encourage you to take a step back from your immediate tasks.

When I began using Stata in a serious way, I started by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. There are a lot of examples to copy and paste into Stata's do-file editor to run yourself, and better yet, to experiment with changing the options to see how the results change.

All of these manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu. The objective in doing the reading was not so much to master Stata as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax.

The Stata documentation is really exemplary - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry.

With that said, you will perhaps find the foreach command helpful for looping, the filelist command (community-contributed, available from SSC) for obtaining a list of Stata datasets (not databases), and the ds command for obtaining a list of variable names within a Stata dataset. More subtly, prefixing the generate command with capture will let you attempt to create your tor variable and simply fail gracefully if it already exists, saving a small amount of program logic.
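
As a minimal sketch of that last point (the two-observation dataset and the value "existing" are made up purely for illustration), capture lets generate fail silently when tor is already present:

    clear
    set obs 2
    generate tor = "existing"        // pretend this dataset already has tor

    // generate would normally stop with an error because tor exists;
    // capture suppresses the error and stores the return code in _rc
    capture generate tor = "total"
    display _rc                      // non-zero: nothing was changed

    drop tor
    capture generate tor = "total"   // tor is absent now, so it is created
    display _rc                      // 0

The trade-off is that capture hides every error from that command, not only "already defined", so checking _rc afterwards is a sensible habit.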

0
votes

The middle part can be sketched:

    // assumes local macro filename contains file name 

    ds *class
    local found `r(varlist)' 
    local nfound : word count `found' 
    if `nfound' == 1 { 
        rename `found' dec 
    } 
    else if `nfound' > 1 { 
        di as err "warning: multiple *class variables in `filename'" 
    } 

    capture confirm var tor 
    if _rc { 
        gen tor = "total"
    } 

    gen adjusted = cond(strpos("`filename'", "adjusted"), "yes", "no") 

On managing lists of files: filelist (SSC) is very good; also see fs (SSC) for a different approach.
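
Putting the pieces together, one possible outline of the whole loop follows. It is a sketch under assumptions not confirmed in the question: it uses the built-in dir extended macro function instead of filelist, it takes the directory from the question's own code, and it overwrites each file in place, so try it on copies first. The ds call is wrapped in capture so that a file with no *class variable does not stop the loop.

    cd "C:\Users\test"
    local filelist : dir "." files "*.dta", respectcase

    foreach filename of local filelist {
        use "`filename'", clear

        // rename the single *class variable, if there is one
        local found
        capture ds *class
        if _rc == 0 local found `r(varlist)'
        local nfound : word count `found'
        if `nfound' == 1 {
            rename `found' dec
        }
        else if `nfound' > 1 {
            di as err "warning: multiple *class variables in `filename'"
        }

        // add tor only if it is missing
        capture confirm variable tor
        if _rc {
            gen tor = "total"
        }

        // flag files whose name contains "adjusted"
        gen adjusted = cond(strpos("`filename'", "adjusted"), "yes", "no")

        save "`filename'", replace
    }

The steps that the code in the question lacks are use, which loads each file into memory, and save, which writes the changes back. Running this twice on the same files would stop at gen adjusted, because the variable would already exist.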

EDIT: Here is a proof of concept for the last detail:

. local filename1 "something adjusted somehow"

. local filename2 "frog toad newt dragon"

. di cond(strpos("`filename1'", "adjusted"), "yes", "no")
yes

. di cond(strpos("`filename2'", "adjusted"), "yes", "no")
no

strpos("<string1>", "<string2>") returns the starting position of the second string within the first if the first contains the second, and zero otherwise. A non-zero argument means true in Stata; zero means false.
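
To see the raw return values rather than the yes/no wrapper, here is a small do-file sketch; both file names are hypothetical and chosen only for illustration:

    di strpos("results_adjusted_2010.dta", "adjusted")   // 9
    di strpos("results_raw_2010.dta", "adjusted")        // 0

The 9 is the position at which "adjusted" starts in the first name; cond() treats it as true simply because it is not zero.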

See help strpos() and if desired help cond().

I can't see your filenames to comment or test your code, but one possible problem is that the local macro is not defined in the same namespace as that in which you are trying to evaluate the expression. (That's what local means.) A macro that isn't defined will be evaluated as an empty string, with the result you mention.
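
A short do-file sketch of that failure mode, assuming a macro name (nosuchname, hypothetical) that was never defined:

    // nosuchname was never defined, so it expands to an empty string
    di "[`nosuchname']"                                        // []
    di cond(strpos("`nosuchname'", "adjusted"), "yes", "no")   // no

    // once the macro is defined in the same do-file or program, the test works
    local filename "something adjusted somehow"
    di cond(strpos("`filename'", "adjusted"), "yes", "no")     // yes

This also bites when lines are run piecemeal from the do-file editor: a local defined in one run is gone by the next, so the definition and the expression that uses it need to be executed together.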