I have data in Stata with three variables: a string id and two numeric variables holding GPS data (latitude and longitude). I would like to convert the variables into a matrix, as in the lower table, so that I can calculate the distance between two id locations for all combinations. Each newly created column (e.g., id_1) should hold the value of the original variable (e.g., id) from the next observation (i+1), and so on. However, the following command only returns a value while the referenced row exists, that is, up to the n-th row; beyond that, the new variables are empty. As a result, the bottom half of the matrix is missing (marked /// in the upper table). For 2000 observations:

foreach num of numlist 1/2000 {
   foreach var of varlist id num1 num2   {
        gen `var'_`num'=`var'[_n+`num']
    }
}

[Image: the upper table shows the current result, with the bottom half missing (///); the lower table shows the desired layout]

Your specific aim is to create 6000 new variables, each with 2000 observations, to later calculate (2000 x 1999) / 2 = 1999000 distances. If the latter is the goal, consider e.g. geonear from SSC. If you want travel distances, there are several existing community-contributed commands; e.g., search distance. - Nick Cox
Suppose you have 2000 observations. Then any reference to whatever[2001] or to values in later observations is legal but just returns missing as there is no such observation. - Nick Cox
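To illustrate the point about out-of-range references, here is a minimal sketch on a toy dataset of 5 observations (the variable names x and x_next2 are made up for this illustration):

```stata
clear
set obs 5
generate x = _n

* Observations 4 and 5 refer to x[6] and x[7], which do not exist.
* Stata does not raise an error; the result is simply missing.
generate x_next2 = x[_n + 2]
list

* Count how many of the new values are missing
count if missing(x_next2)
```

The same thing happens in the question's loop: for shift `num', the last `num' observations point past the end of the data and come back missing.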
Thank you, Nick. I deleted the post without noticing your reply, as I had figured out myself that I can use "expand 2" to duplicate the whole data set, run the command, and then delete the unneeded rows afterwards. This way I get the whole matrix. I needed the whole matrix rather than only the half above the diagonal. Yes, I used geodist, which works wonders. Thank you again for the advice! - Makiko
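If the end goal is only the distances, a long layout with one row per ordered pair may be easier to handle than thousands of wide columns. The following is a sketch of that alternative, not the method used in this thread. It assumes that num1 is latitude and num2 is longitude, that the community-contributed geodist command has been installed (ssc install geodist), and it uses three made-up points; the names id_a, lat_a, dist_km and so on are chosen for illustration only.

```stata
* Requires the community-contributed command: ssc install geodist
clear
input str1 id num1 num2
"a" 51.5 -0.1
"b" 48.9  2.4
"c" 52.5 13.4
end

* Save one copy of the points as the "a" side of each pair
rename (id num1 num2) (id_a lat_a lon_a)
tempfile points
save `points'

* Rename the data in memory to the "b" side, then form all combinations
rename (id_a lat_a lon_a) (id_b lat_b lon_b)
cross using `points'

* Remove the pairs of a point with itself
drop if id_a == id_b

* Assumption: num1 is latitude and num2 is longitude
geodist lat_a lon_a lat_b lon_b, generate(dist_km)
list
```

With 2000 points this gives 2000 x 1999 = 3998000 rows, one per ordered pair, so both halves of the matrix are covered.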

1 Answer

I am posting my solution as an answer in case anybody finds it useful.

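// Duplicate all observations (2000 -> 4000) so that references past
// observation 2000 reach the duplicated rows and the matrix is fully filled

expand 2, gen(dupindex)

forvalues i = 1/1999 {
    foreach var of varlist id num1 num2 {
        gen `var'`i' = `var'[_n+`i']
    }
}

// Delete the unnecessary rows (the duplicates in observations 2001 to 4000).
// No columns need dropping: the loop above only creates id1 ... id1999
// (and likewise for num1 and num2), so id2000 ... id3999 never exist.

drop in 2001/4000
drop dupindex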
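A variation that avoids duplicating the data is to wrap the subscript around with mod(). This is a sketch on a made-up dataset of 4 observations, not part of the original solution; the underscore in the new variable names follows the question's naming.

```stata
clear
input str1 id num1 num2
"a" 10 100
"b" 20 200
"c" 30 300
"d" 40 400
end

local last = _N - 1
forvalues i = 1/`last' {
    foreach var of varlist id num1 num2 {
        * mod(_n + i - 1, _N) + 1 maps a position past the last
        * observation back to the start of the data
        generate `var'_`i' = `var'[mod(_n + `i' - 1, _N) + 1]
    }
}
list
```

For the real data the loop bound becomes 1999 automatically, since it is taken from _N, and no rows have to be dropped afterwards.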