2
votes

I have a longitudinal data set with recurring observations (id 1,2,3...) per year. I have thousands of variables of all types. Some rows (indicated by a variable to_interpolate == 1) need to have their numeric variables linearly interpolated (they are empty) based on values of the same id from previous and next years.

Since I can't name all variables, I created a varlist of numeric variables. Also, I do not want to recreate thousands of extra variables, so I need to replace the existing missing values.

What I did so far:

quietly ds, has(type numeric)
local varlist `r(varlist)'

sort id year
foreach var of local varlist {
   by id: ipolate `var' year replace(`var') if to_interpolate==1
}

No matter what I do, I get an error message:

factor variables and time-series operators not allowed
r(101);

My questions:

  1. Is replace() even proper syntax here? If not, how do I replace the existing variable values instead of creating new variables?
  2. If the error means that factor variables exist in my varlist, how do I detect them?
  3. If not, how do I get around this?

Thanks!

2
Your error message is possibly a result of your references to the local macros r(varlist) and var being written as, for example, 'var' where they should be written as `var'. The leftmost character is the so-called "left single quote", which appears on my keyboard in the upper left corner beneath the tilde character. (Technically it is the ASCII grave accent character.) - user4690969
The ipolate command does not include a replace() option. So do it in three commands: (1) use ipolate `var' ... to generate a new variable named temp; (2) replace `var' = temp; (3) drop temp. - user4690969
@WilliamLisowski (1) The left ` exists in my code, just not when transcribing it here; will fix. (2) How to detect a macro, or why it exists in a varlist that is only numeric? (3) But how do you do this with thousands of variables? - Yuval Spiegler

2 Answers

2
votes

As @William Lisowski underlines, ipolate has no replace() option. Whatever is not allowed by its syntax diagram is forbidden. In any case, keeping a copy of the original is surely to be commended as part of an audit trail.

sort id 
quietly ds, has(type numeric)

foreach var in `r(varlist)' {
   by id: ipolate `var' year, gen(`var'2) 
}
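If overwriting in place is really wanted, the three-step route from the comments (generate, replace, drop) can be wrapped in the same loop. What follows is only a sketch under assumptions: the toy data and the variable x are made up for illustration, and id, year and to_interpolate are taken out of the list so they are not interpolated themselves. Note that the if condition goes on replace, not on ipolate, because restricting ipolate to the flagged rows would exclude the known values it needs.

```stata
* Toy data (hypothetical values) so the sketch runs on its own
clear
input id year x to_interpolate
1 2000 10 0
1 2001  . 1
1 2002  . 1
1 2003 40 0
2 2000  5 0
2 2001  . 1
2 2002  9 0
end

* Collect numeric variables, then remove the keys and the flag from the list
quietly ds, has(type numeric)
local numvars `r(varlist)'
local skip id year to_interpolate
local numvars : list numvars - skip

sort id year
foreach var of local numvars {
    tempvar filled
    * Interpolate within id using all rows (no -if- here)
    quietly by id: ipolate `var' year, generate(`filled')
    * Overwrite only the flagged rows that are actually missing
    quietly replace `var' = `filled' if to_interpolate == 1 & missing(`var')
    drop `filled'
}

list, sepby(id)
```

Only one temporary variable exists at any time, so this should scale to thousands of variables without creating thousands of extra columns. If an audit trail matters, saving a copy of the dataset before the loop is a cheap alternative to keeping every original.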
0
votes

Ok, this is a workaround since I can't find a way to replace values with ipolate that is feasible for thousands of variables:

quietly ds, has(type double float long int)
local varlist `r(varlist)'

sort id year

foreach var of local varlist {
   quietly by id: replace `var' = (`var'[_n-1] + `var'[_n+1])/2 if to_interpolate==1
}

This takes the midpoint of the previous and next year, which equals linear interpolation for single-year gaps but leaves gaps of two or more consecutive years missing. For my purposes it is enough. I will be very happy to see a better solution :)
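To make the limitation concrete, here is a small sketch on made-up data (the values and the variable x are hypothetical). A single-year gap is filled, while a two-year gap stays missing because each missing row has a missing neighbour and any arithmetic involving a missing value returns missing.

```stata
* Toy data (hypothetical values): one single-year gap, one two-year gap
clear
input id year x to_interpolate
1 2000 10 0
1 2001  . 1
1 2002 30 0
1 2003  . 1
1 2004  . 1
1 2005 60 0
end

sort id year
* Midpoint of the neighbouring years, within id
quietly by id: replace x = (x[_n-1] + x[_n+1])/2 if to_interpolate == 1

* 2001 is filled; 2003 and 2004 remain missing
list, sepby(id)
```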