0
votes

I would like to fill the missing observation(s) with the values of the next cell and distribute it equally over the missing rows.

For example using data from below, I would fill value for 2004m1 and 2004m2 with 142 and also replace value for 2004m3 with 142, as we fill two missings (142 = 426/3). For 2005m7/m8 it would be 171 etc. I am able to fill the missings with revered sorting and carryforward, however I cannot figure out how to redistribute the values, especially that the number of rows that I try to fill can vary and it is not simple [_n+1].

My try to fill the values (but this does not redistribute):

carryforward value, gen(value_filled)

Example data set:

date_m  value
2005m12 56
2005m11 150
2005m10 190
2005m9  157
2005m8  342
2005m7  .
2005m6  181
2005m5  151
2005m4  107
2005m3  131
2005m2  247
2005m1  100
2004m12 77
2004m11 181
2004m10 132
2004m9  153
2004m8  380
2004m7  .
2004m6  174
2004m5  178
2004m4  104
2004m3  426
2004m2  .
2004m1  .

Expected result

date_m  value
2005m12 56
2005m11 150
2005m10 190
2005m9  157
2005m8  171
2005m7  171
2005m6  181
2005m5  151
2005m4  107
2005m3  131
2005m2  247
2005m1  100
2004m12 77
2004m11 181
2004m10 132
2004m9  153
2004m8  190
2004m7  190
2004m6  174
2004m5  178
2004m4  104
2004m3  142
2004m2  142
2004m1  142
1

1 Answers

1
votes

Thanks for your data example, which is helpful, but as detailed in the Stata tag wiki and on Statalist an example using dataex is even better. Date and time variables are especially awkward otherwise.

You allude to carryforward, which is from SSC and which many have found useful. Having written the FAQ on this accessible here my prejudice is that most such problems yield quickly and directly to sorting, subscripting and replace. Your problem is trickier than most in including a value to be divided after an unpredictable gap of missing values.

This works for your example and doesn't rule out a simpler solution.

* Example generated by -dataex-. To install: ssc install dataex
clear
input float date int mvalue
551  56
550 150
549 190
548 157
547 342
546   .
545 181
544 151
543 107
542 131
541 247
540 100
539  77
538 181
537 132
536 153
535 380
534   .
533 174
532 178
531 104
530 426
529   .
528   .
end
format %tm date

gsort -date 
gen copy = mvalue 
replace copy = copy[_n-1] if missing(copy) 

gen gap = missing(mvalue[_n+1]) | missing(mvalue)
replace gap = gap + gap[_n-1] if gap == 1 & _n > 1 
sort date 
replace gap = gap[_n-1] if inrange(gap[_n-1], 1, .)  & gap >= 1  

gen wanted = cond(gap, copy/gap, copy)

list , sepby(gap)


     +----------------------------------------+
     |    date   mvalue   copy   gap   wanted |
     |----------------------------------------|
  1. |  2004m1        .    426     3      142 |
  2. |  2004m2        .    426     3      142 |
  3. |  2004m3      426    426     3      142 |
     |----------------------------------------|
  4. |  2004m4      104    104     0      104 |
  5. |  2004m5      178    178     0      178 |
  6. |  2004m6      174    174     0      174 |
     |----------------------------------------|
  7. |  2004m7        .    380     2      190 |
  8. |  2004m8      380    380     2      190 |
     |----------------------------------------|
  9. |  2004m9      153    153     0      153 |
 10. | 2004m10      132    132     0      132 |
 11. | 2004m11      181    181     0      181 |
 12. | 2004m12       77     77     0       77 |
 13. |  2005m1      100    100     0      100 |
 14. |  2005m2      247    247     0      247 |
 15. |  2005m3      131    131     0      131 |
 16. |  2005m4      107    107     0      107 |
 17. |  2005m5      151    151     0      151 |
 18. |  2005m6      181    181     0      181 |
     |----------------------------------------|
 19. |  2005m7        .    342     2      171 |
 20. |  2005m8      342    342     2      171 |
     |----------------------------------------|
 21. |  2005m9      157    157     0      157 |
 22. | 2005m10      190    190     0      190 |
 23. | 2005m11      150    150     0      150 |
 24. | 2005m12       56     56     0       56 |
     +----------------------------------------+