2
votes

The following SAS code is supposed to read from a dataset that contains a numeric variable called 'Radvalue'. Radvalue is the temperature of a radiator. If a radiator is switched off and its temperature then increases by 2 or more, that's a sign it has come on; if it is on and its temperature decreases by 2 or more, that's a sign it has gone off. Radstate is a new variable in the dataset that indicates, for every observation, whether the radiator is on or off, and it's this I'm trying to fill in automatically for the whole dataset. So I'm using the LAG function, initialising the first row (which doesn't have a dif_radvalue), and then applying the algorithm I just described from row 2 onwards. Any idea why the columns Radstate and l_radstate come out completely blank?

Thanks ever so much! Let me know if I haven't explained the problem clearly.

Data work.heating_algorithm_b;
 Input ID Radvalue; 
 Datalines; 
  1 15.38 
  2 15.38 
  3 20.79 
  4 33.47 
  5 37.03 
  6 40.45 
  7 40.45 
  8 40.96 
  9 39.44 
  10 31.41 
  11 26.49 
  12 23.06 
  13 21.75 
  14 20.16 
  15 19.23 
 ; 

DATA temp.heating_algorithm_c;
 SET temp.heating_algorithm_b;

 DIF_Radvalue = Radvalue - lag(Radvalue);

 l_Radstate = lag(Radstate);

 if missing(dif_radvalue) then  
  do;
   dif_radvalue = 0;
   radstate = "off"; 
  end;                            
 else if l_Radstate = "off"  &  DIF_Radvalue > 2    then Radstate = "on";
 else if l_Radstate = "on" &  DIF_Radvalue < -2  then  Radstate = "off";
 else  Radstate = l_Radstate;
run;

Post some sample input data, ideally with datalines or cards, to help solve logic problems with your code, if any. - Jay Corbett
Hi, thanks – here’s some sample data: Input ID Radvalue; Datalines; 1 15.38 2 15.38 3 20.79 4 33.47 5 37.03 6 40.45 7 40.45 8 40.96 9 39.44 10 31.41 11 26.49 12 23.06 13 21.75 14 20.16 15 19.23 ; - SAS_learner

2 Answers

0
votes

I have no SAS experience, but maybe you need a missing(l_Radstate) check to cover the first time through, maybe something like this:

if missing(l_Radstate) then
do; radstate = "off"; end; 

I think that would only be needed if Radvalue - lag(Radvalue) does not force DIF_Radvalue to be missing on the first row. If it does, I am not sure what would help.

0
votes

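You were applying the LAG function to a variable that exists only in the output data set (RADSTATE). Because RADSTATE is not read from the input data set and is not retained, it is reset to missing at the start of every iteration, so LAG(Radstate) only ever stores and returns missing values. I replaced the LAG on RADSTATE with a RETAIN, which carries the previous row's state forward. Also, you were right to keep the LAG function outside any conditional logic. Try the code below.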

Data work.heating_algorithm_b;
 Input ID Radvalue; 
 Datalines; 
  1 15.38 
  2 15.38 
  3 20.79 
  4 33.47 
  5 37.03 
  6 40.45 
  7 40.45 
  8 40.96 
  9 39.44 
  10 31.41 
  11 26.49 
  12 23.06 
  13 21.75 
  14 20.16 
  15 19.23 
 ; 

DATA work.heating_algorithm_c;
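 length radstate $3;   /* wide enough for "off" */
 retain radstate;      /* keep the previous row's state across iterations */
 SET work.heating_algorithm_b;

 /* Call LAG unconditionally so its queue is updated on every row */
 old_radvalue=lag(radvalue);

 if _n_=1 then do;
  /* First row: no previous temperature, so start in the "off" state */
  dif_radvalue=0;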
  radstate="off";
 end;
 else do;
  DIF_Radvalue = Radvalue-Old_Radvalue;

  if Radstate = "off"  &  DIF_Radvalue > 2    then Radstate = "on";
  else if Radstate = "on" &  DIF_Radvalue < -2  then  Radstate = "off";
  /* Else Radstate stays the same */
 end;
run;
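
To see why LAG on a variable that is assigned later in the step comes back blank, here is a minimal sketch that is separate from the heating data. The variable names (x, state, prev_state) are made up for illustration. It assumes only the standard DATA step behaviour that non-retained variables are reset to missing at the top of each iteration.

```sas
/* Illustration only: x, state and prev_state are hypothetical names */
data _null_;
  length state prev_state $3;
  input x;

  /* LAG queues the value state holds right now. state has just been  */
  /* reset to missing for this iteration, so a blank goes into the    */
  /* queue and a blank comes back out on the next row.                */
  prev_state = lag(state);

  /* This assignment happens after the LAG call, so LAG never sees it */
  state = "on";

  /* Write the values to the log for inspection */
  put _n_= x= prev_state= state=;
  datalines;
1
2
3
;
run;
```

In the log, prev_state should be blank on every row even though state is "on" on every row. Adding a retain state; statement before the LAG call lets the value assigned on one row survive into the next, which is the idea the RETAIN version above relies on.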