I have a data set with a start time, an end time, a stay time (End - Start), and a subject count for each row. I am trying to create a graph (line, bar, or histogram) that shows how many subjects were present at a specific time. The horizontal axis would show the time from 00:00 to 24:00, and the vertical axis would show the total number of subjects or the percentage.

Start    End    Subject    Stay
01:00    02:00    1        01:00
01:00    01:45    1        00:45
02:00    21:00    1        19:00
03:10    14:10    1        11:00

The data set is huge, and I am using SAS Enterprise Guide and Excel to create the graph.

I have tried PROC GPLOT, but it doesn't give me what I am looking for. I also tried line plots and stacked bar charts without success, and I tried creating a stacked bar chart in Excel as well. I am not sure whether there is an easier way to do this. This is the code I used in SAS EG:

PROC GPLOT DATA=Input;
PLOT Stay * start  /
AREAS=1
FRAME   VAXIS=AXIS1
HAXIS=AXIS2
;

RUN; QUIT;

Please help.

Thanks

1 Answer

You would need to transform your data so that each stay produces one row for every time period it covers (each hour, for example, if you want to show the number of people who were present at any point during that hour).

You can do something like this:

data want;
set have;
do time=intnx('Hour',start,0) to end by 3600; *start at top of current hour, increment by 1 hour (3600 seconds);
  output;
end;
run;

Then you can graph the time variable in a bar chart.

Your data may be a problem for some other approaches (such as SAS/ETS procedures) because the periods overlap: as posted, subject 1 has 4 stays that overlap heavily. If these stays are on different days, you might want to add a day marker to the subject to make them unique.
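As a rough sketch of the day marker idea, assuming your real data carries a SAS date variable (called visit_date here, a hypothetical name that is not in the sample you posted), you could build a combined key like this:

```sas
/* Assumption: HAVE contains a SAS date variable named visit_date (hypothetical). */
/* HAVE_KEYED and subject_day are illustrative names only.                        */
data have_keyed;
  set have;
  length subject_day $40;
  /* e.g. subject 1 on 15JAN2024 becomes "1_20240115" */
  subject_day = catx('_', subject, put(visit_date, yymmddn8.));
run;
```

You would then count subject_day instead of subject wherever the stays need to be treated as distinct.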

Example using your data:

data have;
input Start :time5. End :time5. Subject Stay :time5.;
format start end stay time5.;
datalines;
01:00    02:00    1        01:00
01:00    01:45    2        00:45
02:00    21:00    2        19:00
03:10    14:10    3        11:00
;;;;
run;

data want;
set have;
do hour_mark = intnx('Hour',start,0) to end by 3600;
 output;
end;
keep hour_mark subject;
format hour_mark time5.;
run;

proc sgplot data=want;
vbar hour_mark;
run;
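If you want the percentage rather than the raw count, one possible approach is to summarise WANT first and plot the summary. This is only a sketch: HOURLY, n_subjects and pct_subjects are names made up for illustration, and it assumes the HAVE and WANT data sets from the example above.

```sas
/* Count distinct subjects per hour and express that as a share of all subjects. */
proc sql;
  create table hourly as
  select hour_mark,
         count(distinct subject) as n_subjects,
         count(distinct subject) / (select count(distinct subject) from have)
           as pct_subjects format=percent8.1
  from want
  group by hour_mark;
quit;

/* One row per hour, so the default SUM statistic simply shows the stored value. */
proc sgplot data=hourly;
  vbar hour_mark / response=pct_subjects;
  yaxis label="Share of subjects present";
run;
```

Using count(distinct subject) means a subject with overlapping stays in the same hour is counted once for that hour; swap in response=n_subjects if you prefer the count on the vertical axis.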

You could run the same example using a more interesting dataset:

data have;
if _n_=1 then call streaminit(7);
do subject = 1 to 100;
    start=floor(rand('Uniform')*86000);*almost all day, but make sure we have a bit of room for end;
    end  =floor(rand('Uniform')*(86400-start))+start;
    stay=end-start;
    output;
end;
format start end stay time5.;
run;

and then use the same WANT and SGPLOT code.
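If hourly bars are too coarse, the same idea should work with a finer interval and a line instead of bars. The following is a sketch under the same assumptions as above (a HAVE data set with time-valued start and end); WANT15 and time_mark are illustrative names.

```sas
data want15;
  set have;
  /* start at the top of the current 15-minute interval, step by 900 seconds */
  do time_mark = intnx('minute15', start, 0) to end by 900;
    output;
  end;
  keep time_mark subject;
  format time_mark time5.;
run;

proc sgplot data=want15;
  vline time_mark;        /* number of rows in each 15-minute bin, drawn as a line */
  xaxis fitpolicy=thin;   /* thin the tick labels so the many bins stay readable */
run;
```

Like the hourly version, this counts rows, so overlapping stays for the same subject are counted more than once in a bin.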