I have a dataset of patient diagnoses with one diagnosis code per line, so a patient with several diagnoses appears on multiple lines. Each patient has a unique patientID. I also have age, race, gender, and similar data on these patients.
How do I tell SAS, when using PROC FREQ, PROC LOGISTIC, PROC UNIVARIATE, etc., that lines with the same patientID belong to the same patient?
This is an example of what the data looks like:
patientID  diagnosis  age  gender  lab
1          15.02      65   M       positive
1          250.2      65   M       positive
2          348.2      23   M       negative
2          282.1      23   M       negative
3                     50   F       positive
I was given data on every patient who has had a certain lab test (regardless of whether the result was positive), along with all of their diagnoses, each of which appears on a separate line (and so is a separate observation to SAS). First, I need to exclude every patient with a negative lab result, which I plan to do with an IF statement. The lab determines whether the patient has disease X. Some patients, such as patient #3, have no diseases other than disease X.
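A minimal sketch of that filtering step, assuming the raw data sits in a dataset called diagnoses (a placeholder name, as is positives) and that lab is a character variable whose values are spelled exactly as in the example:

```sas
/* Keep only the rows of patients whose lab result is positive.   */
/* 'diagnoses' and 'positives' are placeholder dataset names.     */
data positives;
    set diagnoses;
    /* Subsetting IF: rows that fail the condition are not output. */
    if lab = 'positive';
run;
```

Because the lab value is repeated on every line for a patient, this drops all lines for lab-negative patients. If the stored values vary in case or padding, the condition could instead be written as lowcase(strip(lab)) = 'positive'.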
Analyses I would like to perform:
- Calculate the frequency of each disease using PROC FREQ.
- Characterize the age and race relationships for each diagnosis using the PROC FREQ chi-square test.
- Use PROC LOGISTIC to determine risk factors (age, race, gender, etc.) for developing an additional disease on top of disease X.
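One possible way to set these analyses up, offered only as a sketch and not as a confirmed approach: collapse the data to one row per patient with BY-group processing, then run the patient-level procedures on that table. It assumes a dataset called positives that already contains only lab-positive patients, with the variables from the example above. The names positives, patients, and any_other_dx are placeholders I made up, and race is left out only because it is not in the example columns.

```sas
/* Sort so that all rows for a patient are adjacent. */
proc sort data=positives;
    by patientID;
run;

/* Collapse to one row per patient and flag whether the patient */
/* has any recorded diagnosis besides disease X.                */
data patients;
    set positives;
    by patientID;
    retain any_other_dx;
    if first.patientID then any_other_dx = 0;
    if not missing(diagnosis) then any_other_dx = 1;
    if last.patientID then output;   /* one row per patient */
    keep patientID age gender any_other_dx;
run;

/* Frequency of each diagnosis: uses the one-row-per-diagnosis data. */
proc freq data=positives;
    tables diagnosis;
run;

/* Patient-level chi-square test (gender shown as an example). */
proc freq data=patients;
    tables gender*any_other_dx / chisq;
run;

/* Patient-level logistic model for having an additional disease. */
proc logistic data=patients;
    class gender / param=ref;
    model any_other_dx(event='1') = age gender;
run;
```

The idea is that diagnosis counts come from the long table, while anything that treats the patient as the unit of analysis uses the collapsed table, so each patient contributes exactly one observation.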
Thanks!