I have a data file that looks like this:
001 Mayo Clinic 120 78 7 15
Patient has had a persistent cough for 3 weeks
023 Mayo Clinic 157 72 10 2
Patient complained of ear ache
064 HMC 201 59 . .
Patient left against medical advice
003 HMC 166 58 8 15
Patient placed on beta-blockers on 7/1/2006
I am finding the task of reading this into SAS to be basically impossible. And no, in this case, reformatting the data file is out of the question. So let me explain what you are looking at here:
Each subject has two lines of data. The first line is:
subject number / clinic / wt / hr / dx / sx (don't worry about what the numbers mean, that's irrelevant).
The second line is text, which is basically a note containing extra information referring to the subject whose data is laid out in the previous line. So, the lines:
001 Mayo Clinic 120 78 7 15
Patient has had a persistent cough for 3 weeks
Are for a SINGLE subject. Subject 001. These need to become a single row in a SAS data set. I am completely at a loss; because of the different lengths for the clinic names, and the number columns not being aligned, I can't figure out how to get SAS to read this. This is the closest I have been able to get:
data ClinData;
    infile "&wdir.clinic_data.txt";
    retain patno clinic weight hr dx sx exinfo;
    input patno clinic $1. @;
    if clinic='M' then
        input patno @5 clinic $11. weight hr dx sx / @1 exinfo $30.;
    else if clinic='H' then
        input patno @5 clinic $3. weight hr dx sx / @1 exinfo $30.;
run;
This prints as:
http://i61.tinypic.com/2uswl90.png
All of the numerical values are in the right place.
However, this has several problems.
First, the subject number ('patno') always shows up as a missing value. Why?
Second, the clinic is only represented by its first letter 'M' or 'H'. I can't get SAS to change the length of the clinic variable based on which clinic it is.
Third, the variable "exinfo" contains the notes about the patient. However, I can't get SAS to include the entire line. The longest I can make it is around 30 characters before the formatting goes haywire.
Any help? The SAS documentation is frustratingly poor for this type of input. None of the examples really match up with what I need, and it doesn't adequately explain how to use some of the options. I know I need to use column/line pointers, but the problem is that the columns aren't consistent from line to line. So no matter which pointer format I use, there will still be lines that don't come out right.
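One direction that might work, offered as an untested sketch rather than a known fix: load each first line into the input buffer with a null `input;`, then pick the fields apart with `scan`, counting the four numeric fields from the right so that the width of the clinic name stops mattering. This assumes the last four space-separated tokens on the first line are always weight, hr, dx and sx. The lengths `$20` and `$200` are guesses on my part, and `n` and `i` are just helper variables made up for the sketch.

```sas
data ClinData;
    /* truncover: do not skip to the next line when a note is shorter than 200 characters */
    infile "&wdir.clinic_data.txt" truncover;
    /* Assumption: 20 and 200 characters are long enough for clinic names and notes */
    length patno 8 clinic $20 weight hr dx sx 8 exinfo $200;

    /* Line 1: a null INPUT loads the record into _infile_ without assigning variables */
    input;
    n = countw(_infile_, ' ');

    /* The first word is the subject number */
    patno = input(scan(_infile_, 1, ' '), 8.);

    /* Negative counts make SCAN count words from the right-hand end; */
    /* a lone '.' is read as a numeric missing value                  */
    sx     = input(scan(_infile_, -1, ' '), 8.);
    dx     = input(scan(_infile_, -2, ' '), 8.);
    hr     = input(scan(_infile_, -3, ' '), 8.);
    weight = input(scan(_infile_, -4, ' '), 8.);

    /* Everything between the first word and the last four is the clinic name */
    clinic = ' ';
    do i = 2 to n - 4;
        clinic = catx(' ', clinic, scan(_infile_, i, ' '));
    end;

    /* Line 2: the next INPUT statement moves to the following record; read the note whole */
    input exinfo $200.;

    drop n i;
run;
```

If this behaves the way I expect, each pair of lines would become one observation, with "Mayo Clinic" and "HMC" both stored in full because the length of `clinic` is declared up front instead of being taken from the first informat that touches it.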