
I’m working with some raw data that has fixed column widths, but has all its records written into a single line (blame the data vendor, not me :-) ). I know how to use fixed column widths in the INPUT statement, and how to use @@ to read more than one observation per line, but I am having trouble when I try to do both.

As an example, here’s some code where the data has fixed column widths, but there is one line per record. This code works fine:

DATA test_1;
    INPUT alpha $ 1-5   beta $ 6-10   gamma 11-15 ;

    DATALINES;
a    f    1
ab   fg   12
abc  fgh  123
abcd fghi 1234
abcdefghij12345
    ;
RUN;

Now here’s the code for what I’m really trying to do – all the data is in one line, and I try to use the @@ notation:

DATA test_2;
    INPUT alpha $ 1-5    beta $ 6-10    gamma 11-15 @@;

    DATALINES;
a    f    1    ab   fg   12   abc  fgh  123  abcd fghi 1234 abcdefghij12345
    ;
RUN;

This fails because column positions are absolute: each pass reads columns 1-15, @@ holds the line, and the next pass reads columns 1-15 again, so the step never gets past the first record. Given what I understand of the semantics of @@, I can see why this happens.

Is there any way I can accomplish reading fixed column data from a single line; that is, make test_2 have the same content as test_1? Perhaps through some combination of symbols in the INPUT statement, or maybe resorting to another method (with file I/O functions, PROC IMPORT, etc.)?


1 Answer


Have you tried formatted input, that is, giving the width of each field as an informat instead of a column range?

For example:

DATA test_2;
    /* Formatted input: each informat moves the pointer 5 columns, so @@ resumes at the next record */
    INPUT alpha $5.   beta $5.   gamma 5. @@;

    DATALINES;
a    f    1    ab   fg   12   abc  fgh  123  abcd fghi 1234 abcdefghij12345
;
RUN;

From the SAS documentation:

Formatted input causes the pointer to move like that of column input to read a variable value. The pointer moves the length that is specified in the informat and stops at the next column.
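
If the data sits in an external file rather than in DATALINES, the same idea should carry over. Here is a minimal sketch, assuming a hypothetical file name vendor_data.txt whose single line holds back-to-back 15-character records and is no longer than 32767 characters (neither detail comes from the question):

```sas
DATA test_3;
    /* 'vendor_data.txt' is a placeholder path, not from the question. */
    /* LRECL= must be at least as long as the single line in the file. */
    INFILE 'vendor_data.txt' LRECL=32767;

    /* Formatted input: 5 + 5 + 5 columns per record, with @@ holding the line */
    INPUT alpha $5.   beta $5.   gamma 5. @@;
RUN;
```

If the file holds the same five records as the example, comparing test_3 with test_1 (for instance with PROC COMPARE) is a quick way to check that the widths line up.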