I have a binary data set with no delimiters and no fixed-length records. I know each record contains 22 bytes of data followed by an unknown number of 23-byte blocks, up to 50 blocks. The problem is that SAS reads only one line of 32,767 bytes, for a total of 728 observations, while I'm expecting about 2.7 million. How can I make this step read the input file to the end? I've already tried adding the OBS= and LRECL= options to the INFILE statement. Adding the END= option had no effect on the result.

DATA INFILE.MYDATA (drop= i);
INFILE "&Path./UGLYDATA" end=eof; 
INPUT
MY_KEY s370fPD9.
...
OCCURS s370fPD2.
@
;    
ARRAY   MyData{50}  MyData1-MyData50;
...
ARRAY   Filler{50} $ Filler1-Filler50;

DO I = 1 TO min(50,OCCURS);
INPUT
MyData{I}   s370fPD4.
...
Filler{I}   $ebcdic10.
@@
;
End;
RUN;

Relevant Log:

NOTE: 1 record was read from the infile "UGLYDATA".
      The minimum record length was 32767.
      The maximum record length was 32767.
      One or more lines were truncated.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set INFILE.MYDATA has 728 observations and 356 variables.
NOTE: Compressing data set INFILE.MYDATA decreased size by 47.06 percent. 
      Compressed is 9 pages; un-compressed would require 17 pages.
NOTE: DATA statement used (Total process time):
      real time           2.69 seconds
      user cpu time       0.02 seconds
      system cpu time     0.11 seconds
      memory              1890.40k
      OS Memory           10408.00k
      Timestamp           12/07/2021 05:17:34 PM
      Step Count                        1  Switch Count  0
      Page Faults                       3
      Page Reclaims                     1028
      Page Swaps                        0
      Voluntary Context Switches        272
      Involuntary Context Switches      1226
      Block Input Operations            309648
      Block Output Operations           2312
It looks like you are trying to read an IBM mainframe data file on a Unix machine. How did you get the file to the Unix machine? Is it just a pure binary stream of bytes? What type of file was it on the mainframe? - Tom

1 Answer

It sounds like the file does not consist of lines of text. Try using RECFM=N on your INFILE statement so that SAS does not look for a LINEFEED character (or a CARRIAGE RETURN and LINEFEED combination) to mark the end of each line.

INFILE "&Path./UGLYDATA" recfm=n end=eof; 
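
For context, here is a minimal sketch of how the whole step might look with RECFM=N, assuming the layout described in the question (a 22-byte header followed by OCCURS blocks of 23 bytes each). The fields elided with "..." in the question are not known, so they are stood in for by two hypothetical placeholder variables, _skip_hdr (11 bytes) and _skip_blk (9 bytes), sized only so that the widths add up to 22 and 23. The output name work.sketch is also made up. Replace the placeholders with the real fields before running this.

```sas
/* Sketch only: placeholder widths are assumptions, not the real layout. */
data work.sketch (drop=i _skip_hdr _skip_blk);
  /* RECFM=N treats the file as one byte stream with no line boundaries */
  infile "&Path./UGLYDATA" recfm=n end=eof;

  array MyData{50} MyData1-MyData50;
  /* Explicit length 10 so the $EBCDIC10. values are not cut to 8 bytes */
  array Filler{50} $ 10 Filler1-Filler50;

  /* Header: 9 + 11 (hypothetical placeholder) + 2 = 22 bytes */
  input MY_KEY    s370fpd9.
        _skip_hdr $char11.
        OCCURS    s370fpd2.
        @;

  /* Each block: 4 + 9 (hypothetical placeholder) + 10 = 23 bytes */
  do i = 1 to min(50, OCCURS);
    input MyData{i} s370fpd4.
          _skip_blk $char9.
          Filler{i} $ebcdic10.
          @;
  end;
run;
```

Each iteration of the data step then consumes one header plus its blocks and writes one observation, so the observation count should match the number of records if the assumed widths are right.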

If you are unsure what the file contains, run a simple data step to look at the first few hundred bytes and work it out from there. If any of the bytes in a "line" are not printable characters, the LIST statement will include the hex codes for the bytes under the lines when it writes to the SAS log.

data _null_;
  INFILE "&Path./UGLYDATA" recfm=f lrecl=100 obs=10 ;
  input;
  list;
run;