0
votes

I have a data set with numeric data. The code is below:

data test;
    infile 'C:\Users\Public\Documents\Test\test.dat';
    input a1 a2 a3 a4 b1 b2 b3 b4;
run;

proc print data=test;
run;

When I run this I get the following error messages:

NOTE: Invalid data for a1 in line 1 1-51.
NOTE: Invalid data for a2 in line 2 1-50.
NOTE: Invalid data for a3 in line 3 1-50.
NOTE: Invalid data for a4 in line 4 1-50.
NOTE: Invalid data for b1 in line 5 1-51.
NOTE: Invalid data for b2 in line 6 1-51.
NOTE: Invalid data for b3 in line 7 1-51.
NOTE: Invalid data for b4 in line 8 1-51.
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-

8   CHAR  18.597.6.261.4.032.0.215.-0.099.32.580.36.430.1.038 51
    ZONE  332333032333032333032333023233303323330332333032333
    NUMR  18E59796E26194E03290E2159D0E099932E580936E43091E038

How do I fix this? Does this error message come up because the numbers have too many digits?

Added. Here are some sample lines from my data:

21.312 7.039 5.326 .932 -.030 35.239 36.991 1.057
21.206 6.979 5.237 .871 .015 35.713 36.851 1.064

Also here is another part of the error message:

NOTE: Invalid data errors for file ''C:\Users\Public\Test\test.dat'' occurred
      outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.

Can you give us a sample line or two of data from your file? - John Chrysostom

@JohnChrysostom: I added some sample lines. - user21478

Your code seems to be working just fine with those sample lines... Are you sure that the location of the file is correct? Also, can you grab some of the lines that are giving the invalid data? - John Chrysostom

@JohnChrysostom: Those are the lines from the data I want to read in. The code is not working for any of the sample lines. I also added some other parts of the error message. - user21478

I don't suppose you can drop the whole file somewhere? Is it large or sensitive? - John Chrysostom

2 Answers

3
votes

It looks like SAS is reading each line as a single value rather than as one observation with several variables, which suggests it isn't recognizing the delimiter. If the delimiter is a space, as in the sample lines you posted, your code should work. To make the space delimiter explicit, specify it on the INFILE statement:

data test;
    infile "C:\Users\Public\Documents\Test\test.dat" dlm=" ";
    input a1 a2 a3 a4 b1 b2 b3 b4;
run;

If the file is actually tab-delimited, use dlm='09'x instead.
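
The hex dump in the question points to the tab case: the character after 18.597 has ZONE 0 and NUMR 9, i.e. hex 09 (a tab), while the decimal points show up as 2E. Assuming the whole file is laid out that way, a minimal sketch of the tab-delimited version (same path and variable names as in the question) might look like this:

```sas
/* Sketch: read the file as tab-delimited.            */
/* '09'x is the SAS hex literal for the tab character. */
data test;
    infile 'C:\Users\Public\Documents\Test\test.dat' dlm='09'x;
    input a1 a2 a3 a4 b1 b2 b3 b4;
run;

proc print data=test;
run;
```

If the "Invalid data" notes disappear and PROC PRINT shows eight numeric columns per row, the delimiter was the problem.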

Let us know if that helps.

1
votes

The . shown as the delimiter is confusing it. SAS is trying to read 18.597.6.261 ... as a single number, which it isn't, and that causes the error. If possible, use a space as the delimiter and your INPUT statement will work.
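
To check what actually separates the values before changing the file, one option is to dump a couple of raw records to the log. The sketch below assumes the same path as in the question and reads only the first two records. When a record contains unprintable characters, LIST prints it with ZONE and NUMR lines, so the true delimiter byte can be read off directly (09 would be a tab, 20 a space, 2E a literal period).

```sas
/* Sketch: inspect the raw bytes of the first two records. */
data _null_;
    infile 'C:\Users\Public\Documents\Test\test.dat' obs=2;
    input;   /* load the record into the input buffer; no variables are read */
    list;    /* write the buffered record to the log                        */
run;
```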