2
votes

I'm importing a text file into SAS, using the code below:

proc import datafile="C:\Users\Desktop\data.txt" out=Indivs dbms=dlm replace;
    delimiter=';';
    getnames=yes;
run;

However, I get error messages in the log, and certain fields are populated with "." in place of the real data. I don't know what the problem is.

The error messages are:

Invalid data for DIPL in line 26 75-76.
Invalid data for DIPL in line 28 75-76.
Invalid data for DIPL in line 31 75-76.
Invalid data for DIPL in line 34 75-76.

A sample of the data is available here: http://m.uploadedit.com/b029/1392916373370.txt

3
Can you post your log? Also, there is a renegade apostrophe in your datafile= option. - Allan Bowe
OK, so it seems that the DIPL variable is causing the problems and is thus being filled with "."s. SAS reads it in as best12., but it's actually a $2. variable. Is this the cause of the problem? - user2568648
Most likely. Joe is correct: using the infile statement is a much better approach here. You can rip the code from the log (generated by the proc import) and adjust it to your needs. If you hold the ALT key whilst selecting, you can avoid the line numbers. - Allan Bowe
The data file you have posted now doesn't have semicolons in it, so I changed the delimiter in your code to a space and SAS read the text file just fine for me. I also tried replacing spaces with semicolons in the data and using your original code, and that also worked fine: DIPL was recognized as a character variable. Do you still have an issue? - cmjohns
OK, the data sample I previously uploaded had been processed in R, so I think that changed the variables to character as it should, which is why it works fine. However, with the original data set (telechargement.insee.fr/fichiersdetail/RP2010/txt/…), using the infile statement and changing DIPL to $2. just shifts the error elsewhere: I get other fields filled with "."s. - user2568648

3 Answers

5
votes

In most cases, don't use PROC IMPORT for delimited files; use data step input instead. You can use PROC IMPORT to generate initial code (it is written to your log), but most of the time you will want to make at least some changes. This sounds like one of those times.

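data want;
    /* firstobs=2 skips the header row; dsd treats consecutive delimiters as missing values */
    infile "blah.dat" dlm=';' dsd lrecl=32767 missover firstobs=2;
    /* Declare the informats explicitly so nothing is guessed as numeric */
    informat
        trans $1.
        triris $1.
        typc $6.
    ;
    input
        trans $
        triris $
        typc $
        /* ... rest of variables ... */
    ;
run;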

PROC IMPORT generates code just like this in your log, so you can use that as a starting point and then correct whatever is wrong: change a variable from numeric to character, add variables if it found too few (as apparently happened above), and so on.
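As a small illustration of why the explicit informat matters, here is a self-contained sketch that reads inline data instead of a file. The variable id and the three data rows are made up for the example; only the DIPL name and its $2. informat come from the question. DATALINES4 is used because the data itself contains semicolons.

```sas
/* Sketch with hypothetical inline data: id and the row values are invented. */
data demo;
    infile datalines4 dlm=';' dsd missover;
    /* Reading DIPL as character keeps non-numeric codes such as ZZ */
    informat id 8. dipl $2.;
    input id dipl $;
    datalines4;
1;11
2;ZZ
3;
;;;;
run;

proc print data=demo;
run;
```

If dipl were read with a numeric informat instead, the second row would be expected to trigger an "Invalid data for dipl" note and a missing value, which matches the symptom described in the question.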

1
votes

I copied the text file from your link, and ran your code (without the apostrophe):

proc import datafile="C:\temp\test.txt" out=Indivs dbms=dlm replace;
    delimiter=';';
    getnames=yes;
run;

And it worked fine despite the following:

Number of names found is less than number of variables found.

Result:

NOTE: WORK.INDIVS data set was successfully created.
NOTE: The data set WORK.INDIVS has 50 observations and 89 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           0.30 seconds
      cpu time            0.26 seconds
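
One possible reason a small sample imports cleanly while the full file does not: PROC IMPORT chooses each column's type by scanning only the first rows of the file (20 by default), so a column that looks numeric early on can contain character values further down. The sketch below is a hedged variant of the import from the question that raises that limit with the GUESSINGROWS statement; whether it resolves every affected column in the original file is an assumption, not something tested here.

```sas
/* Sketch: same import as in the question, but scanning more rows
   before PROC IMPORT decides on column types. */
proc import datafile="C:\Users\Desktop\data.txt" out=Indivs dbms=dlm replace;
    delimiter=';';
    getnames=yes;
    guessingrows=32767;  /* default is 20 rows */
run;
```

Scanning more rows makes the import slower on large files, and explicit data step input remains the more controllable option.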


0
votes

If the log contains the message "Number of names found is less than number of variables found.", then SAS creates new variables for the unnamed columns, and those variables have blank values.