1
votes

I'm trying to import some data from a .txt file into WPS (which works with SAS language). Everythings worked perfectly until I realized that for one column/variable, some of my observations were cut. For example in the column/variable Type, my 5 first observations in the original file were:

RAN
FG
MAJOFG
ARBA
ARBV

but in my data set in sas, here's the data I get:

RAN
FG
MAJ
ARB
ARB

So I thought about using a format during the import to make sas always take the $8. format. Can it work with proc import? Here is my code:

proc import out = sasuser.frais
  datafile="~/NAS/M/XXX.txt"
  dbms=dlm replace;
  delimiter=";";
  getnames=yes;
  datarow=2;
run;

I also tried to use data infile but for some reason it moves my lines of data (the beginning of the second line starts at the end of the first one, probably because most of the time my last 4 columns/variables are empty). And I have more than 20 variables, most of which are properly imported by sas (in terms of format I mean), so if I could avoid naming a format for each of them, it would be great!

3
You may find this issue helpful communities.sas.com/t5/Base-SAS-Programming/…John Doe
Thanks! It helped indeed!videorama17
Are you really using PROC IMPORT for a file with only one variable? Or are you just showing the one variable that is causing trouble?Tom

3 Answers

1
votes

Try using the GUESSINGROWS statement:

Proc IMPORT … ;
  …
  GUESSINGROWS = 100;
run;

When using a DATA step with an INPUT statement, and some rows are incomplete, your INFILE statement should specify the MISSOVER option. This will help you create a data set where each row corresponds to each row of the data file.

  INFILE … FIRSTOBS=2 DLM=';' DSD MISSOVER;
  length v1-v20 $30;
  INPUT v1-v20;

During the data step implicit loop step, any variables at the end of the data line without corresponding input values will be set to missing, and the input statement will not attempt to look for more values on the next line.

0
votes

You may be able to get acceptable results by tweaking the guessingrows parameter rather than specifying an informat, e.g.:

data _null_;
  file "%sysfunc(pathname(work))\test.txt";
  infile cards;
  input;
  put _infile_;
  cards;
RAN
FG
MAJOFG
ARBA
ARBV
;
run;

proc import out = example
  datafile="%sysfunc(pathname(work))\test.txt"
  dbms=dlm replace;
  delimiter=";";
  getnames=yes;
  datarow=2;
  guessingrows=3;
run;

If you change this from 3 to 2, proc import only looks at the first 2 data rows when guessing an appropriate length for the variable RAN and truncation occurs for subsequent rows with longer values.

0
votes

My problem is that I plan to use this importation code for multiple files and the order of the observations for type is random. So it's totally possible that for 10 or 100 row, the observations have a length of 3 and then one observation of 6. But even if I don't know the order, I know the differents values possible (with a max length of 6). So guessingtows wouldn't work in my case. But I found a way to correct my code with the link provided by John Doe (https://communities.sas.com/t5/Base-SAS-Programming/Text-File-Import-force-the-correct-Informat-and-format/td-p/281701) I took the data generated by the proc import and modified it for the one variable which needed changes. Thank you all!