0
votes

I have a particular problem. I have exported a csv file where I on some columns needed to put the data in quoation-marks because of leading zeros, and sometimes a long datanumber includes "E" in them on the export. Now I am trying to import the same file into SAS to see if my proc import-routine works.

When I import the file all of the data comes through, but are compressed into two columns(hence wrong with my delimiter?) when I actually exported 20 columns.

Not all columns are enclosed in quotation-marks, just a couple of them. An example of the data:

CustomerID  CustomerName Product  Price  BillingNR 

"01234"       Customer 1   Product1 Price1 "03541"     
"52465"       Customer 2   Product2 Price2 ""          
"23454"       Customer 3   Product3 Price3 "035411236952154589632154"

CustomerID and BillingNR are then enclosed in quotation marks.

How can I import this dataset when only some of the columns are enclosed in quotation marks while others arent? Or simply remove all double quotes from the when importing? Heres my code:

%macro import;

%if &exist= "Yes" %then %do;
    proc import
    datafile= "\\mypath\data.csv"
        DBMS=CSV
        out=Sales
        replace;
        getnames=YES;
    run;
%end;

%else %do;
%put Nothing happens;
%end;


%mend;

%lesInn;

The IF/ELSE-test is just another macro where i test if the file specified exists. I have tried to research different methods, and am still looking for similar problems, but nothing have seemed to work.

All answers much appreciated.

Toor

2
does the csv really look like in your example? Then you would have to add delimiter='|';. Or is this just a abstract sample?kl78
Sorry just abstract example. Removed the pipes from the exampleJaz
What is the question? Are you asking if PROC IMPORT is clever enough to figure out which columns are numeric and which are character?Tom

2 Answers

0
votes

CSV -> Comma Separated Values I don't see commas being used as your delimiters, but pipes.

Specify that your delimiter is a pipe, and increase the GUESSINGROWS option to a large number so it assigns the correct length and type.

Proc import ... DBMS = DLM Replace;
Delimiter='|'; 
GuessingRows=10000; 
....remaining options;
 Run;

I'm still not sure Proc Import will work. If it doesn't you'll need to write the data step code and make sure to specify the DSD option which will deal with the quotes.

Edit: Based on question edit, most accurate method is to read via a data step. As mentioned the DSD option will handle the quotes.

0
votes

If you read the file using the DSD option then SAS will automatically remove the quotes from around the values. Even quotes that are around values that do not need to be quoted, like most of your example data.

data want ;
  infile cards dsd truncover firstobs=2;
  length CustomerID $5 CustomerName $20 Product $20 Price $8  BillingNR $30 ;
  input CustomerID -- BillingNR ;
cards;
CustomerID,CustomerName,Product,Price,BillingNR
"01234",Customer 1,Product1,Price1,"03541"
"52465",Customer 2,Product2,Price2,""
"23454",Customer 3,Product3,Price3,"035411236952154589632154"
;

Will result in values like: enter image description here