My file looks like this:

"John","Smith","Blah, John B","1234 N Something St","New Orleans","Orleans","LA",70116,"555-555-5555","666-555-1234","[email protected]","http://www.something.com"
"John2","Smith2","Blah2, John2 B","4567 S Blah St","New Orleans2","Orleans2","LA2",70116,"777-555-5555","777-555-1234","[email protected]","http://www.something2.com"

The file is quite large, but I am showing only two lines here for simplicity.

My SAS code is:

data sample;
    infile '/folders/myfolders/samplefile2.csv' dsd dlm="," missover;
    input first_name$ last_name$ company_name$ address$ city$ county$ state$ zip$ phone1$ phone2$ email$ web$;
run;

proc print data=sample;
run;

The output I am getting is:

                  c
                  o
    f             m
    i     l       p
    r     a       a
    s     s       n        a
    t     t       y        d                 c                  p        p
    _     _       _        d                 o      s           h        h        e
    n     n       n        r        c        u      t           o        o        m
O   a     a       a        e        i        n      a    z      n        n        a        w
b   m     m       m        s        t        t      t    i      e        e        i        e
s   e     e       e        s        y        y      e    p      1        2        l        b

1 John  Smith  Blah, Jo 1234 N S New Orle Orleans  LA  70116 555-555- 666-555- jsmith@m http://w

My question is: why am I unable to read the data properly, and why is the second line not being read at all?

I've edited your question so that it no longer contains personal information. Please don't post other people's personal information to StackOverflow (or anywhere else on the internet for that matter). Take the time to create some dummy information instead. - Robert Penridge

1 Answer

The first obvious problem is that you are ignoring the delimiters by reading the first 15 characters into the FIRST_NAME variable. That will mess up the rest of the line.

You should use a list-style INPUT statement instead of a formatted-style one when reading from a delimited file. I also find that my programs are much clearer if I DEFINE my variables up front instead of forcing SAS to guess what I want based on how I first use them. So let's convert your program.

data sample;
  infile '/folders/myfolders/samplefile2.csv' dsd dlm="," TRUNCOVER;
  LENGTH first_name $15 last_name $8 company_name $8 
         address $8 city $8 county $8 state $8 zip $8 
         phone1 $8 phone2 $8 email $8 web $8
  ;
  input first_name -- web ;
run;

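Writing the lengths out like this also shows that many of your variables (like EMAIL and WEB) are defined as far too short for the values they need to hold. A character variable read with a bare $ in list input defaults to a length of 8, which is why values such as "Blah, Jo" and "http://w" in your output are cut off after eight characters.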
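
As a rough sketch, the same step with more generous lengths might look like the following. The specific lengths here are assumptions picked for illustration, not values derived from your full file, so size each one to the longest value you expect to see.

```sas
data sample;
  infile '/folders/myfolders/samplefile2.csv' dsd dlm="," truncover;
  /* Illustrative lengths only: adjust them to fit your real data */
  length first_name $20 last_name $20 company_name $50
         address $50 city $30 county $30 state $8 zip $10
         phone1 $15 phone2 $15 email $50 web $100
  ;
  /* All variables are already defined as character, so no $ is needed here */
  input first_name -- web;
run;
```

Because the LENGTH statement runs before INPUT, it fixes both the type and the order of the variables, which is what lets the first_name -- web variable list work.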

As for the second line, the most likely cause is that the file does not have the end-of-line characters SAS expects between lines. Since it looks like you are reading the file on Unix, the line terminator in the file is probably CR (carriage return, or '0D'x) instead of LF (linefeed, or '0A'x). Try adding TERMSTR=CR to your INFILE statement.
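
A minimal sketch of how you might check and then apply this, assuming the file really does use CR-only line endings (that is a guess until you look at the bytes). The RECFM=F, LRECL=100 and OBS=3 values in the first step are arbitrary choices for illustration, and the lengths in the second step are placeholders rather than values taken from your data.

```sas
/* Step 1: dump the start of the file to the log as fixed-length chunks.  */
/* LIST shows hex codes under the record when it contains unprintable     */
/* characters, so you can see whether '0D'x, '0A'x, or both are present.  */
data _null_;
  infile '/folders/myfolders/samplefile2.csv' recfm=f lrecl=100 obs=3;
  input;
  list;
run;

/* Step 2: if the lines end with '0D'x only, tell SAS to split on CR. */
data sample;
  infile '/folders/myfolders/samplefile2.csv' dsd dlm="," truncover termstr=cr;
  /* Placeholder lengths: adjust them to fit your real data */
  length first_name $20 last_name $20 company_name $50
         address $50 city $30 county $30 state $8 zip $10
         phone1 $15 phone2 $15 email $50 web $100
  ;
  input first_name -- web;
run;
```

If the log from the first step shows '0D'x followed by '0A'x instead, the file has Windows-style line endings and TERMSTR=CRLF would be the option to try.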