0
votes

I am using 64-bit SAS 9.4 to import a file from 32-bit Excel. Many of my Excel columns contain both dates and lists of dates, like this:

SeptemberDatesCol
9/13 & 9/27
9/13 & 9/26
9/01 & 9/10 & 9/21
9/23
9/30
9/30
9/28

I'm not the owner of the file and it is updated regularly, so all major manipulation and formatting needs to occur in SAS. After import, I'm able to change it into a normal list of one date per row, but the import itself is not importing the dates correctly. My import code:

PROC IMPORT OUT=Raw_Import  DATAFILE= "J:\filename.xlsx"  DBMS=XLSX REPLACE;  
SHEET="Sheetname";  
GETNAMES=YES;     
RUN;

SeptemberDatesCol imports as a character variable (as desired), except that for some of the rows with a single date, the date is imported with Excel's raw unformatted date convention values:

SeptemberDatesCol
9/13 & 9/27
9/13 & 9/26
9/01 & 9/10 & 9/21
9/23
42643
42643
42641

What can I do differently to get the column to import properly? When I use DBMS=EXCELCS, I obtain a 'failed to connect to the server' error. DBMS=EXCEL does not function given my work's software setup (64-bit SAS 9.4 with 32-bit Excel). Thank you in advance for any advice you can provide.

1

1 Answers

0
votes

When you import a column in an XLSX file that has mixed text and numbers then SAS will import the numbers in raw format. You can convert the strings that look like raw dates from Excel's base date to SAS's base date by using simple arithmetic. So if you subtract the difference in days between the base date both systems use and also subtract one because SAS counts from 0 and Excel counts from 1 and also subtract one because Excel thinks 1900 was a leap year you get.

sasdt = exceldt - ('01JAN1960'd - '01JAN1900'd +2) ;

But since '01JAN1960'd is just zero you could simplify to

sasdt = exceldt + '01JAN1900'd - 2 ;

Now you just need a little logic to tell the difference between the numbers and the strings. For example you could test if the string converts into a number between 1 ('01JAN1900'd) and 43,100 ('31DEC2017'd) and if it does convert it to a string in MM/DD/YYYY format.

if 0 <= input(SeptemberDatesCol,??32.) <= 43100 then
  SeptemberDatesCol = put(input(SeptemberDatesCol,??32.),mmddyys10.) 
;

So your example data will get transformed into:

52   data have ;
53     input SeptemberDatesCol $30. ;
54     put SeptemberDatesCol= @ ;
55     if 0 <= input(SeptemberDatesCol,??32.) <= 43100 then
56         SeptemberDatesCol = put(input(SeptemberDatesCol,??32.)+'01JAN1900'd-2,mmddyys10.)
57     ;
58     put '-> ' SeptemberDatesCol ;
59
60   cards;

SeptemberDatesCol=9/13 & 9/27 -> 9/13 & 9/27
SeptemberDatesCol=9/13 & 9/26 -> 9/13 & 9/26
SeptemberDatesCol=9/01 & 9/10 & 9/21 -> 9/01 & 9/10 & 9/21
SeptemberDatesCol=9/23 -> 9/23
SeptemberDatesCol=42643 -> 09/30/2016
SeptemberDatesCol=42643 -> 09/30/2016
SeptemberDatesCol=42641 -> 09/28/2016