0
votes

Please bear with me as I try to get my problem across as clearly as possible.

I am trying to merge two variables, but I am getting an error that says "Invalid numeric data, DateLastContact='06/30/2005' , at line 1036 column 4." This message repeats for multiple observations.

I am importing an excel file which has two date columns. SAS reads one of the columns as numeric, the other as character (DateLastContact). I have tried to change DateLastContact into numeric in both SAS and excel and it's still coming back with an Invalid numeric data error when I try to merge. Here is my code:

PROC IMPORT DATAFILE = "C:\Users\bennetde\Documents\SAS\BCcombined3.xls"     
DBMS = XLS OUT = SASBC REPLACE;
RUN;

proc contents data = SASBC OUT = Check;
RUN;

PROC PRINT DATA = Check;
RUN;

PROC CONTENTS returns:

DateLastContact $ 30 0 $ 30 
DateLastFollowup_ContactOrDeath MMDDYY 10 0   0 

So I tried in SAS:

DATA SASBC;
SET SASBC;
char_DateLastContact = input(DateLastContact, MMDDYY10.);
RUN;

But it did not work. I have also tried: "char_DateLastContact = input(DateLastContact, 12.);

Here is my code:

PROC IMPORT DATAFILE = "C:\Users\bennetde\Documents\SAS\BCcombined3.xls" 
DBMS = XLS OUT = SASBC REPLACE;
RUN;
DATA SASBC;
SET SASBC;
drop DateLastContact DateLastFollowup_years;
if DateLastContact = . then char_DateLastContact =     
put(DateLastFollowup_years, 20.);
else char_DateLastContact = DateLastContact;
RUN;
proc print data = SASBC;
RUN;

Here is a sample of the error message after trying to change the variable from numeric to character:

1008  DATA SASBC;
1009  SET SASBC;
1010  char_DateLastContact = input(DateLastContact, MMDDYY10.);
1011  format DateLastContact;
1012  RUN;

NOTE: Invalid argument to function INPUT at line 1010 column 24.
(I can't give you the lines after this unless I change patient info, but 
this error message repeats for several observations.)

And here is the original error:

NOTE: Invalid numeric data, DateLastContact='09/21/2007' , at line 968 
column 4

Any help would be appreciated.

1

1 Answers

0
votes

So from your PROC CONTENTS output you have a character variable DateLastContact of length $30 that you want to convert to a date.

DateLastContact $ 30 0 $ 30
DateLastFollowup_ContactOrDeath MMDDYY 10 0 0

Since Excel did not consider DateLastContact to be a date then it is probable that there are values in there that are not going to convert. Also since the length is $30 instead of $10 that is all that would be needed for date strings in with four digit years and two delimiters you probably have some strange values in that field.

Your first attempt looks like the right approach, even if your names are a little backwards. Try converting the values and asking SAS to display the values that do not convert. If you want to suppress the error messages that SAS will normally generate you can use the ?? informat modifier. You can try using the ANYDTDTE informat and SAS can adjust to many different date formats.

data want ;
  set SASBC ;
  new_DateLastContact = input(DateLastContact,??ANYDTDTE30.);
  format new_DateLastContact mmddyy10. ;
  if DateLastContact ne ' ' and new_DateLastContact = . then put DateLastContact $char30.;
run;

If your values actually have the quotes around them then perhaps you can use the DEQUOTE() function to remove the quotes and then try converting the resulting string to a date using the INPUT() function.

  new_DateLastContact = input(dequote(strip(DateLastContact)),??ANYDTDTE30.);

If the values look like integers such as

42598 40987 39686 39685 40534 39939 39486 38708 39790

then it seems that the SAS has copied over the number that Excel stores for dates. SAS and Excel use different starting dates (and also differences about whether to count from 0 or 1 and whether 1900 was a leap year) so you can convert those values to SAS dates by just subtracting the right amount.

new_DateLastContact = input(DateLastContact,30.) - ('01JAN1960'd - '01JAN1900'd +2);