0
votes

I have a dataset with a variable, start date, which takes a lot of different values using a character format.

I have split the dataset up using the values in the start date column, because I want to transform the date variable into two new columns with two different date values:

if length(Startvalue) = 6 then output  a6;
else if length(Startvalue) = 8 then output a8 ;
else if length(Startvalue) = 1 then output a1;
else output a_other;

One value is written like this: 090209 (DDMMYY)

I want to format the variable into both a date9. format and a DDMMYYD10. format like this:

FORMAT  Startvalue2 date9.;
format Startvalue3 DDMMYYD10.;

I got it to work with this approach in a data step:

FORMAT  Startdato2 date9.;
format startdato3 DDMMYYD10.;
Startdato2 = INPUT(PUT(Startdato,6.),DDMMYY6.);
Startdato3 = INPUT(PUT(Startdato,6.),DDMMYY6.);

Another value is written like this: 15-08-17 (DD-MM-YY) I also want the two formats on this value, like this:

FORMAT  Startvalue2 date9.;
format Startvalue3 DDMMYYD10.;

BUT here I cannot use a copy of my expression from above:

FORMAT  Startvalue2 date9.;
format Startvalue3 DDMMYYD10.;
Startvalue2 = INPUT(PUT(Startvalue,8.),DDMMYYDw.);
Startvalue3= INPUT(PUT(Startvalue,8.),DDMMYYDw.);

Do you know why? And how I can get the value transformed into a date9. format and a DDMMYYD10 format?

Kind regards

Maria

1

1 Answers

1
votes

You do not need to convert the values to numeric with a put statement when you are using input. input's goal is to take a character input and turn it into a numeric value for SAS to do math with. How I always remember it:

  • Input: Convert an input value into a numeric value
  • Put: Output a numeric value as a character value

A simpler solution you can try is using the anydtdte. informat. It is capable of reading any the following informats:

  • Date
  • Datetime
  • DDMMYY
  • JULIAN
  • MDYAMPM
  • MMDDYY
  • MMxYY
  • MONYY
  • TIME
  • YMDDTTM
  • YYMMDD
  • YYQ
  • YYxMM
  • Month, Day, Year

For example:

%let default_datestyle = %sysfunc(getoption(datestyle));

options datestyle=DMY;

data want;
    set have;

    Startdato2  = INPUT(Startdato, anydtdte.);
    Startvalue2 = INPUT(Startvalue, anydtdte.);

    Startdato3  = Startdato2;
    Startvalue3 = Startvalue2;

    <rest of code>;

    format Startvalue2 date9.
           Startvalue3 ddmmyyd10.
           Startdato2  date9.
           Startdato3  ddmmyyd10.
    ;
run;

options datestyle = &default_datestyle;

Give that informat a try and see if it is able to get everything. If not, you can account for those special cases with some conditional logic.