Define a filter with variable in data step with do loop- SAS

Question

Good morning, I have this problem.

There are two datasets.

The first dataset, "ID Customer", contains this:

id       |  Customer Name   |
-----------------------------
123456   | Michael One      |
123123   | George Two       |
123789   | James Three      |

The second dataset, named "transaction", contains this:

id       |  Transaction | Date
-----------------------------------
123456   | Fuel         | 01NOV2018
123456   | Fuel         | 03NOV2018
123123   | Fuel         | 10NOV2018
123456   | Fuel         | 25NOV2018
123123   | Fuel         | 13NOV2018
123456   | Fuel         | 10DEC2018
123789   | Fuel         | 1NOV2018
123123   | Fuel         | 30NOV2018
123789   | Fuel         | 15DEC2018

The result I want is three datasets, one for each of the three customer IDs in the first dataset, named:

_01NOV2018_15NOV_123456_F
_01NOV2018_15NOV_123123_F
_01NOV2018_15NOV_123789_F

They should contain:

For _01NOV2018_15NOV_123456_F:

id       |  Transaction | Date
-----------------------------------
123456   | Fuel         | 01NOV2018
123456   | Fuel         | 03NOV2018

For _01NOV2018_15NOV_123123_F:

id       |  Transaction | Date
-----------------------------------
123123   | Fuel         | 10NOV2018
123123   | Fuel         | 13NOV2018

For _01NOV2018_15NOV_123789_F:

empty

I need to create a variable to use in a WHERE clause in a DATA step. How can I do this?

Thanks for the help! :)

Why do you want to create lots of little datasets? This is almost always a bad idea. Look into using by-group processing instead if you want to process each ID separately. — user667489
In 90% of use cases this is not needed. Why do you want to split the data set? There's often much simpler ways and if you find yourself building little macros all over the place to loop over your small data sets you know you're doing it incorrectly. — Reeza

Richard · Accepted Answer · 2019-01-31T14:47:07

The HASH OUTPUT method is the only way to create a dynamically named output dataset at DATA step runtime. Per the question comments, you most likely do NOT want to split your original dataset into lots of content-named pieces. Regardless, such a process is known in SAS as splitting.

You are far better served learning how to apply a WHERE statement and BY group processing in both DATA steps and PROC steps.
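
As a rough illustration of WHERE plus BY group processing in a DATA step, the sketch below counts each customer's transactions in the first half of November without creating one dataset per customer. The names demo_tx, tx_counts and n_tx are made up for this example, and the demo rows are just a subset of the question's data.

```sas
/* Hypothetical demo data shaped like the question's transaction table */
data demo_tx;
  input id transaction :$10. date :date9.;
  format date date9.;
  datalines;
123456 Fuel 01NOV2018
123456 Fuel 25NOV2018
123123 Fuel 10NOV2018
123123 Fuel 13NOV2018
;
run;

/* BY group processing requires the data to be sorted by the BY variable */
proc sort data=demo_tx;
  by id date;
run;

data tx_counts;
  set demo_tx;
  by id;
  /* WHERE filters rows before BY groups are formed */
  where '01NOV2018'd <= date <= '15NOV2018'd;
  if first.id then n_tx = 0;   /* reset the counter at the start of each id */
  n_tx + 1;                    /* sum statement, retained across rows */
  if last.id then output;      /* one summary row per id */
  keep id n_tx;
run;

proc print data=tx_counts;
run;
```

The date window lives in a single WHERE statement, so changing the period means editing one line rather than regenerating a set of datasets.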

The wanted output appears to be segregated or categorized based on month halves. You might be best served by computing a new semimonth variable containing an appropriate categorical value, and then using that downstream, such as in a PROC PRINT.
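
To make the idea concrete, here is a small sketch that maps a few sample dates to a semimonth category. It assumes the category is represented by the first day of each half (the 1st or the 16th); any constant offset that keeps the two halves distinct would serve equally well as a grouping value.

```sas
data _null_;
  do date = '03NOV2018'd, '15NOV2018'd, '16NOV2018'd, '30NOV2018'd, '10DEC2018'd;
    /* intnx(...,0) gives the 1st of the month; add 15 days when day > 15 */
    semimonth = intnx('month', date, 0) + 15 * (day(date) > 15);
    put date= date9. semimonth= date9.;
  end;
run;

/* Log output:
   date=03NOV2018 semimonth=01NOV2018
   date=15NOV2018 semimonth=01NOV2018
   date=16NOV2018 semimonth=16NOV2018
   date=30NOV2018 semimonth=16NOV2018
   date=10DEC2018 semimonth=01DEC2018
*/
```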

data customers;
infile cards dlm='|';
attrib
  id length=8
  name length=$20
;
input id name ;
datalines;
123456   | Michael One      |
123123   | George Two       |
123789   | James Three      |
run;

data transactions;
infile cards dlm='|';
attrib
  id length=8
  transaction length=$10
  date length=8 format=date9. informat=date9.
;
input id transaction date;
datalines;
123456   | Fuel         | 01NOV2018
123456   | Fuel         | 03NOV2018
123123   | Fuel         | 10NOV2018
123456   | Fuel         | 25NOV2018
123123   | Fuel         | 13NOV2018
123456   | Fuel         | 10DEC2018
123789   | Fuel         | 1NOV2018
123123   | Fuel         | 30NOV2018
123789   | Fuel         | 15DEC2018
run;

proc sort data=customers;
  by id;
run;

proc sort data=transactions;
  by id date;
run;

* merge datasets and compute semimonth;

data want;
  merge transactions customers;
  by id;

  * first of the month for days 1 to 15, the 16th for later days;
  semimonth = intnx('month',date,0) + 15 * (day(date) > 15);

  attrib semimonth
    format=date9.
    label="Semi-month"
  ;
run;


* process data by semimonth and id, restricting with where;

proc print data=want;
  by semimonth id;
  where semimonth = '01NOV2018'D;
run;
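
If you really do need one dataset per customer, the hash OUTPUT approach mentioned at the top can be sketched roughly as follows. This assumes the transactions dataset created above; tx_sorted and h are names invented for the sketch, and the date window and name pattern are hard-coded to match the question. Note that an id with no rows inside the window gets no dataset at all, rather than an empty one.

```sas
proc sort data=transactions out=tx_sorted;
  by id date;
run;

data _null_;
  /* DECLARE with a constructor runs on every DATA step iteration, */
  /* so each id starts with a fresh, empty hash object             */
  declare hash h(multidata: 'yes');
  h.defineKey('id');
  h.defineData('id', 'transaction', 'date');
  h.defineDone();

  /* load all qualifying rows for one id into the hash */
  do until (last.id);
    set tx_sorted(where=('01NOV2018'd <= date <= '15NOV2018'd));
    by id;
    h.add();
  end;

  /* the output dataset name is built at run time from the current id */
  h.output(dataset: cats('_01NOV2018_15NOV_', id, '_F'));
  h.delete();  /* release the hash before the next id */
run;
```

Each pass through the DATA step handles one id, which is why the dataset name can depend on a data value here, something an ordinary DATA statement cannot do.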
