There isn't a direct way to do this with a MERGE. This is one example where the SQL method is clearly superior to any SAS data step methods, as anything you do will take much more code and possibly more time.
However, depending on the data, a few other approaches may make sense, in particular the format merge.
If data1 is fairly small (even, say, millions of records), you can make a format out of it, like so:
data fmt_set;
    set data1;
    length label $8; *character label, long enough for 'NOMATCH';
    start=first; *set up the names correctly;
    end=last;
    label='MATCH';
    fmtname='DATA1F';
    output;
    if _n_=1 then do; *put out a hlo='o' line which is for unmatched lines;
        start=.; *both unnecessary but nice for clarity;
        end=.;
        label='NOMATCH';
        hlo='o';
        output;
    end;
run;

proc format cntlin=fmt_set; *import the dataset;
run;

data want;
    set data2;
    if put(value,DATA1F.)="MATCH"; *keep only data2 rows that fall inside a data1 range;
run;
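To try the steps above, you could first create a pair of toy inputs along these lines. The values are made up purely for illustration; only the variable names (first, last, value) come from the code above. Run this before the fmt_set step.

```sas
* toy inputs, values invented purely for illustration;
data data1;
    input first last;
    datalines;
1 5
10 20
;
run;

data data2;
    input value;
    datalines;
3
7
15
;
run;
```

With these inputs, the value 7 falls outside both ranges, so only the rows with 3 and 15 should pass the subsetting if.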
This is very fast to run unless data1 is extremely large (hundreds of millions of rows, on my system). It is faster than a data step merge if you include sort time, since this doesn't require a sort.
One major limitation is that this will only give you one row per data2 row. If that is what you want, this will work; if you want a data2 row repeated for every data1 range it matches, you can't do it this way.
If data1 may have overlapping rows (i.e., two rows whose start/end ranges overlap each other), you will also need to address this, since ranges in a format normally aren't allowed to overlap. You can set hlo="m" for every row and hlo="om" for the non-match row (which makes it a multilabel format), or you can resolve the overlaps.
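As a rough sketch of the "resolve the overlaps" option, one way is to sort data1 by range start and collapse overlapping or touching ranges into one before building the format. The names data1_sorted, data1_merged, cur_first and cur_last are hypothetical, and this assumes first and last are numeric with first <= last on every row.

```sas
proc sort data=data1 out=data1_sorted;
    by first last;
run;

data data1_merged;
    set data1_sorted end=eof;
    retain cur_first cur_last;
    if _n_=1 then do; *start the first running range;
        cur_first=first;
        cur_last=last;
    end;
    else if first<=cur_last then do; *overlaps or touches the running range, so extend it;
        cur_last=max(cur_last,last);
    end;
    else do; *gap found, so write out the finished range and start a new one;
        output;
        cur_first=first;
        cur_last=last;
    end;
    if eof then output; *write out the final running range;
    keep cur_first cur_last;
run;
```

The format-building step would then read data1_merged instead of data1, with start=cur_first and end=cur_last.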
I'd still do the sql join, however, since it's much shorter to code and much easier to read, unless you have performance issues, or it doesn't work the way you want it to.
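For comparison, the SQL version might look roughly like this. It is a sketch that assumes the same variable names as above (first and last on data1, value on data2); the output name want_sql is just a placeholder.

```sas
proc sql;
    create table want_sql as
    select d2.*
    from data2 as d2
    inner join data1 as d1
        on d2.value between d1.first and d1.last; /* range condition instead of an equijoin */
quit;
```

Unlike the format approach, this returns a data2 row once for every data1 range it falls into, so overlapping ranges produce repeats.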
data1 unique? – Jon Clements