How do you check if a cell in one dataset exists in another dataset's variable/column?

Question

I have two datasets, call them A and B; here's what they'd look like:

A:

  Var1              Var2
"Hello21"           "ID1"
"Hello5"            "ID2"
"Hello30"           "ID3"
   .                "ID4"

B:

   Var1
"Hello30"
"Hello39"
"Hello41"

Of course, there are hundreds of thousands of observations instead of just three. I want to be able to take a look at A.var1 and see if each of the observations in A.var1 exist in B.var1. If it doesn't, I want to replace A.var1 with a blank. I also want to retain rows where A.var1 is missing.

Final output:

  Var1              Var2
   .                "ID1"
   .                "ID2"
"Hello30"           "ID3"
   .                "ID4"

Do you have any idea what kind of SQL query I can write to achieve this? I'll be writing this in a PROC SQL statement in SAS.

I realized what I asked was a silly question. I forgot to add the fact that there will be missing observations in table A, which I'll also want to keep as missing in the final output. I'll add edits to the question. — navkast

ah_hau ah_hau · Accepted Answer · 2014-09-03T01:19:39

SELECT B.Var1, A.Var2
FROM A LEFT OUTER JOIN B ON A.Var1 = B.Var1

This should get your desired result.

How do you check if a cell in one dataset exists in another dataset's variable/column?

A:

B:

Final output:

2 Answers