0
votes

I have two data frames that I want to combine based on if two values from two different columns in one data frame are in the other data frame's column

1
I basically need the freq column in data frame b by seeing if the full sql phrase contains the text in data frame A - aa710
I want to know if it is possible to merge these data frames as I have shown above. so not merging by coloumn name but if theres some way to detect if the table name and the column name exist in the full sql and then joining it together - aa710
i basically want to cbind based on table and column names to get everything in one data frame - aa710

1 Answers

3
votes

Here's a fuzzyjoin / stringr solution:

library(fuzzyjoin); library(stringr)
fuzzy_inner_join(df_b, df_a, 
                 by = c("full_sql" = "column_name"), 
                 match_fun = str_detect)

#                                                                        full_sql index          Table_name             column_name freq
#1 UPDATE DDADMIN.USER_MASTER SETTERMINATED_USER_FLAG=:C19, TERMINATION_DATE=:C21  3694 DDADMIN.USER_MASTER SETTERMINATED_USER_FLAG    2
#2 UPDATE DDADMIN.USER_MASTER SETTERMINATED_USER_FLAG=:C19, TERMINATION_DATE=:C21  3694 DDADMIN.USER_MASTER        TERMINATION_DATE    2
#3  UPDATE DDADMIN.USER_MASTER SETTERMINATED_USER_FLAG='N', TERMINATION_DATE=NULL  3695 DDADMIN.USER_MASTER SETTERMINATED_USER_FLAG    2
#4  UPDATE DDADMIN.USER_MASTER SETTERMINATED_USER_FLAG='N', TERMINATION_DATE=NULL  3695 DDADMIN.USER_MASTER        TERMINATION_DATE    2