I have two dataframes, df1 and df2. I want to add a new column to df1 and set it to 0 in every row of df1 that also appears in df2. More specifically:
import pandas as pd
sample_data_1 = {'col1': ['80', '8080'], 'col2': ['0.0.0.0', '143.21.7.165']}
df1 = pd.DataFrame(data=sample_data_1)
sample_data_2 = {'col1': ['80', '8080', '1', '8888'], 'col2': ['0.0.0.0', '143.21.7.165', '1', '5.5.5.5'], 'col3': ['1', '2', '3', '4']}
df2 = pd.DataFrame(data=sample_data_2)
df1:
   col1          col2
0    80       0.0.0.0
1  8080  143.21.7.165

df2:
   col1          col2 col3
0    80       0.0.0.0    1
1  8080  143.21.7.165    2
2     1             1    3
3  8888       5.5.5.5    4
I would like to add a column to df1 and set those values to 0 where col1 and col2 in df1 match col1 and col2 in df2. The resultant dataframe should look like:
   col1          col2  score
0    80       0.0.0.0      0
1  8080  143.21.7.165      0
When the dataframes are the same size, I can do a straight comparison using .loc and logical ANDs, but when they have different shapes I get "unable to compare series" exceptions. Any thoughts?
Thanks for the help!
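
One possible direction, offered as a minimal sketch rather than a definitive answer: an element-wise comparison between Series with different indexes is what tends to raise in pandas, so a left merge on the two key columns with indicator=True may be a better fit than .loc with ANDs. The sketch below assumes that rows of df1 with no match in df2 should be left as NaN (the question does not say what they should be), and the names keys and merged are only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col1': ['80', '8080'],
                    'col2': ['0.0.0.0', '143.21.7.165']})
df2 = pd.DataFrame({'col1': ['80', '8080', '1', '8888'],
                    'col2': ['0.0.0.0', '143.21.7.165', '1', '5.5.5.5'],
                    'col3': ['1', '2', '3', '4']})

keys = ['col1', 'col2']  # illustrative name: the columns that must match

# Left merge keeps every row of df1 in its original order.
# Dropping duplicate key pairs from df2 first ensures no df1 row is repeated.
merged = df1.merge(df2[keys].drop_duplicates(), on=keys, how='left', indicator=True)

# '_merge' is 'both' where the (col1, col2) pair also exists in df2.
# Assumption: non-matching rows get NaN, so the column is float-typed.
df1['score'] = np.where(merged['_merge'] == 'both', 0, np.nan)

print(df1)
```

Because the merge compares key values rather than aligning on row labels, the two dataframes do not need to have the same shape. If a different default is wanted for non-matching rows, the np.nan argument can be swapped for that value.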