I have two data frames in R with the same columns and data types. Some columns are text, some are numeric, and some are dates; each column holds the same kind of data in both data frames. The unique identifier is also the same in both, i.e., the primary keys match.
Now I want to create a third data frame that captures, for each primary key, the difference between the values in DF1 and DF2 for each corresponding column. When the column being compared is character, we can simply record 1 or 0 to indicate whether there is a difference. When it is numeric, we can capture the amount of the difference, or perhaps simply 1 or 0 again.
What's the most efficient way to do this in R? I do not want to do a row-by-row comparison, as it is slow. A column-by-column comparison would be fine, but that too seems to require too much manual oversight. Ideally, I am looking for a few data-frame-level functions that would help me do this.
Reproducible & editable example:
Dataframe1:
ID val1 date1 chrval1 val3
A1 400 3/4/2017 DR9912YS -43
A2 230 3/4/2017 ER9F4YS -43
A3 500 31/2/2015 FFR99S -49
Dataframe2:
ID val1 date1 chrval1 val3
A1 400 3/4/2017 DR9912YS -43
A2 400 3/4/2017 DR9912YS -43
A3 400 31/4/2017 DR9912YS -43
Ideally this is what I am looking for:
Difference Dataframe:
ID val1 date1 chrval1 val3
A1 0 0 True 0
A2 170 0 False 0
A3 -100 0/2/2 False 6
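To make the example above easier to paste and experiment with, here is a minimal base R sketch of one possible column-wise approach. It rests on a few assumptions that are not part of the question: the dates are kept as character strings (31/2/2015 and 31/4/2017 are not valid calendar dates, so they would not parse as Date), numeric differences are taken as DF2 minus DF1, and every non-numeric column is reduced to 1 (different) or 0 (same). The helper name col_diff is hypothetical.

```r
# Rebuild the two example data frames; dates stay as character strings
# because 31/2/2015 and 31/4/2017 are not valid calendar dates.
df1 <- data.frame(
  ID      = c("A1", "A2", "A3"),
  val1    = c(400, 230, 500),
  date1   = c("3/4/2017", "3/4/2017", "31/2/2015"),
  chrval1 = c("DR9912YS", "ER9F4YS", "FFR99S"),
  val3    = c(-43, -43, -49),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID      = c("A1", "A2", "A3"),
  val1    = c(400, 400, 400),
  date1   = c("3/4/2017", "3/4/2017", "31/4/2017"),
  chrval1 = c("DR9912YS", "DR9912YS", "DR9912YS"),
  val3    = c(-43, -43, -43),
  stringsAsFactors = FALSE
)

# Align df2 to df1 by primary key so rows line up even if the order differs.
df2 <- df2[match(df1$ID, df2$ID), ]

# Hypothetical helper: numeric columns give DF2 - DF1 (an assumed sign
# convention); all other columns give 1 where values differ, 0 where they match.
col_diff <- function(a, b) {
  if (is.numeric(a) && is.numeric(b)) b - a else as.integer(a != b)
}

# Apply the helper to every non-key column at once; no row-by-row loop.
value_cols <- setdiff(names(df1), "ID")
diff_df <- data.frame(
  ID = df1$ID,
  Map(col_diff, df1[value_cols], df2[value_cols]),
  stringsAsFactors = FALSE
)
print(diff_df)
```

Each column is compared in a single vectorised operation, so the cost scales with the number of columns rather than the number of rows. Note that a != b yields NA wherever either value is NA, so missing values would need separate handling.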
identical() might be of some help. Have a look here – DJJ
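As a rough illustration of what this comment may have in mind, base R's identical() returns a single TRUE or FALSE for a pair of objects, and it can be applied column by column with mapply(). The data frames a and b below are made-up placeholders, not the data from the question.

```r
# Two small hypothetical data frames with matching columns.
a <- data.frame(x = c(1, 2, 3), y = c("p", "q", "r"), stringsAsFactors = FALSE)
b <- data.frame(x = c(1, 2, 4), y = c("p", "q", "r"), stringsAsFactors = FALSE)

# identical() on the whole objects gives a single TRUE/FALSE.
whole_same <- identical(a, b)

# identical() per column gives one TRUE/FALSE for each column name.
cols_same <- mapply(identical, a, b)

print(whole_same)
print(cols_same)
```

This only says whether a column differs somewhere, not in which rows or by how much, so it is better suited as a quick check before a finer element-wise comparison.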