
I have two data frames (df1 and df2) in R. For the numeric columns only, I would like to compute the percentage difference, (df2 - df1) / df2 * 100, and store the result in a new data frame, df3. Non-numeric columns should be retained as they are.

df1 and df2 have the same structure and same column names.

df1:

colA   colB  colC  ...  colZ
mean     10    20       stringA
count    30    50       stringB

df2:

colA   colB  colC  ...  colZ
mean      5    25       stringA
count    60    50       stringB

df3:

colA   colB  colC  ...  colZ
mean   -100    20       stringA
count    50     0       stringB

I tried this, but it didn't work:

 df2[,2:3] = (df2[,2:3] - df1[,2:3])/df2[,2:3]

Could someone please help with this?

How are you expecting -50 in colB in df3? - 7kemZmani
@7kemZmani Oops, I will edit it. Sorry! - thecoder
@7kemZmani Any idea how I could approach this? Thank you. - thecoder
Please choose one language: R or Python. - John Zwinck
@JohnZwinck R would be my preference. - thecoder

1 Answer


We can subset the numeric columns and then perform the operation on that subset:

num_cols <- c("colB", "colC")
df3 <- (df2[num_cols] - df1[num_cols])/df2[num_cols] * 100
df3

#  colB colC
#1 -100   20
#2   50    0

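To keep the other, non-numeric columns, we can select them with setdiff and then bind them to the result with cbind. Note that this places the non-numeric columns first: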

cbind(df1[setdiff(names(df1), num_cols)], df3)

#   colA    colZ colB colC
#1  mean stringA -100   20
#2 count stringB   50    0
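
If the numeric columns are not known in advance, or the original column order should be kept, something like the following may work. This is only a sketch: it assumes the data frames contain just the four columns shown in the question (the columns elided by "..." are left out), and it rebuilds df1 and df2 by hand so the example runs on its own. It detects the numeric columns with sapply and is.numeric, then assigns the result into a copy of df2 instead of using cbind.

```r
# Example data, assuming only the four columns shown in the question
df1 <- data.frame(colA = c("mean", "count"),
                  colB = c(10, 30),
                  colC = c(20, 50),
                  colZ = c("stringA", "stringB"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(colA = c("mean", "count"),
                  colB = c(5, 60),
                  colC = c(25, 50),
                  colZ = c("stringA", "stringB"),
                  stringsAsFactors = FALSE)

# Detect the numeric columns instead of listing them by hand
num_cols <- names(df1)[sapply(df1, is.numeric)]

# Start from a copy of df2 so the non-numeric columns and the
# original column order are kept, then overwrite the numeric columns
df3 <- df2
df3[num_cols] <- (df2[num_cols] - df1[num_cols]) / df2[num_cols] * 100
df3

#    colA colB colC    colZ
# 1  mean -100   20 stringA
# 2 count   50    0 stringB
```

One thing to keep in mind: if a numeric column was read in as character or factor, is.numeric will skip it, so it may be worth checking str(df1) first. A zero in df2 will also give Inf or NaN for that cell.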