2
votes

I have two data frames:

df1 <- data.frame(Values=c(0.01,0.05), row.names=c("X", "Y"))
df1
  Values
X   0.01
Y   0.05

df2 <-data.frame(c(0,1,1), c(1,0,0), c(1,1,1))
colnames(df2) <- c("X","Y","Z")

df2
  X Y Z
1 0 1 1
2 1 0 1
3 1 0 1

I wish to perform a rowwise operation on df2, where I multiply every column in df2 with its corresponding row in df1 and then perform a summation.

For example, for row 1 of df2, I wish to calculate:

df2 %>% rowwise %>% mutate(newVAL=(df1["X",]*df2[1,"X"])+(df1["Y",]*df2[1,"Y"]))

while excluding columns that don't match (rows in df1) or have NAs.

I have several thousands of rows in df1 and several thousands of rows and columns in df2.

Any help is much appreciated!!

PS. I have implemented this in Perl using hashes and was using the system() call to perform these calculations within an Rmarkdown document. In order to keep it completely reproducible, I am trying to redo it in R. Happy to share the Perl code if necessary.

Thanks.

3
For this particular case of element-wise multiplication and, then. summation, you could use tcrossprod(as.matrix(df2[, rownames(df1)]), t(df1))alexis_laz

3 Answers

3
votes

If I understand correctly, it looks like you need sweep.

df3 <- sweep(df2[, rownames(df1)], 2, t(df1), '*')
df3$total <- rowSums(df3)
3
votes

Here's an attempt in base R matching the rows to the columns between the two sets:

rowSums(
  sweep(df2,
        MARGIN=2,
        STATS=df1$Values[match(colnames(df2), rownames(df1))],
        FUN=`*`),
  na.rm=TRUE
)
#[1] 0.05 0.01 0.01
1
votes

We can also use rep to make the lengths same to multiply and then get the rowSums. It will be more efficient to use rep as it is faster

rowSums(df2[rownames(df1)] * rep(df1$Values, each = nrow(df2)))
#[1] 0.05 0.01 0.01

Or using the tidyrverse packages

library(dplyr)
library(purrr)
df2 %>% 
     select_(.dots = rownames(df1)) %>% 
     map2(df1$Values, `*`) %>%
     reduce(`+`)
#[1] 0.05 0.01 0.01

Update

If we need it as a column,

df2 %>% 
    select_(.dots = rownames(df1)) %>%
    map2(df1$Values, `*`) %>%
    reduce(`+`) %>%
    mutate(df2, total = .)
#  X Y Z total
#1 0 1 1  0.05
#2 1 0 1  0.01
#3 1 0 1  0.01