0
votes

I have a data frame that looks like iris, for example. I want to create another column called C1 that multiplies Sepal.Length by 2.5 for setosa, by 3.5 for versicolor, and by 4.5 for virginica. Could someone help me with the code, please?

Expected output

Sepal.Length Sepal.Width Petal.Length Petal.Width    Species      C1
         5.1         3.5          1.4         0.2     setosa 5.1*2.5
         4.9         3.0          1.4         0.2     setosa 4.9*2.5
          ''          ''           ''          ''         ''      ''

         6.4         3.2          4.5         1.5 versicolor 6.4*3.5
         6.9         3.1          4.9         1.5 versicolor 6.9*3.5
          ''          ''           ''          ''         ''      ''

         7.1         3.0          5.9         2.1  virginica 7.1*4.5
         6.3         2.9          5.6         1.8  virginica 6.3*4.5
          ''          ''           ''          ''         ''      ''
2
If any of the answers below worked, could you mark one as accepted for future reference, please? - Aaron Gorman

2 Answers

1
votes

One option is to create a named vector of multipliers, index it by 'Species' to get one multiplier per row, and multiply that by 'Sepal.Length':

library(dplyr)
# setNames() is base R (stats), so no package beyond dplyr is needed here
iris <- iris %>%
  mutate(C1 = Sepal.Length *
           setNames(c(2.5, 3.5, 4.5),
                    c("setosa", "versicolor", "virginica"))[as.character(Species)])
head(iris, 3)
#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species    C1
#1          5.1         3.5          1.4         0.2  setosa 12.75
#2          4.9         3.0          1.4         0.2  setosa 12.25
#3          4.7         3.2          1.3         0.2  setosa 11.75
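
To see why the indexing works, here is a minimal sketch of the lookup step on its own, in base R. The names `mult` and `species` are only for illustration and are not part of the answer above. Indexing a named vector by a character vector returns the matching element once per entry, which is why `Species` is converted with `as.character()` first (indexing by a factor would use its integer codes instead of its labels).

```r
# Hypothetical name: one multiplier per species label
mult <- c(setosa = 2.5, versicolor = 3.5, virginica = 4.5)

# A few species labels in arbitrary order, standing in for the Species column
species <- c("setosa", "virginica", "versicolor", "setosa")

# Look up each label by name; unname() drops the labels from the result
unname(mult[species])
# [1] 2.5 4.5 3.5 2.5
```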
1
votes

An alternative is to merge (join) a small lookup table onto the data, so that each row carries the multiplier (2.5, 3.5, or 4.5) for its species.

iris2 <- merge(iris,
               data.frame(Species=c("setosa", "versicolor", "virginica"), mult=c(2.5,3.5,4.5)),
               by = "Species")
head(iris2)
#   Species Sepal.Length Sepal.Width Petal.Length Petal.Width mult
# 1  setosa          5.1         3.5          1.4         0.2  2.5
# 2  setosa          4.9         3.0          1.4         0.2  2.5
# 3  setosa          4.7         3.2          1.3         0.2  2.5
# 4  setosa          4.6         3.1          1.5         0.2  2.5
# 5  setosa          5.0         3.6          1.4         0.2  2.5
# 6  setosa          5.4         3.9          1.7         0.4  2.5

From this, it's trivial to calculate:

head(iris2$mult * iris2$Sepal.Length, n = 10)
#  [1] 12.75 12.25 11.75 11.50 12.50 13.50 11.50 12.50 11.00 12.25

and store that in a column or elsewhere.
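
As a rough sketch of that last step, storing the product as the C1 column from the question might look like the following. The name `mults` for the lookup table is an assumption made for readability, and dropping the helper column afterwards is optional. Note that `merge()` sorts the result by the key, which leaves iris in its original order only because it is already grouped by Species.

```r
# Lookup table (hypothetical name `mults`): one multiplier per species
mults <- data.frame(Species = c("setosa", "versicolor", "virginica"),
                    mult = c(2.5, 3.5, 4.5))

# Attach the multiplier to every row by matching on Species
iris2 <- merge(iris, mults, by = "Species")

# Store the product as C1, then drop the helper column
iris2$C1 <- iris2$mult * iris2$Sepal.Length
iris2$mult <- NULL

head(iris2$C1, 3)
# [1] 12.75 12.25 11.75
```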