1
votes

I'd like to loop through the following data frame in order of the sum of the first 2 column values for each row, and then assign the third column value a number as a result of that.

Initial Table:

Col 1 Col 2 Col 3
20 0
5 0
20 0
0 10
20 0
10 0
20 40
15 0

The sums of columns 1 and 2 give:

20+0=20

5+0=5

20+0=20

0+10=10

20+0=20

10+0=10

20+40=60

15+0=15

Col 1 Col 2 Col 3
20 0 10
5 0 20
20 0 10
0 10 20
20 0 10
10 0 20
20 40 5
15 0 20

The 3 lowest sums get Col 3 value 20, the next 4 lowest get value 10, and the highest value gets 5.

2
What if your dataset has more or fewer rows than eight?Limey
and what if your 5 lowest sums all equal 10.. how do you handle ties?Wimpel

2 Answers

0
votes

This can be done using a single assignment rather than a loop, for example:

#Example data
df <- data.frame(col1 = c(20, 5, 20, 0, 21, 10, 20, 15), col2=c(0,0,0,10,0,0,40,0))

#Add dummy values
df$col3 <- NA

#Assign required values
df$col3[order(df$col1+df$col2)] <- rep(c(20,10,5), c(3,4,1))
df
#  col1 col2 col3
#1   20    0   10
#2    5    0   20
#3   20    0   10
#4    0   10   20
#5   21    0   10
#6   10    0   20
#7   20   40    5
#8   15    0   10
0
votes

Let's take the example you gave:

df <- data.frame(Col1 = c(20,5,20,0,20,10,20,15),
                Col2 = c(0,0,0,10,0,0,40,0))
colnames(df) <- c("Col 1", "Col 2")

We then can do this:

library(dplyr)
df <- df %>% 
  mutate(`Col 3` = `Col 1` + `Col 2`)

col3_values <- sort(df$`Col 3`)
  

df <- df  %>% 
  mutate(`Col 3` = case_when(`Col 3` <= col3_values[[3]] ~ 20,
                             `Col 3` > col3_values[[3]] & `Col 3` <= col3_values[[7]] ~ 10,
                             TRUE ~ 5))

Output:

  Col 1 Col 2 Col 3
1    20     0    10
2     5     0    20
3    20     0    10
4     0    10    20
5    20     0    10
6    10     0    20
7    20    40     5
8    15     0    10

Note that the last line isn't what you expected because the sum isn't one of the 3 smallest (you have a 5 and two 10 before).

But as Limey commented, this wont work if you have more than 8 rows. You will have to change the bounds where the given value is affected