Loop through data frame and match/populate rows with column values

Question

df1

...other columns...MON TUE WED THU FRI SAT SUN Total
                   8.5 8.5 8.5 8.5 8.5 6.0 0.0 0.0
                   9.0 9.0 9.0 9.0 9.0 6.0 6.0 0.0
                   6.0 7.0 7.0 7.0 7.0 5.0 0.0 0.0
                   ...about 1400 records/rows of data in df1

df2

Day Hours
FRI   0
SAT   0
SUN   0
MON   0
TUE   0
WED   0
THU   0
FRI   0
SAT   0
SUN   0
MON   0
TUE   0
WED   0
...Will keep going until end (28-31 days worth will be listed - all days in a month)

Step 1

I need to loop through all lines in df2, matching all records in column 'Day' with df1 columns MON-SUN...adding the corresponding hours in df1 to the 'Hours' column in df2...using only the first row in df1 until all of df2 has been looped through and filled...Output example below

df2 output

Day   Hours
FRI   8.5
SAT   6.0
SUN   0
MON   8.5
TUE   8.5
WED   8.5
THU   8.5
FRI   8.5
SAT   6.0
SUN   0
MON   8.5
TUE   8.5
WED   8.5
...will continue all the way until last row until all data is filled from 1st row in df1 (repeating itself, just matching the right values)

2nd step

After df2 has been looped and filled - then sum of 'Hours' column in df2 and place in df1 'total' column

df1 output

...other columns...MON TUE WED THU FRI SAT SUN Total
                   8.5 8.5 8.5 8.5 8.5 6.0 0.0 88.5
                   9.0 9.0 9.0 9.0 9.0 6.0 6.0
                   6.0 7.0 7.0 7.0 7.0 5.0 0.0

This then repeats until all rows in df1 has been looped and has gone through the same process...so probably a double loop of some sort is needed with a match function. I am struggling to find any solutions to this

Code used

row_df1 <- 1
row_df2 <- 1

for (row_df2 in seq(1,nrow(Calendar$Jan))) {
 for (day in week) {
  if (Calendar$Jan[row_df2, 'Day'] == day) {
   Calendar$Jan[row_df2,'Hours'] <- Calctable[row_df1,day]
   row_df2 <- row_df2 + 1
    }
  }
 }

Error Message

 Error in for (day in week) { : invalid for() loop sequence

Many thanks

randomchars42 randomchars42 · Accepted Answer · 2021-01-16T16:24:58

library(dplyr)

df1 <- data.frame(
  MON = c(1,2,3),
  TUE = c(5,6,7),
  WED = c(8,9,10),
  THU = c(11,12,13),
  FRI = c(14,15,16),
  SAT = c(17,18,19),
  SUN = c(20,21,22))

df2 <- data.frame(
  Day = c('THU', 'FRI', 'SAT', 'SUN', 'MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN', 'MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN'),
  Hours = 0
)

Example df1: (sorry, I didn't take the time to recreate you exact data, please follow through)

    MON   TUE   WED   THU   FRI   SAT   SUN
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     5     8    11    14    17    20
2     2     6     9    12    15    18    21
3     3     7    10    13    16    19    22

Example df2:

   Day   Hours
   <chr> <dbl>
 1 THU       0
 2 FRI       0
 3 SAT       0
 4 SUN       0
 5 MON       0
 6 TUE       0
 7 WED       0
 8 THU       0
 9 FRI       0
10 SAT       0
11 SUN       0
12 MON       0
13 TUE       0
14 WED       0
15 THU       0
16 FRI       0
17 SAT       0
18 SUN       0

Step 1: This should be the algorithm you`re looking for to sort df2 into df1 in the way you described it.

row_df2 <- 1

for (row_df1 in seq(1,nrow(df1))) {
  for (day in c('MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN'))
    if (df2[row_df2, 'Day'] == day) {
      df2[row_df2,'Hours'] <- df1[row_df1,day]
      row_df2 <- row_df2 + 1
    }
}

Step 2: now you could sum up the values in df1, e.g. using dplyr:

df1 <- df1 %>%
  mutate(
    Sum = MON + TUE + WED + THU + FRI + SAT + SUN
  )

df1:

# A tibble: 3 x 8
    MON   TUE   WED   THU   FRI   SAT   SUN   Sum
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     5     8    11    14    17    20    76
2     2     6     9    12    15    18    21    83
3     3     7    10    13    16    19    22    90

df2:

# A tibble: 18 x 2
   Day   Hours
   <chr> <dbl>
 1 THU      11 <- row 1: THU
 2 FRI      14 <- row 1: FRI
 3 SAT      17 <- ...
 4 SUN      20
 5 MON       2 <- row 2: MON
 6 TUE       6 <- ....
 7 WED       9
 8 THU      12
 9 FRI      15
10 SAT      18
11 SUN      21 <- row 2: SUN
12 MON       3 <- row 3: MON
13 TUE       7
14 WED      10
15 THU      13
16 FRI      16
17 SAT      19
18 SUN      22

Is there no identifier like Date in both tables? This would make it much more robust. You could then match by date without relying on the right day to start with.

Edit 1: Updated after testing and removal of some errors.

Edit 2: Highlighted which value from df1 will land in df2. I just used different example data than you (I didn't want to type it all in).

Edit 3: Used data.frame instead of tibble in example data to demonstrate it should work as well.

Edit 4: Is this what you want?

row_df1 <- 1
row_df2 <- 1

for (row_df2 in seq(1,nrow(df2))) {
  for (day in week) {
    if (df2[row_df2, 'Day'] == day) {
      df2[row_df2,'Hours'] <- df1[row_df1,day]
      row_df2 <- row_df2 + 1
    }
}

df2

will lead to:

   Day Hours
1  THU    11 <- row 1: THU
2  FRI    14
3  SAT    17
4  SUN    20
5  MON     1
6  TUE     5
7  WED     8
8  THU    11 <- row 1: THU
9  FRI    14
10 SAT    17
11 SUN    20
12 MON     1
13 TUE     5
14 WED     8
15 THU    11 <- row 1: THU
16 FRI    14
17 SAT    17
18 SUN    20

Edit 5: Seems there is a { missing:

for (row_df2 in seq(1,nrow(Calendar$Jan))) {
 for (day in week) {                                       # <- HERE
  if (Calendar$Jan[row_df2, 'Day'] == day) {
   Calendar$Jan[row_df2,'Hours'] <- Calctable[row_df1,day]
   row_df2 <- row_df2 + 1
    }
  }

Edit 6:

In Edit 5 I assigned week <- c('MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN') but forgot to mention it. It should have looked like (no special built-in variables here):

week <- c('MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN')

for (row_df2 in seq(1,nrow(Calendar$Jan))) {
 for (day in week) {
  if (Calendar$Jan[row_df2, 'Day'] == day) {
   Calendar$Jan[row_df2,'Hours'] <- Calctable[row_df1,day]
   row_df2 <- row_df2 + 1
  }
 }
}

in case you re-use week at some other point in your code. I used it for testing the loop and mixed it up in the previous version of this answer.

Loop through data frame and match/populate rows with column values

1 Answers