1
votes

I have two dataframes and I want to "combine them by a certains conditions". My first dataframe have average Sms(avgSms) of day of week(1=Sunday, 2=Monday, 3=Tuesday...) , hour and min:

df1:
       I   | Day_of_week | hour | min |    AvgSms
       #1          1          0     0      12
       #2          1          0     30     20
       #3          1          1     0      19
       #4          1          1     30     11
       #5          1          2     0      12
       #6          1          2     30     7
       ...        ...       ...    ...   ....
       #10         2          0     0      14
       #11         2          0     30     20
       #12         2          1     0      19
       #13         2          1     30     11
       ...        ...       ...    ...   ....
       #222        7         23     30     13

The second dataframe have time and sms, something like this:

 df2:       Time           Sms
     1 2012-01-01 00:00:00  10
     2 2012-01-01 00:30:00  11
     3 2012-01-01 01:00:00  13
     4 2012-01-01 01:30:00  10
     5 2012-01-01 02:00:00  7
     6 2012-01-01 02:30:00  3
     7 2012-01-01 03:00:00  3
     8 2012-01-01 03:30:00  2
      .......................
    400 2015-12-31 23:30:00  16

I want to add to the dataframe 2, the respective avgSms of the dataframe1 depending on the variable time whether it is a second or third and x hour and y and min.

I want something like this:

         Time           Sms     avg
     1 2012-01-01 00:00:00  10  12 --> 2012-01-01 was Sunday=1, h=0 and min=0
     2 2012-01-01 00:30:00  11  20
     3 2012-01-01 01:00:00  13  19
     4 2012-01-01 01:30:00  10  11
     5 2012-01-01 02:00:00  7   ..
     6 2012-01-01 02:30:00  3
     7 2012-01-01 03:00:00  3
     8 2012-01-01 03:30:00  2
      .......................
    400 2015-12-31 23:30:00  16
1

1 Answers

0
votes

Consider merge after date/time formatting in df2 and use such fields as the by values:

Data

txt=' I   Day_of_week hour min     AvgSms
1          1          0     0      12
2          1          0     30     20
3          1          1     0      19
4          1          1     30     11
5          1          2     0      12
6          1          2     30     7
10         2          0     0      14
11         2          0     30     20
12         2          1     0      19
13         2          1     30     11
222        7         23     30     13'

df1 <- read.table(text=txt, header = TRUE)

txt = '
    Time           Sms
1 "2012-01-01 00:00:00"  10
2 "2012-01-01 00:30:00"  11
3 "2012-01-01 01:00:00"  13
4 "2012-01-01 01:30:00"  10
5 "2012-01-01 02:00:00"  7
6 "2012-01-01 02:30:00"  3
7 "2012-01-01 03:00:00"  3
8 "2012-01-01 03:30:00"  2
'
df2 <- read.table(text=txt, header = TRUE)

Date/Time Conversion

df2$Date <- as.Date(df2$Time, format="%Y-%m-%d")
df2$Time <- as.POSIXct(df2$Time)

df2$Day_of_week <- as.integer(strftime(df2$Date,format="%w")) + 1
df2$hour <- as.integer(strftime(df2$Time,format="%H"))
df2$min <- as.integer(strftime(df2$Time,format="%M"))

Merge

merge(df2, df1[-1], by=c("Day_of_week", "hour", "min"), all.x=TRUE)

#   Day_of_week hour min                Time Sms       Date AvgSms
# 1           1    0   0 2012-01-01 00:00:00  10 2012-01-01     12
# 2           1    0  30 2012-01-01 00:30:00  11 2012-01-01     20
# 3           1    1   0 2012-01-01 01:00:00  13 2012-01-01     19
# 4           1    1  30 2012-01-01 01:30:00  10 2012-01-01     11
# 5           1    2   0 2012-01-01 02:00:00   7 2012-01-01     12
# 6           1    2  30 2012-01-01 02:30:00   3 2012-01-01      7
# 7           1    3   0 2012-01-01 03:00:00   3 2012-01-01     NA
# 8           1    3  30 2012-01-01 03:30:00   2 2012-01-01     NA