1
votes

Elementary question:

I'm trying to subset a vector of a data frame based on a vector of dates that correspond with the vector that I wish to subset. Consider the following data frame as an example:

     Date    Time Axis1 Day Sum.A1.Daily
1 6/12/10 5:00:00    20   1           NA
2 6/12/10 5:01:00    40   1           NA
3 6/12/10 5:02:00    50   1           NA
4 6/13/10 5:03:00    10   2           NA
5 6/13/10 5:04:00    20   2           NA
6 6/13/10 5:05:00    30   2           NA

I want to fill the column to the right with the sum of values for each day. Basically, (1:3,5) should = 110, and (4:6,5) should = 60.

I know there are many ways to do this that are smarter/faster/better than what I'm attempting to do (e.g., my date variable is a factor split into "levels" that I don't know how to access), but I'm trying to build my skills from the ground up, and want to figure out how to:

  1. Take a subset of data$Axis1 that will only grab the values for the 1st day
  2. Take a subset of the values of data$Axis1 that will only grab the values for the 2nd day
  3. Sum the values for each day, and place them in column 5, overwriting the "NA"

I successfully performed a function similar to this to auto-fill-in the "Day" vector, which was originally full of "NA" values (below). But I'm getting stuck as I think about how to a) subset with dates, and b) sum while subsetting.

Thanks in advance for your help - also, let me know if my question could be clearer/I'm violating cardinal stackoverflow rules. I'm very new to R and the coding community in general; I appreciate your help!

dates <-c("6/12/10","6/13/10")
counts <- c(1:2)
x <- nrow(data)

for (i in 1:x) {
      for (j in 1:12) {
            if (data[i,1] == dates[j]) {
                  data[i,4] <- counts[j]
            }
      }
}
2

2 Answers

3
votes

Using ave :

transform(dat,Sum.A1.Daily=ave(dat$Axis1,dat$Date,FUN=sum))
     Date    Time Axis1 Day Sum.A1.Daily
1 6/12/10 5:00:00    20   1          110
2 6/12/10 5:01:00    40   1          110
3 6/12/10 5:02:00    50   1          110
4 6/13/10 5:03:00    10   2           60
5 6/13/10 5:04:00    20   2           60
6 6/13/10 5:05:00    30   2           60
1
votes

Another way would be using data.table

#Let's say df is your dataset
library(data.table)
dt = as.data.table(df)
dt = dt[, Sum.A1.Daily := sum(Axis1), by = Date]