0
votes

I have data collected over multiple days, with timestamps that contain information for when food was eaten. Example dataframes:

head(Day3)

==================================================================
Day3.time        Day3.Pellet_Count
1  18:05:30                 1
2  18:06:03                 2
3  18:06:34                 3
4  18:06:40                 4
5  18:06:52                 5
6  18:07:03                 6

head(Day4)

==================================================================
Day4.time Day4.Pellet_Count
1  18:00:21                 1
2  18:01:34                 2
3  18:02:22                 3
4  18:03:35                 4
5  18:03:54                 5
6  18:05:06                 6

Because feeding times vary from day to day, the timestamps don't line up across days and therefore aren't matched. I've done a "full join" of all the data from two of the days with merge, in the following way:

pellets <- merge(Day3, Day4, by = 'time', all=TRUE)

This results in the following:

head(pellets)

==================================================================
pellets.time         pellets.Pellet_Count.x   pellets.Pellet_Count.y
1     02:40:18                     39                     NA
2     18:00:21                     NA                      1
3     18:01:34                     NA                      2
4     18:02:22                     NA                      3
5     18:03:35                     NA                      4
6     18:03:54                     NA                      5

I would like to plot the Pellet_Count in one line graph from each of the days, but this is making it very difficult to group the data. My approach thus far has been:

pelletday <- ggplot() + geom_line(data=pellets, aes(x=time, y=Pellet_Count.x)) + geom_line(data=pellets, aes(x=time, y=Pellet_Count.y))

But, I get this error:

geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?

I also would like to be able to merge all days (I oftentimes have up to 9 days) and plot it on the same graph.

I believe my goal is to ultimately get the following dataframe output:

==================================================================
pellets.time                 Pellet_Count                Day
1     02:40:18                     39                     3
2     18:00:21                     1                      4
3     18:01:34                     2                      4
4     18:02:22                     3                      4
5     18:03:35                     4                      4
6     18:03:54                     5                      4

and to use this to graph:

ggplot(pellets, aes(time, Pellet_Count, group=Day))

Any ideas?

2
Please provide some sample data. - markhogue
Sorry, am a newbie and accidentally uploaded before I finished! - neurofood
If you follow these tips for data sharing, you'll get better answers: stackoverflow.com/questions/5963269/… You need to do a better job of combining your data. Something like cbind would probably work. But you need to do it in a way where you add a day variable. Then geom_line() will work if you edit the aes above with color = day and/or linetype = day. - markhogue
I meant rbind(), not cbind(). - markhogue

2 Answers

1
votes

There are a couple of issues here.

Firstly, have you tried using rbind() or bind_rows() rather than merge()?

This seems like a more natural fit for what you're trying to do. With a merge or some other join, you are effectively trying to bring new information into your data table, most often new columns.

But here you are really trying to append each day's data to the others; you're not actually adding a new column.

So this is my attempt at replicating what you're describing above:

library(dplyr)  # provides tibble(), %>%, mutate(), rename() and bind_rows()

Day3 <- tibble(
        Day3.time = c('18:05:30', '18:06:03', '18:06:34', 
                      '18:06:40', '18:06:52', '18:07:03'),
        Day3.Pellet_Count = c(1, 2, 3, 4, 5, 6)) %>% 
        mutate(day = '3') %>%
        rename(time = Day3.time)


Day4 <- tibble(
        Day4.time = c('18:00:21', '18:01:34', '18:02:22', 
                      '18:03:35', '18:03:54', '18:05:06'),
        Day4.Pellet_Count = c(1, 2, 3, 4, 5, 6)) %>% 
        mutate(day = '4') %>%
        rename(time = Day4.time)



pellets <- merge(Day3, Day4, by = 'time', all=TRUE)

       time Day3.Pellet_Count day.x Day4.Pellet_Count day.y
1  18:00:21                NA  <NA>                 1     4
2  18:01:34                NA  <NA>                 2     4
3  18:02:22                NA  <NA>                 3     4
4  18:03:35                NA  <NA>                 4     4
5  18:03:54                NA  <NA>                 5     4
6  18:05:06                NA  <NA>                 6     4
7  18:05:30                 1     3                NA  <NA>
8  18:06:03                 2     3                NA  <NA>
9  18:06:34                 3     3                NA  <NA>
10 18:06:40                 4     3                NA  <NA>
11 18:06:52                 5     3                NA  <NA>
12 18:07:03                 6     3                NA  <NA>

And here is how you would work with bind_rows() (rbind() works the same way). This should give you more useful data to work with:

pellets <- bind_rows(Day3 %>% 
                             rename(Pellet_Count = Day3.Pellet_Count), 
                     Day4 %>% 
                             rename(Pellet_Count = Day4.Pellet_Count))
pellets
# A tibble: 12 x 3
   time     Pellet_Count day  
   <chr>           <dbl> <chr>
 1 18:05:30            1 3    
 2 18:06:03            2 3    
 3 18:06:34            3 3    
 4 18:06:40            4 3    
 5 18:06:52            5 3    
 6 18:07:03            6 3    
 7 18:00:21            1 4    
 8 18:01:34            2 4    
 9 18:02:22            3 4    
10 18:03:35            4 4    
11 18:03:54            5 4    
12 18:05:06            6 4   
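
Since the question mentions up to 9 days, the same row-binding idea can be sketched for any number of days by keeping the per-day data frames in a named list. This is only a sketch under assumptions: the list name raw_days and the result name all_days are hypothetical, each data frame is assumed to hold the time column first and the count column second, and the list names are assumed to be the day labels.

```r
library(dplyr)

# Hypothetical stand-ins for the per-day data frames (two rows each for brevity).
# Assumption: column 1 is the time, column 2 is the pellet count.
raw_days <- list(
  `3` = tibble(Day3.time = c("18:05:30", "18:06:03"), Day3.Pellet_Count = c(1, 2)),
  `4` = tibble(Day4.time = c("18:00:21", "18:01:34"), Day4.Pellet_Count = c(1, 2))
)

# Give every data frame the same column names, then stack them.
# .id = "day" stores each list name in a new character column called `day`.
all_days <- raw_days %>%
  lapply(setNames, c("time", "Pellet_Count")) %>%
  bind_rows(.id = "day")

all_days
```

Adding another day then only means adding another named element to the list.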

Secondly, you probably need to find a way to handle the times. A big problem with your ggplot code is that you are passing character strings where you want to pass date/time data. To get a useful datetime format, I think you'll need to include the date as well.
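
One possible way to do this, as a rough sketch: paste an arbitrary placeholder date in front of each time string and parse the result with as.POSIXct(), so that ggplot treats the x axis as continuous. The date "1970-01-01", the UTC time zone, and the names pellets_dt and p are assumptions made for illustration, not part of the original data. Times recorded after midnight (such as the 02:40:18 row in the question) would sort before the evening times with this approach and would need a later placeholder date.

```r
library(dplyr)
library(ggplot2)

# Small stand-in for the row-bound data shown above
pellets <- tibble(
  time = c("18:05:30", "18:06:03", "18:06:34", "18:00:21", "18:01:34", "18:02:22"),
  Pellet_Count = c(1, 2, 3, 1, 2, 3),
  day = c("3", "3", "3", "4", "4", "4")
)

# Assumption: every time is given the same placeholder date so it parses as a date-time
pellets_dt <- pellets %>%
  mutate(time = as.POSIXct(paste("1970-01-01", time),
                           format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))

# One line per day; the axis labels show only the time of day
p <- ggplot(pellets_dt, aes(x = time, y = Pellet_Count, colour = day, group = day)) +
  geom_line() +
  scale_x_datetime(date_labels = "%H:%M:%S")

p
```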

0
votes

You first need to convert your data from 'wide' to 'long' format (see example here). After that, you should be able to use ggplot. It looks like you tried to use base R plot logic here, adding each series with lines(), but that doesn't work with ggplot.

For example:

pellets %>% gather("day", "count", -pellets.time) %>% na.omit()

All together it will be:

pellets %>%
  rename(Day3 = pellets.Pellet_Count.x, Day4 = pellets.Pellet_Count.y) %>%
  gather("day", "count", -pellets.time) %>%
  na.omit() %>%
  ggplot() +
  geom_point(aes(x = pellets.time, y = count, col = day))

(I added rename to match your preferred output)
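
In current versions of tidyr, gather() is superseded by pivot_longer(), so the same reshaping could be sketched as below. The column names time, Pellet_Count.x and Pellet_Count.y are an assumption about what merge(..., by = 'time') produces, and the four-row data frame and the name long are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for the merged (wide) data
pellets <- tibble(
  time = c("18:00:21", "18:01:34", "18:05:30", "18:06:03"),
  Pellet_Count.x = c(NA, NA, 1, 2),
  Pellet_Count.y = c(1, 2, NA, NA)
)

# Reshape to long format: one row per (time, day) pair, dropping the NA counts
long <- pellets %>%
  rename(Day3 = Pellet_Count.x, Day4 = Pellet_Count.y) %>%
  pivot_longer(cols = c(Day3, Day4), names_to = "day", values_to = "count",
               values_drop_na = TRUE)

ggplot(long) + geom_point(aes(x = time, y = count, col = day))
```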