3
votes

I am trying to plot the following data frame, where there are 3 different time series (identified by user0, user1, and user2). Each row has a user identifier, the date, and a value.

> df
   userId       date steps
1   user0 2016-03-24   794
2   user0 2016-03-25   562
3   user0 2016-03-26   682
4   user0 2016-03-27   722
5   user0 2016-03-28   883
6   user1 2016-03-24  3642
7   user1 2016-03-25  3776
8   user1 2016-03-26  3585
9   user1 2016-03-27  3585
10  user1 2016-03-28  3471
11  user2 2016-03-24  5959
12  user2 2016-03-25  5933
13  user2 2016-03-26  5802
14  user2 2016-03-27  6094
15  user2 2016-03-28  5903
> dput(df)
structure(list(userId = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), .Label = c("user0", "user1", 
"user2"), class = "factor"), date = structure(c(16884, 16885, 
16886, 16887, 16888, 16884, 16885, 16886, 16887, 16888, 16884, 
16885, 16886, 16887, 16888), class = "Date"), steps = c(794L, 
562L, 682L, 722L, 883L, 3642L, 3776L, 3585L, 3585L, 3471L, 5959L, 
5933L, 5802L, 6094L, 5903L)), .Names = c("userId", "date", "steps"
), row.names = c(NA, -15L), class = "data.frame")

I would like to plot all the time series (however many there are, as identified by the userId field) using different colors, with the date on the x-axis. I tried the following, but as you can see, the dates are repeated on the x-axis.

plot(df$steps, axes=F, xlab="", ylab="Steps", ylim=c(0,max(df$steps)))
axis(2)
axis(1, at = seq_along(df$date), labels = df$date, las = 2, cex.axis = 0.70)
box()

I looked at other postings, such as "Plot multiple lines (data series) each with unique color in R" and "Plotting multiple time series on the same plot using ggplot()", but they do not have my problem of the time variable being mixed in with the other data.

A solution using colored lines, both with and without ggplot, would be greatly appreciated.

2 Answers

2
votes

With ggplot:

library(ggplot2)
ggplot(df, aes(x = date, y = steps, colour = userId)) + geom_line()

plot with 3 lines and date x-axis
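
If you want one tick per day on the x-axis (the question's sample has five distinct dates), a minimal sketch along these lines might help. It rebuilds the question's `df` so it runs on its own; the daily breaks, the label format, the rotated tick labels, and the object name `p` are my assumptions, not something the question asked for.

```r
library(ggplot2)

# Rebuild the example data from the question so this block is self-contained.
df <- data.frame(
  userId = factor(rep(c("user0", "user1", "user2"), each = 5)),
  date   = rep(seq(as.Date("2016-03-24"), by = "day", length.out = 5), times = 3),
  steps  = c(794, 562, 682, 722, 883,
             3642, 3776, 3585, 3585, 3471,
             5959, 5933, 5802, 6094, 5903)
)

# One line (plus points) per userId; ggplot picks the colours and builds the legend.
p <- ggplot(df, aes(x = date, y = steps, colour = userId)) +
  geom_line() +
  geom_point() +
  # Assumption: one labelled tick per day, ISO-formatted.
  scale_x_date(date_breaks = "1 day", date_labels = "%Y-%m-%d") +
  labs(x = NULL, y = "Steps", colour = "User") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))

print(p)
```

Because `colour = userId` is mapped inside `aes()`, this should work unchanged for any number of users.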


An equivalent (but still pretty ugly) base R version takes a lot more work:

plot(0, type = 'n', axes = FALSE, xlab = 'date', ylab = 'steps',
     xlim = c(min(df$date), max(df$date)), 
     ylim = c(min(df$steps) - 100, max(df$steps) + 100))
axis.Date(1, df$date, format = '%F')    # `axis.Date` is helpful here
axis(2, seq(0, max(df$steps + 500), 500))
box()
invisible(lapply(split(df, df$userId), function(x) {
  # colour by factor level (1, 2, 3) so it matches `col = 1:3` in the legend
  lines(x$date, x$steps, col = as.integer(x$userId[1]))
}))
# `paste` extra space to align legend correctly...*sigh*
legend('bottomright', paste(levels(df$userId), '   '), col = 1:3, lty = 1)

base R multi-line plot

Note that it needs a good bit of fine-tuning.
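
One way to cut down on the fine-tuning might be to reshape the data to one column per user and let `matplot` draw all the lines in a single call. This is a different technique from the `lapply`/`lines` approach above, shown only as a sketch. The names `wide`, `dates`, and `cols` are mine, and it assumes there is exactly one row per user per date.

```r
# Rebuild the example data from the question so this block is self-contained.
df <- data.frame(
  userId = factor(rep(c("user0", "user1", "user2"), each = 5)),
  date   = rep(seq(as.Date("2016-03-24"), by = "day", length.out = 5), times = 3),
  steps  = c(794, 562, 682, 722, 883,
             3642, 3776, 3585, 3585, 3471,
             5959, 5933, 5802, 6094, 5903)
)

# Long -> wide: one row per date, one column per user.
# Assumption: a single observation per (date, user), so sum() just returns it.
wide  <- tapply(df$steps, list(as.character(df$date), df$userId), sum)
dates <- as.Date(rownames(wide))
cols  <- seq_len(ncol(wide))

# Draw every column as its own line; suppress the default x-axis
# so the dates can be labelled explicitly.
matplot(as.numeric(dates), wide, type = "l", lty = 1, col = cols,
        xaxt = "n", xlab = "", ylab = "Steps")
axis.Date(1, at = dates, format = "%Y-%m-%d", las = 2, cex.axis = 0.7)
legend("right", legend = colnames(wide), col = cols, lty = 1)
```

The colours and legend entries both come from the columns of `wide`, so they stay in sync if more users are added.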

3
votes

Here is a base R version:

plot(0, 0, type = "n", xlim = range(df$date), ylim = c(0, max(df$steps)), axes = FALSE, xlab = "", ylab = "steps")
axis(2, las = 1)
axis(1, at = df$date, labels = df$date, las = 2, cex.axis = 0.70)
box()

cols <- c("red", "green", "blue")
for (i in 1:length(unique(df$userId)))
  with(df[df$userId == unique(df$userId)[i], ], lines(date, steps, col = cols[i]))
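
The loop above hard-codes three colours. Since the question asks for "however many" users there are, here is a rough sketch that derives the palette from the factor levels and adds a legend. The choice of `rainbow()`, the legend position, and the names `users`, `cols`, and `sub` are my assumptions.

```r
# Rebuild the example data from the question so this block is self-contained.
df <- data.frame(
  userId = factor(rep(c("user0", "user1", "user2"), each = 5)),
  date   = rep(seq(as.Date("2016-03-24"), by = "day", length.out = 5), times = 3),
  steps  = c(794, 562, 682, 722, 883,
             3642, 3776, 3585, 3585, 3471,
             5959, 5933, 5802, 6094, 5903)
)

# One colour per user, however many users there are.
users <- levels(df$userId)
cols  <- setNames(rainbow(length(users)), users)

# Empty frame spanning all the data; the x-axis is drawn separately.
plot(df$date, df$steps, type = "n", xaxt = "n", las = 1,
     xlab = "", ylab = "Steps", ylim = c(0, max(df$steps)))
axis.Date(1, at = unique(df$date), format = "%Y-%m-%d", las = 2, cex.axis = 0.7)

# Draw each user's series, sorted by date so the line runs left to right.
for (u in users) {
  sub <- df[df$userId == u, ]
  sub <- sub[order(sub$date), ]
  lines(sub$date, sub$steps, col = cols[[u]])
}
legend("right", legend = users, col = cols, lty = 1, bty = "n")
```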
