I have a basic dataframe with 3 columns: (i) a date (when a sample was taken); (ii) a site location and (iii) a binary variable indicating what the condition was when sampling (e.g. wet versus dry).

Some reproducible data:

df <- data.frame(Date = rep(seq(as.Date("2010-01-01"), as.Date("2010-12-01"), by="months"),times=2))
df$Site <- c(rep("Site.A",times = 12),rep("Site.B",times = 12))
df$Condition<- as.factor(c(0,0,0,0,1,1,1,1,0,0,0,0,
                     0,0,0,0,0,1,1,0,0,0,0,0))

What I would like to do is use ggplot to create a bar chart indicating the condition of each site (y axis) over time (x axis) - the condition indicated by a different colour. I am guessing some kind of flipped barplot would be the way to do this, but I cannot figure out how to tell ggplot2 to recognise the values chronologically, rather than summed for each condition. This is my attempt so far which clearly doesn't do what I need it to.

ggplot(df) +
geom_bar(aes(x=Site,y=Date,fill=Condition),stat='identity')+coord_flip()

So I have 2 questions. Firstly, how do I tell ggplot to recognise changes in condition over time and not just group each condition in a traditional stacked bar chart?

Secondly, it seems ggplot converts the date to a numerical value, how would I reformat the x-axis to show a time period, e.g. in a month-year format? I have tried doing this via the scale_x_date function, but get an error message.

labDates <- seq(from = (head(df$Date, 1)), 
               to = (tail(df$Date, 1)),  by = "1 months")
Datelabels <-format(labDates,"%b %y")

ggplot(df) +
geom_bar(aes(x=Site,y=Date,fill=Condition),stat='identity')+coord_flip()+
scale_x_date(labels = Datelabels, breaks=labDates)

I have also tried converting sampling times to factors and displaying these instead. Below I have done this by changing each sampling period to a letter (in my own code, the factor levels are in a month-year format - I put letters here for simplicity). But I cannot format the axis to place each level of the factor as a tick mark. Either a date or factor solution for this second question would be great!

df$Factor <- as.factor(unique(df$Date))
levels(df$Factor) <- list(A = "2010-01-01", B = "2010-02-01", 
C = "2010-03-01", D = "2010-04-01", E = "2010-05-01", 
`F` = "2010-06-01", G = "2010-07-01", H = "2010-08-01", 
I = "2010-09-01", J = "2010-10-01", K= "2010-11-01", L = "2010-12-01")

ggplot(df) +
geom_bar(aes(x=Site,y=Date,fill=Condition),stat='identity')+coord_flip()+
scale_y_discrete(breaks=as.numeric(unique(df$Date)),
 labels=levels(df$Factor))

Thank you in advance!

Why are you converting dates to factors? Take a look at scale_x_date for formatting a date axis. - camille
When I try this, I get the following error message: 'Error: Invalid input: date_trans works with objects of class Date only'. I think this is because ggplot is reading the dates as a numeric value. - James White
That's because they are not in a Date format when called with ggplot; try seq.Date() instead, or use as.Date(). - RLave

1 Answer

It doesn't really make sense to use geom_bar() here: you do not want to summarise the data, you want to show each site's condition over time.

I would rather use geom_line() with Date on the x axis, and increase the line thickness so that the result looks like a bar chart.

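library(ggplot2)
library(scales)     # date_format()
library(lubridate)  # ymd()

# Rebuild the example data; seq.Date() returns objects of class Date
df <- data.frame(Date = rep(seq.Date(as.Date("2010-01-01"), as.Date("2010-12-01"), by="months"),times=2))
df$Site <- c(rep("Site.A",times = 12),rep("Site.B",times = 12))
df$Condition<- as.factor(c(0,0,0,0,1,1,1,1,0,0,0,0,
                           0,0,0,0,0,1,1,0,0,0,0,0))
df$Date <- ymd(df$Date)  # make sure the column is of class Date

# Date on x and Site on y, so no coord_flip() is needed; a thick line stands in for a bar.
# (In ggplot2 3.4.0 and later, linewidth replaces size for lines.)
ggplot(df) +
  geom_line(aes(y=Site,x=Date,color=Condition),size=10)+
  scale_x_date(labels = date_format("%b-%y"))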

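If you want each change in condition to be an explicit interval rather than something the line colour implies, one option is to collapse consecutive samples with the same condition into runs and draw one segment per run. This is only a sketch: collapse_runs, runs, Start and End are names I made up for illustration, and linewidth assumes ggplot2 3.4.0 or later (use size on older versions).

```r
library(ggplot2)

# Same example data as in the question.
df <- data.frame(
  Date = rep(seq(as.Date("2010-01-01"), as.Date("2010-12-01"), by = "months"), times = 2),
  Site = rep(c("Site.A", "Site.B"), each = 12),
  Condition = factor(c(0,0,0,0,1,1,1,1,0,0,0,0,
                       0,0,0,0,0,1,1,0,0,0,0,0))
)

# Hypothetical helper: for one site, find runs of identical consecutive
# conditions and return the first and last sampling date of each run.
collapse_runs <- function(d) {
  d <- d[order(d$Date), ]
  r <- rle(as.character(d$Condition))
  last <- cumsum(r$lengths)        # row index where each run ends
  first <- last - r$lengths + 1    # row index where each run starts
  data.frame(
    Site = d$Site[1],
    Condition = factor(r$values, levels = levels(d$Condition)),
    Start = d$Date[first],
    End = d$Date[last]
  )
}

# Apply the helper per site and stack the results.
runs <- do.call(rbind, lapply(split(df, df$Site), collapse_runs))

# One thick segment per run, coloured by condition, with a month-year axis.
p <- ggplot(runs) +
  geom_segment(aes(x = Start, xend = End, y = Site, yend = Site, colour = Condition),
               linewidth = 10) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %y")
p
```

Note that a run made of a single sample would have Start equal to End and so would not be visible, and that the month between the end of one run and the start of the next is left blank; whether to extend End to the next sampling date depends on how you want to interpret the time between samples.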

Note that using coord_flip() does not work here either; I think this is what causes the Date issue. See the threads below:

how to use coord_carteisan and coord_flip together in ggplot2

In ggplot2, coord_flip and free scales don't work together
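For completeness, another possible reading of the error in the question: coord_flip() only swaps how the axes are drawn, while scales are still attached to the aesthetics as mapped. In the question Date is mapped to y and Site to x, so scale_x_date() ends up being applied to Site, which is not of class Date. Under that assumption, a flipped version would need scale_y_date() instead. The sketch below uses geom_tile() rather than geom_bar() (my substitution, not the question's code) so that nothing gets stacked or summed.

```r
library(ggplot2)

# Same example data as in the question.
df <- data.frame(
  Date = rep(seq(as.Date("2010-01-01"), as.Date("2010-12-01"), by = "months"), times = 2),
  Site = rep(c("Site.A", "Site.B"), each = 12),
  Condition = factor(c(0,0,0,0,1,1,1,1,0,0,0,0,
                       0,0,0,0,0,1,1,0,0,0,0,0))
)

# Date is mapped to y, so the date scale must be scale_y_date(),
# even though coord_flip() draws it along the horizontal axis.
p <- ggplot(df, aes(x = Site, y = Date, fill = Condition)) +
  geom_tile() +
  coord_flip() +
  scale_y_date(date_breaks = "1 month", date_labels = "%b %y")
p
```

Mapping Date to x directly, as in the geom_line() code above, avoids the flip altogether and is probably the simpler route.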