
I have scripted a ggplot compiled from two separate data frames, but as it stands there is no legend as the colours aren't included in aes. I'd prefer to keep the two datasets separate if possible, but can't figure out how to add the legend. Any thoughts?

I've tried adding the colours directly to the aes function, but then colours are just added as variables and listed in the legend instead of colouring the actual data.

Plotting this with base R, after creating the plot I would've used:

legend("top",c("Delta 18O","Delta 13C"),fill=c("red","blue"))

and gotten what I needed, but I'm not sure how to replicate this in ggplot.

The following code currently plots exactly what I want; it's just missing the legend, which ideally should match what the above line would produce, except that the "18" and "13" need to be superscripted.

Examples of an old plot using base R (with a correct legend, except lacking superscripted 13 and 18) and the current plot missing the legend can be found here: Old: https://imgur.com/xgd9e9C New, missing legend: https://imgur.com/eGRhUzf

Background data

head(avar.data.x)
      time          av       error
1 1.015223 0.030233604 0.003726832
2 2.030445 0.014819145 0.005270609
3 3.045668 0.010054801 0.006455241
4 4.060891 0.007477541 0.007453974
5 5.076113 0.006178282 0.008333912
6 6.091336 0.004949045 0.009129470
head(avar.data.y)
      time         av       error
1 1.015223 0.06810001 0.003726832
2 2.030445 0.03408136 0.005270609
3 3.045668 0.02313839 0.006455241
4 4.060891 0.01737148 0.007453974
5 5.076113 0.01405144 0.008333912
6 6.091336 0.01172788 0.009129470

The following avarn function produces a data frame with three columns and several thousand rows (see header above). These are then graphed over time on a log/log plot.

avar.data.x <- avarn(data3$"d Intl. Std:d 13C VPDB - Value",frequency)

avar.data.y <- avarn(data3$"d Intl. Std:d 18O VPDB-CO2 - Value",frequency)

Create Allan deviation plot

ggplot()+
      geom_line(data=avar.data.y,aes(x=time,y=sqrt(av)),color="red")+
      geom_line(data=avar.data.x,aes(x=time,y=sqrt(av)),color="blue")+
      scale_x_log10()+
      scale_y_log10()+
      labs(x=expression(paste("Averaging Time ",tau," (seconds)")),y="Allan Deviation (per mil)")

The above plot is only missing a legend to show the name of the two plotted datasets and their respective colours. I would like the legend in the top centre of the graph.

How to superscript legend titles?:

ggplot()+
  geom_line(data=avar.data.y,aes(x=time,y=sqrt(av), 
color =expression(paste("Delta ",18^,"O"))))+
  geom_line(data=avar.data.xmod,aes(x=time,y=sqrt(av), 
color=expression(paste("Delta ",13^,"C"))))+
  scale_color_manual(values = c("blue", "red"),name=NULL) +
  scale_x_log10()+
  scale_y_log10()+
  labs(
    x=expression(paste("Averaging Time ",tau," (seconds)")),
    y="Allan Deviation (per mil)") + 
  theme(legend.position = c(0.5, 0.9))

2 Answers

4
votes

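Setting color inside aes() and adding a scale_color_* function to your plot should do the trick. A constant string inside aes() is treated as a category, so ggplot builds a legend for it; the scale then maps each category to a colour and a label.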

ggplot()+
  geom_line(data=avar.data.y,aes(x=time,y=sqrt(av), color = "a"))+
  geom_line(data=avar.data.x,aes(x=time,y=sqrt(av), color="b"))+
  scale_color_manual(
    values = c("red", "blue"),
    labels = expression("Delta "^18*"O", "Delta "^13*"C")  # "a" is avar.data.y (18O), "b" is avar.data.x (13C)
  ) +
  scale_x_log10()+
  scale_y_log10()+
  labs(
    x=expression(paste("Averaging Time ",tau," (seconds)")),
    y="Allan Deviation (per mil)") + 
  theme(legend.position = c(0.5, 0.9))
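
The same idea can be made less dependent on the alphabetical order of the placeholder strings. The following is only a sketch, assuming ggplot2 is installed: the keys "d18O" and "d13C" are arbitrary names chosen for this illustration, the toy data are just the first rows shown in the question, and legend.position = "top" is a simplification that puts the legend above the panel rather than inside it.

```r
library(ggplot2)

# Toy data: the first rows shown in the question (not the full series).
avar.data.x <- data.frame(
  time = c(1.015223, 2.030445, 3.045668),
  av   = c(0.030233604, 0.014819145, 0.010054801)
)
avar.data.y <- data.frame(
  time = c(1.015223, 2.030445, 3.045668),
  av   = c(0.06810001, 0.03408136, 0.02313839)
)

# "d18O" and "d13C" are arbitrary keys chosen for this sketch.
p <- ggplot() +
  geom_line(data = avar.data.y, aes(x = time, y = sqrt(av), color = "d18O")) +
  geom_line(data = avar.data.x, aes(x = time, y = sqrt(av), color = "d13C")) +
  scale_color_manual(
    name   = NULL,
    breaks = c("d18O", "d13C"),               # legend order
    values = c(d18O = "red", d13C = "blue"),  # colours matched by key name
    labels = expression("Delta "^18*"O", "Delta "^13*"C")  # same order as breaks
  ) +
  scale_x_log10() +
  scale_y_log10() +
  theme(legend.position = "top")
```

Because the colours are matched by name and the labels follow the explicit breaks, renaming a key or reordering the layers should not silently swap the colours or the legend text.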
3
votes

@z-cool merits the accepted answer. However, the current approaches do not make use of ggplot's aesthetic mappings. Your data frames all seem to have the same structure. With this in mind, a more ggplot-like way would be to combine them into one single long data frame and map color (or any other aesthetic) to an ID column, like so:

avar.data.x <- readr::read_table("0 time          av       error
1 1.015223 0.030233604 0.003726832
2 2.030445 0.014819145 0.005270609
3 3.045668 0.010054801 0.006455241
4 4.060891 0.007477541 0.007453974
5 5.076113 0.006178282 0.008333912
6 6.091336 0.004949045 0.009129470") 
avar.data.y <- readr::read_table("0 time         av       error
1 1.015223 0.06810001 0.003726832
2 2.030445 0.03408136 0.005270609
3 3.045668 0.02313839 0.006455241
4 4.060891 0.01737148 0.007453974
5 5.076113 0.01405144 0.008333912
6 6.091336 0.01172788 0.009129470")

library(tidyverse)

combine_df <- bind_rows(list(a = avar.data.x, b = avar.data.y), .id = 'ID')

ggplot(combine_df)+
  geom_line(aes(x = time, y = sqrt(av), color = ID))+
  # a = avar.data.x (13C), b = avar.data.y (18O); named values pin each colour to its ID
  scale_color_manual(values = c(a = "blue", b = "red"),
    labels = expression("Delta "^13*"C", "Delta "^18*"O"))

Created on 2019-11-11 by the reprex package (v0.2.1)

This needs only one call to geom_line and gives easier and better control of the legend(s). You could even write a small function to automate your labels, etc.
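
As a rough illustration of automating the labels, here is a minimal sketch in base R. The helper isotope_labels() is hypothetical (it is not part of ggplot2 or any package); it builds plotmath expressions with a superscripted mass number, which could then be passed to the labels argument of scale_color_manual().

```r
# Hypothetical helper: builds plotmath labels such as "Delta "^18*"O",
# i.e. the text "Delta " with a superscripted mass number, then the element.
isotope_labels <- function(mass, element) {
  parse(text = sprintf('"Delta "^%d*"%s"', as.integer(mass), element))
}

# One label per series, in the same order as the scale's breaks.
legend_labels <- isotope_labels(c(18, 13), c("O", "C"))
```

The result is an expression vector with one entry per series, so it has the same form as a hand-written expression(...) call.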

Also note that white spaces in column names are not great (you're making your own life very difficult) and that you may want to think about automating your avarn calls, e.g. with lapply, which would result in a list of data frames and makes the binding of the data frames even easier.
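
The lapply idea could look roughly like the sketch below, assuming dplyr is installed. Everything in it is illustrative: fake_avarn() is a hypothetical stand-in that only mimics the three-column output shape of avarn(), and data3 here is made-up input with syntactic column names (d13C, d18O) instead of the original names containing spaces.

```r
library(dplyr)

# Hypothetical stand-in for avarn(); it does NOT compute an Allan variance,
# it only returns a data frame with the same three columns.
fake_avarn <- function(values, frequency) {
  n <- length(values)
  data.frame(
    time  = seq_len(n) / frequency,
    av    = values^2,
    error = 1 / sqrt(seq_len(n))
  )
}

# Made-up input with syntactic column names.
data3 <- data.frame(d13C = c(0.17, 0.12, 0.10), d18O = c(0.26, 0.18, 0.15))
frequency <- 1

# One call per column; lapply keeps the column names as list names.
avar_list <- lapply(data3[c("d13C", "d18O")], fake_avarn, frequency = frequency)

# The list names become the ID column used for the colour aesthetic.
combine_df <- bind_rows(avar_list, .id = "ID")
```

With the real data, fake_avarn would be replaced by avarn, and the ID column would then carry meaningful series names straight into the legend.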