I have many plots, each showing two time series.
That is, each plot shows y_1 and y_2 against a common set of dates.
For each plot, I would like to display the correlation between the two series on the plot itself. That is, I would like to compute cor(y_1, y_2) and include the resulting number on each plot.
This is surprisingly difficult to do in a principled way in ggplot2. So far I've found no simple way to do it using stat_cor (from ggpubr).
I have already looked at other functions recommended for this task, but they are all designed to report the correlation of y_1 and y_2 when y_1 is plotted against y_2, rather than when both y_1 and y_2 are plotted against time.
I would prefer a ggplot2-ish way to do this, but I'm open to using any graphics software within R. Here is code for a minimal working example, along with what I have tried.
library(reprex); library(ggplot2); library(ggpubr)
n <- 6
Q <- sample(18:30, n, replace = TRUE)
# make sample data
dat <- data.frame(id = 1:n,
                  date = seq.Date(as.Date("2020-12-26"), as.Date("2020-12-31"), "day"),
                  group = rep(LETTERS[1:2], n/2),
                  quantity = Q,
                  price = 100 - 2*Q + rnorm(n))
dat
#> id date group quantity price
#> 1 1 2020-12-26 A 19 63.02628
#> 2 2 2020-12-27 B 26 49.66597
#> 3 3 2020-12-28 A 27 44.98031
#> 4 4 2020-12-29 B 24 51.11224
#> 5 5 2020-12-30 A 29 41.11129
#> 6 6 2020-12-31 B 28 43.04494
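# plot both series against date; map the data frame column `quantity`, not the global vector Q
tseriesplot <- ggplot(dat, aes(x = date)) + ggtitle("Oil: Daily Quantity and Price") +
  geom_line(aes(y = quantity, color = "Quantity (thousands of barrels)")) +
  geom_line(aes(y = price, color = "Price"))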
tseriesplot
# naive attempt fails
tseriesplot + stat_cor(data = dat, aes(x=quantity, y=price),method="pearson")
#> Error: Invalid input: date_trans works with objects of class Date only
Created on 2021-01-05 by the reprex package (v0.3.0)
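For reference, the only fallback I can think of is to compute the correlation outside ggplot2 and place it by hand with annotate(). The sketch below is a minimal version of that idea, not a principled solution. The seed, the label format ("r = ..."), and the label position (top-left of the panel, anchored at the earliest date) are assumptions made purely for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is repeatable
n <- 6
quantity <- sample(18:30, n, replace = TRUE)
dat <- data.frame(
  date     = seq.Date(as.Date("2020-12-26"), as.Date("2020-12-31"), by = "day"),
  quantity = quantity,
  price    = 100 - 2 * quantity + rnorm(n)
)

# Compute the correlation outside ggplot2, because neither series is on the x axis.
r <- cor(dat$quantity, dat$price, method = "pearson")
r_label <- sprintf("r = %.2f", r)  # assumed label format

p <- ggplot(dat, aes(x = date)) +
  ggtitle("Oil: Daily Quantity and Price") +
  geom_line(aes(y = quantity, color = "Quantity (thousands of barrels)")) +
  geom_line(aes(y = price, color = "Price")) +
  # x must be a Date to match the date scale; y = Inf pins the label to the top edge.
  annotate("text", x = min(dat$date), y = Inf, label = r_label,
           hjust = 0, vjust = 1.5)
p
```

This works, but the label has to be positioned manually for every plot, which is what I would like to avoid.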
I thought this would be a good question because it is similar to, but much more basic than, more complex questions asked elsewhere, e.g. https://stat.ethz.ch/pipermail/r-help/2020-July/467805.html