0
votes

I am trying to plot a density line with and want to shade or fill only the area associated with the 95% of the x axis. I am trying to follow answers given in the attached answers, but non of them talk about shading an area when we are plotting more than one distributions at the same time with a grouping factor. In this case the grouping factor is the different central electrodes ("Fz", "Cz, "Pz"). I am trying to visualise something similar to the highest density interval, or the area under the curve comprising between percentile 5 and 95.

Area Under Curve AUC by Group

My data looks something like this:

> head(dframe1)
         x            y Electrode
1 1.571296 0.0001474116        Fz
2 1.576496 0.0001487649        Fz
3 1.581697 0.0001497564        Fz
4 1.586897 0.0001504074        Fz
5 1.592098 0.0001507446        Fz
6 1.597298 0.0001507776        Fz

And at the moment the code I am using to plot the distributions by group in ggplot looks like this:

p1 <- ggplot(data = dframe1, mapping = aes(x = x, y = y)) +
  geom_density_line(stat = "identity", size=.5, alpha=0.3, aes(color=Electrode, fill=Electrode)) +
  scale_fill_discrete(breaks=c("Fz","Cz","Pz")) +
  guides(colour = FALSE) +
  geom_vline(xintercept = 0) +
  xlab("values") +
  xlim(-2, 10) +
  ylab("density") +
  ylim(0, .7) +
  theme(axis.text=element_text(size=12),
        axis.title=element_text(size=16),
        plot.title = element_text(size=18)) +
  labs(title="Interval")

I sketched something similar to what I am looking for:

skribble of desired outcome

Of course I could use bayestestR HDI standard output but I prefer the ggplot aesthetic and flexibility.

Any help will be much appreciated.

1
Could you please share a significant amount of your data using dput() the plot is empty when we use the small data you shared!Duck
Alternatively, generate your data using, say, dnorm to avoid the need to dput a large data frame and permit easy validation of solutions.Limey
I'm sorry, I didn't attach the data, but it really were 3 normal distributions. I had never used dput() so I did not know what to do. Thanks for your comments though.Unai Vicente

1 Answers

3
votes

Since you need to do some maths on the density curves to work out where the 95% intervals are, it is best to do this outside ggplot. I often find that people run into problems because they try to get ggplot to do too much of their data wrangling and summarizing. It is often easier to work out what you want to plot, then plot it.

In your case, your x and y co-ordinates already represent densities. For each Electrode, you just need to create a logical vector that tells you when the integral of the density is between 0.025 and 0.975, so that you can easily subset out the 95% confidence interval. You can do that using the split-aplly-bind method like this:

densdf <- do.call(rbind, lapply(split(dframe1, dframe1$Electrode), function(z)
{
  integ <- cumsum(z$y * mean(diff(z$x)))
  CI <-  integ > 0.025 & integ < 0.975
  data.frame(x = z$x, y = z$y, Electrode = z$Electrode[1], CI = CI)
}))

Now we are ready to plot:

ggplot(data = densdf, mapping = aes(x = x, y = y)) +
  geom_area(data = densdf[densdf$CI,], 
            aes(fill = Electrode, color = Electrode),
            outline.type = "full", alpha = 0.3, size = 1) +
  geom_line(aes(color = Electrode), size = 1) +
  scale_fill_discrete(breaks = c("Fz", "Cz", "Pz")) +
  guides(colour = FALSE) +
  geom_vline(xintercept = 0) +
  geom_hline(yintercept = 0) +
  lims(x = c(-2, 10), y = c(0, 0.7)) +
  labs(title = "Interval", x = "values", y = "density") +
  theme_bw() +
  theme(axis.text  = element_text(size = 12),
        axis.title = element_text(size = 16),
        plot.title = element_text(size = 18)) 

enter image description here