8
votes

This question arose when I was trying to draw a standard normal distribution with ggplot (easy to do with stat_function) and also color the area under the curve for the different quintiles.

I was able to do this with geom_line and geom_ribbon after creating a data frame with a range of values for x and the corresponding dnorm value for each x as y:

    library(ggplot2)

    # Grid of x values and the standard normal density at each point
    data <- data.frame(x = seq(-3, 3, length.out = 1000))
    data$y <- dnorm(data$x)

    # Label each x with the quintile it falls in
    data$Quintile <- with(data, ifelse(x < qnorm(0.2), "Bottom",
                                ifelse(x < qnorm(0.4), "Second",
                                ifelse(x < qnorm(0.6), "Middle",
                                ifelse(x < qnorm(0.8), "Fourth", "Top")))))
    data$Quintile <- factor(data$Quintile,
                            levels = c("Bottom", "Second", "Middle", "Fourth", "Top"))

    ggplot(data, aes(x = x, y = y, fill = Quintile)) +
      geom_ribbon(aes(ymax = y), ymin = 0, alpha = 0.5) +
      geom_line(color = "black") +
      theme_bw() +
      theme(legend.position = "bottom") +
      scale_fill_manual(values = c("darkgreen", "red", "purple", "blue", "gray")) +
      geom_vline(xintercept = qnorm(c(0.2, 0.4, 0.6, 0.8)),
                 color = c("darkgreen", "red", "purple", "blue"), size = 1) +
      scale_y_continuous("", breaks = NULL) +
      scale_x_continuous("", breaks = NULL)

I find it more appealing to use stat_function, and I guess it must be creating its own set of y values to plot the line. I tried to access those values in other layers to add the colored bands but was unable to. Can someone explain how that can be done, or why it can't?

In other words, instead of generating the data myself and using geom_line to draw the curve, I want to do something like

    ggplot(NULL, aes(x = c(-3, 3))) + stat_function(fun = dnorm)

and then use the data that stat_function generated to do the coloring. I was not able to access the generated y values (I tried using ..y.., for example).

Is there a way to use those values? If so, how?

2
You'll make it easier to help you if you provide the code you used to create your data frame and draw your plot. - eipi10

2 Answers

4
votes
    ggplot(NULL, aes(x = c(-3, 3))) +
      stat_function(fun = dnorm, geom = "ribbon",
                    mapping = aes(ymin = 0, ymax = ..y..))
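
To spell out what is going on: stat_function evaluates fun on a grid of x values (101 points by default) and exposes the result as y, which the layer's own aesthetics can refer to. Below is a minimal sketch, assuming ggplot2 3.3.0 or later, where after_stat(y) is the newer spelling of ..y.. and layer_data() returns a layer's computed data. The names p, computed and p2 are just for illustration, and I pass a small data frame rather than NULL, which should behave the same way here. Once the values are pulled out, they can be binned into quintiles and fed to geom_ribbon as in the question.

```r
library(ggplot2)

# Same plot as above, written with after_stat() instead of ..y..
p <- ggplot(data.frame(x = c(-3, 3)), aes(x = x)) +
  stat_function(fun = dnorm, geom = "ribbon",
                mapping = aes(ymin = 0, ymax = after_stat(y)))

# Pull out the x/y values that stat_function computed for layer 1
computed <- layer_data(p, 1)

# Bin the computed x values into quintiles (right = FALSE mirrors the
# x < qnorm(...) comparisons used in the question)
computed$Quintile <- cut(computed$x,
                         breaks = c(-Inf, qnorm(c(0.2, 0.4, 0.6, 0.8)), Inf),
                         labels = c("Bottom", "Second", "Middle", "Fourth", "Top"),
                         right = FALSE)

# Reuse the computed values for the colored bands
p2 <- ggplot(computed, aes(x = x, y = y)) +
  geom_ribbon(aes(ymin = 0, ymax = y, fill = Quintile), alpha = 0.5) +
  geom_line(color = "black")
```

Note that the computed values are only available inside the layer that computed them (or after building the plot, as here); other layers cannot see them directly, which is probably why referring to ..y.. from a separate layer did not work.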
3
votes

You can use stat_function. See link below for an example.

http://rstudio-pubs-static.s3.amazonaws.com/58753_13e35d9c089d4f55b176057235778679.html
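
As a rough sketch of one way to do the shading with stat_function alone (not necessarily the approach taken in the linked page): stat_function has an xlim argument that restricts where the function is evaluated, so one filled layer can be added per quintile. The names cuts, fills and p are my own, and the colors are taken from the question.

```r
library(ggplot2)

# Quintile boundaries of the standard normal, clipped to the plotted range
cuts  <- c(-3, qnorm(c(0.2, 0.4, 0.6, 0.8)), 3)
fills <- c("darkgreen", "red", "purple", "blue", "gray")

p <- ggplot(data.frame(x = c(-3, 3)), aes(x = x))

# One area layer per quintile, each evaluated only on its own x range
for (i in seq_along(fills)) {
  p <- p + stat_function(fun = dnorm, geom = "area",
                         xlim = cuts[i:(i + 1)],
                         fill = fills[i], alpha = 0.5)
}

# Draw the full curve on top
p <- p + stat_function(fun = dnorm, color = "black") + theme_bw()
```

Because fill is set as a constant in each layer rather than mapped inside aes(), this version does not produce a legend.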
