20
votes

I want to plot two variables on one plot similar to a population pyramid similar to this:

plot

This plot is almost there but not quite, for the reasons I will list below.

I produced this plot with the following code:

DATA <- data.frame(
    state = c("AK", "TX", "CA", "MT", "NM", "AZ", "NV", "CO", "OR", "WY", "MI", "MN", "UT", "ID", "KS", "NE", "SD", "WA", "ND", "OK"),
    sales_staff = c(20,30,40,10,15,35,18,25,22,7,12,22,3,4,5,8,14,28,24,32)
)

set.seed(1)
DATA$sales <- DATA$sales_staff * 50 + (runif(nrow(DATA)) * 1000)

# Order the state factor by number of sales staff so that it is plotted in that order
DATA$state <- factor(DATA$state, levels = DATA[order(DATA$sales_staff),"state"])

I want to "glue" two plots back-to-back, so I use the multiplot() function verbatim from http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_%28ggplot2%29/

(I won't reproduce the code for that function here for brevity and clarity)

My code for the final plot is:

library(ggplot2)

g1 <- ggplot(data = DATA, aes(x = state, y = sales_staff)) +
  geom_bar(stat = "identity") + ggtitle("Number of sales staff") +
  theme(axis.title.x = element_blank(), axis.title.y = element_blank(), axis.text.y = element_blank(), axis.ticks.y = element_blank(), plot.margin = unit(c(1,0,1,0), "mm")) +
  scale_y_reverse() + coord_flip()

g2 <- ggplot(data = DATA, aes(x = state, y = sales)) +
  geom_bar(stat = "identity") + ggtitle("Sales (x $1000)") +
  theme(axis.title.x = element_blank(), axis.title.y = element_blank(), plot.margin = unit(c(1,5,1,0), "mm")) +
  coord_flip()

multiplot(g1, g2, cols = 2)

OK. So what's wrong with this plot?

  • I need to get the tick marks on the right-axis of the left plot. I can't figure out how to do that.
  • The two plots are different widths. This is because the states down the middle are the axis labels from the right plot and use some of the space for that plot.

I've hit a brick wall in getting this plot to "production quality". I'm starting to wonder if I'm going about this in the wrong way because I'm thinking the next step will be to draw the axis labels as a separate third column between the two plots. (I don't yet know how to do this). This will solve the "equal size" issue and permit me to add a "state" title, so it may still be the way to go. But I just can't help but wonder if there is a simpler way...

Any advice or assistance appreciated!

1
Can you adjust stackoverflow.com/questions/14680075/… for your purpose?mnel
I am a diehard ggplot2 fan and use the package for absolutely everything I can, but when I needed to create a pyramid plot similar to the one above a few months back I eventually gave up and used pyramid.plot from the plotrix package. No, I didn't like the syntax, but it wasn't hard and I got better results than with ggplot2 and with far less messing around.SlowLearner
You should search for "back to back charts".Roman Luštrik
There is a recently added, excellent demonstration of how to fine tune a ggplot2 back-to-back bar chart: stackoverflow.com/a/37374271/3817004Uwe

1 Answers

29
votes

Here is very long workaround for your plot. Idea is to create new plot that contains only state names and ticks on both sides and then use this as the middle plot.

For this plot I added title with no name to get space and ylab(NULL) to remove space. For left and right margin values are -1 to get plot closer to other plots. Library grid should be added before plotting to use function unit() for plot margins.

library(grid)
g.mid<-ggplot(DATA,aes(x=1,y=state))+geom_text(aes(label=state))+
  geom_segment(aes(x=0.94,xend=0.96,yend=state))+
  geom_segment(aes(x=1.04,xend=1.065,yend=state))+
  ggtitle("")+
  ylab(NULL)+
  scale_x_continuous(expand=c(0,0),limits=c(0.94,1.065))+
  theme(axis.title=element_blank(),
        panel.grid=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.y=element_blank(),
        panel.background=element_blank(),
        axis.text.x=element_text(color=NA),
        axis.ticks.x=element_line(color=NA),
        plot.margin = unit(c(1,-1,1,-1), "mm"))

Both original plots are modified. First, removed the y axis for the second plot and also made left/right margin to -1.

g1 <- ggplot(data = DATA, aes(x = state, y = sales_staff)) +
  geom_bar(stat = "identity") + ggtitle("Number of sales staff") +
  theme(axis.title.x = element_blank(), 
        axis.title.y = element_blank(), 
        axis.text.y = element_blank(), 
        axis.ticks.y = element_blank(), 
        plot.margin = unit(c(1,-1,1,0), "mm")) +
  scale_y_reverse() + coord_flip()

g2 <- ggplot(data = DATA, aes(x = state, y = sales)) +xlab(NULL)+
  geom_bar(stat = "identity") + ggtitle("Sales (x $1000)") +
  theme(axis.title.x = element_blank(), axis.title.y = element_blank(), 
        axis.text.y = element_blank(), axis.ticks.y = element_blank(),
        plot.margin = unit(c(1,0,1,-1), "mm")) +
  coord_flip()

Now use library gridExtra and function d grid.arrange() to join plots. Before plotting all plots are made as grobs.

library(gridExtra)
gg1 <- ggplot_gtable(ggplot_build(g1))
gg2 <- ggplot_gtable(ggplot_build(g2))
gg.mid <- ggplot_gtable(ggplot_build(g.mid))

grid.arrange(gg1,gg.mid,gg2,ncol=3,widths=c(4/9,1/9,4/9))

enter image description here