I have been using R/ggplot2 to generate graphs that are very close to those produced by a commercial tool used at my company. However, there are a few features of that tool's boxplots that I have not been able to reproduce in R:
- A summary "table" at the bottom of the graph showing items such as the median, count, and number of outliers.
- Strip text displayed so that duplicate labels are removed.
Sample R code:
library(ggplot2)
library(data.table)
library(reshape2)
library(grid)
# create dataset
dt <- data.table(mpg)
# melt the data table: the cty and hwy columns become rows labelled by "mode"
dtm <- data.table(
  melt(data=dt,
       id.vars=c("manufacturer","model","displ","year","cyl","trans","drv","fl","class"),
       variable.name="mode", value.name="mpg"))
write.csv(dtm, file="dtm.csv", row.names=FALSE)
# draw some plots
p <- ggplot(dtm, aes(x=mode, y=mpg)) +
  geom_boxplot(aes(fill=mode), varwidth=FALSE) +
  facet_grid( ~ manufacturer + year ) +
  theme_bw() +
  # panel.margin was renamed to panel.spacing in ggplot2 2.2.0
  theme(panel.spacing=unit(0,"mm"), panel.grid=element_blank()) +
  theme(axis.text.x=element_blank(), axis.title.x=element_blank()) +
  theme(legend.position="bottom") +
  coord_cartesian(ylim=c(0,50))
p
ggsave(plot=p, filename='ddtm_r.png', dpi=72, width=16, height=8)
The output generated by R is here,
The same boxplot generated by the commercial tool is here,
As you can see, the data in the two graphs are identical (as expected), but the strip headers are better organized in the commercial version, and it also lets me add the summary table at the bottom of the graph.
Is there any way to do something similar in R?
Thanks & regards,
Derric
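For reference, the numbers such a summary table would hold can be computed directly from the data. The following is only a minimal sketch, assuming the `mpg` dataset from the code above and an illustrative grouping by manufacturer, year and mode; the name `summ` is hypothetical. Note that `boxplot.stats()` uses hinges rather than the quartile definition used by `geom_boxplot()`, so outlier counts may differ slightly from what the plot draws.

```r
library(ggplot2)     # only needed here for the mpg dataset
library(data.table)

dt <- as.data.table(mpg)

# long format: one row per car and mode (cty or hwy)
dtm <- melt(dt, measure.vars = c("cty", "hwy"),
            variable.name = "mode", value.name = "mpg")

# one summary row per box that would be drawn
summ <- dtm[, .(
  # as.numeric keeps the column type identical across groups
  median   = as.numeric(median(mpg)),
  count    = .N,
  # points beyond 1.5 * IQR from the hinges (assumption: base R definition)
  outliers = length(boxplot.stats(mpg)$out)
), by = .(manufacturer, year, mode)]

print(head(summ))
```

A table like `summ` could then be melted and drawn with `geom_text()`, one facet per group, underneath the boxplot.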
UPDATED: 7 June 2014
With suggestions from colleagues and online help, I can now plot the boxplot with a summary table appended below it. The main idea is to extract the panel information from the built boxplot (via ggplot_build), use it to generate a text table as a second plot, and then draw the two grobs together. The legend of the original boxplot had to be removed from the plot and drawn separately in order to get correct right-hand-side alignment.
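As a small illustration of the extraction step, the sketch below builds a simple stand-in boxplot (not the real plot) and reads the per-box statistics that ggplot2 computed. The names `p`, `box` and `stats` are illustrative. The `outliers` column is a list column, so `lengths()` gives the number of outliers per box.

```r
library(ggplot2)

# stand-in plot: highway mpg by year, one facet per drive type
p <- ggplot(mpg, aes(x = factor(year), y = hwy)) +
  geom_boxplot() +
  facet_grid(~ drv)

# the first layer's computed data: one row per drawn box
box <- ggplot_build(p)$data[[1]]

stats <- data.frame(
  PANEL      = box$PANEL,             # facet panel the box belongs to
  x          = box$x,                 # position of the box within the panel
  median     = box$middle,
  q1         = box$lower,
  q3         = box$upper,
  n_outliers = lengths(box$outliers)  # list column: one vector per box
)

print(stats)
```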
The modified R code is as follows,
# load the libraries
library(grid)
library(gridExtra)
library(data.table)
library(reshape2)
library(ggplot2)
library(gtable)
library(plyr)
# create dataset
dt <- data.table(mpg)
# melt the data table
dtm <- data.table(
melt(data=dt,
id.vars=c("manufacturer","model","displ","year","cyl","trans","drv","fl","class"),
variable.name="mode", value.name="mpg"))
#write.csv(dtm,file="dtm.csv",row.names=F)
# draw some plots
p1 <- ggplot(dtm, aes(x=factor(year), y=mpg)) +
  geom_boxplot(aes(fill=factor(year)), varwidth=FALSE) +
  facet_grid( ~ manufacturer + mode ) +
  theme_bw() +
  # panel.margin was renamed to panel.spacing in ggplot2 2.2.0
  theme(panel.spacing=unit(0,"lines"),
        panel.grid=element_blank(),
        strip.text=element_text(angle=90),
        axis.text.x=element_blank(),
        axis.title.x=element_blank(),
        axis.ticks.x=element_blank(),
        plot.margin=unit(c(0,0,0,0),"lines"),
        legend.position="right") +
  coord_cartesian(ylim=c(0,50)) +
  xlab(NULL)
# deconstruct the plot p1
pb <- ggplot_build(p1)
# pb has three components: data, panel (named layout in ggplot2 >= 2.2.0) and plot
pb.data <- pb$data
# pb.data[[1]] is a data.frame
pb.data.df <- pb.data[[1]]
# melt the pb.data.df
pb.data.dt <- data.table(pb.data.df)
#pb.data.dt[,':='(outliers=NULL)]
pb.data.dtm <- melt(data=pb.data.dt,
                    #id.vars=c("x","PANEL"),
                    measure.vars=c("middle","lower","upper"),
                    variable.name="mode",
                    value.name="value")
# text "table": one facet per boxplot panel, one row per statistic
p2 <- ggplot(pb.data.dtm, aes(x=factor(x), y=factor(mode), label=format(value,nsmall=1))) +
  geom_text(size=3.0, angle=90, hjust=0.5) + facet_grid(~ PANEL) +
  theme_bw() +
  scale_y_discrete() +
  # panel.margin was renamed to panel.spacing in ggplot2 2.2.0
  theme(panel.spacing=unit(0,"lines"),
        panel.grid=element_blank(),
        panel.border=element_rect(),
        legend.position="right",
        axis.text.x=element_blank(),
        axis.text.y=element_text(angle=0),
        axis.ticks=element_blank(),
        strip.text=element_blank(),
        strip.background=element_blank(),
        plot.margin=unit(c(0,0,0,0),"lines")
  ) +
  xlab(NULL) + ylab(NULL)
# a function to extract the legend from the grob
g_legend <- function(a.gplot) {
  tmp <- ggplotGrob(a.gplot)
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  legend
}
legend1 <- g_legend(p1)
# 2 x 2 layout: boxplot and legend on top, text table below the boxplot
pa <- arrangeGrob((p1 + theme(legend.position='none')), legend1,
                  (p2 + theme(legend.position='none')),
                  ncol=2, nrow=2,
                  heights=c(50/60,10/60), widths=c(95/100,5/100))
ggsave(plot=pa, filename='dtm_r.png', dpi=72, height=10, width=18)
This script now gives me the graph shown below,
The issue I have with this graph is that the panels are slightly misaligned on the left-hand side. Any suggestions on how to align both the left-hand and right-hand sides of the panels?
Thanks,
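One direction that might address the misalignment is to convert both plots to gtables and give them identical column widths before stacking them. The following is only a sketch under assumptions: `top` and `bottom` are small stand-in plots (not p1 and p2), both have their legends removed, and both produce gtables with the same number of columns, which is checked explicitly. Whether this carries over unchanged to the 30-panel plots above has not been verified here.

```r
library(ggplot2)
library(grid)
library(gridExtra)

# two stand-in plots with the same facets but y-axis labels of different width
top <- ggplot(mpg, aes(factor(year), hwy)) +
  geom_boxplot() +
  facet_grid(~ drv) +
  theme(legend.position = "none")
bottom <- ggplot(mpg, aes(factor(year), class)) +
  geom_point() +
  facet_grid(~ drv) +
  theme(legend.position = "none")

g_top    <- ggplotGrob(top)
g_bottom <- ggplotGrob(bottom)

# the width trick only works when both layouts have the same columns
stopifnot(length(g_top$widths) == length(g_bottom$widths))

# take the larger width column by column and apply it to both gtables
max_w <- unit.pmax(g_top$widths, g_bottom$widths)
g_top$widths    <- max_w
g_bottom$widths <- max_w

# stack the two gtables; numeric heights are treated as relative sizes
aligned <- arrangeGrob(g_top, g_bottom, ncol = 1, heights = c(5, 1))
# grid.newpage(); grid.draw(aligned)   # uncomment to draw on the current device
```

Because both gtables then share the same axis and panel column widths, the panel edges should line up on both sides; the legend would still need to be placed separately, as in the script above.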