
I am trying to achieve the following with knitr, ggplot2 and xtable:

  • Generate several annotated plots of beta-distributions with ggplot2
  • Lay out the output so that each plot is followed by its corresponding summary statistics table.
  • Write the code so that both PDF and HTML reports can be generated in a presentable way

Here is my attempt at this task (Rnw file):

\documentclass{article}

\begin{document}

Test for ggplot2 with Knitr

<<Initialize, echo=FALSE>>=
library(ggplot2)
library(ggthemes)
library(data.table)
library(grid)
library(xtable)
library (plyr)

pltlist <- list()
statlist <- list()

@

The libraries are loaded. Now run the main loop


<<plotloop, echo=FALSE>>=
    for (k in seq(1,7)){
      x <- data.table(rbeta(100000,1.6,14+k))
      xmean <- mean(x$V1, na.rm=T)
      xqtl <- quantile(x$V1, probs = c(0.995), names=F)
      xdiff <- xqtl - xmean
      dens <- density(x$V1)
      xscale <- (max(dens$x, na.rm=T) - min(dens$x, na.rm=T))/100
      yscale <- (max(dens$y, na.rm=T))/100
      y_max <- max(dens$y, na.rm=T)
      y_intercept <- y_max-(10*yscale)
      data <- data.frame(x)

      y <- ggplot(data, aes(x=V1)) + geom_density(colour="darkgreen", size=2, fill="green",alpha=.3) +
        geom_vline(xintercept = xmean, colour="blue", linetype = "longdash") +
        geom_vline(xintercept = xqtl, colour="red", linetype = "longdash") +
        geom_segment(aes(x=xmean, xend=xqtl, y=y_intercept, yend=y_intercept), colour="red", linetype = "solid", arrow = arrow(length = unit(0.2, "cm"), ends = "both", type = "closed")) +
        annotate("text", x = xmean+xscale, y = y_max, label = paste("Val1:",round(xmean,4)), hjust=0) +
        annotate("text", x = xqtl+xscale, y = y_max, label = paste("Val2:",round(xqtl,4))) +
        annotate("text", x = xmean+10*xscale, y = y_max-15*yscale, label = paste("Val3:",round(xdiff,4))) +
        xlim(min(dens$x, na.rm=T), xqtl + 9*xscale) +
        xlab("Values") +
        ggtitle("Beta Distribution") +
        theme_bw() +
        theme(plot.title = element_text(hjust = 0, vjust=2))

      pltlist[[k]] <- y
      statlist[[k]] <- list(mean=xmean, quantile=xqtl) 

}

stats <- ldply(statlist, data.frame)
@

Plots are ready. Now Plot them

<<PrintPlots, warning=FALSE, results='asis', echo=FALSE, cache=TRUE,  fig.height=3.5>>=
for (k in seq(1,7)){
  print(pltlist[[k]])
  print(xtable(stats[k,], caption="Summary Statistics", digits=6))
}

@

Plotting Finished.


\end{document}

I am faced with several issues after running this code.

  1. When I run this as plain R code and print the plots from the list, the horizontal line drawn by geom_segment ends up in the wrong place. If I plot the figures individually, without putting them in a list, they look as I expect.
  2. Only the last plot in the list is correct. In all the other plots, the geom_segment line appears to move around randomly.
  3. I am also unable to give each plot its own caption, as I can for the tables.

Points to note :

  • I am storing the beta-random numbers in data.table since in our actual code, we are using data.table. However for the purposes of testing ggplot2 in this way, I convert the data.table into a data.frame, as ggplot2 requires.
  • I also need to generate the random numbers within the loop and generate the plots per iteration (so something like first generating the random numbers and then using melt would not work here), since generating the random numbers is emulating a complex database call per iteration of the loop.

I am using RStudio Version 0.98.1091 and R version 3.1.2 (2014-10-31) on Windows 8.1

This is the expected Plot: Expected Plot

This is the plot I am getting when plotting from the list: Plot from the list

My output in PDF form : PDF Output

Please advise if you have any ideas for solutions.

Thank you,

SG

2
1. It appears you're using Sweave to generate the output. I don't see substantial LaTeX in your code, so the knitr package may be better for your purpose. knitr can output both PDF and HTML documents. 2. The links to "Expected Plot" and "Plot from the list" are broken. - zhaoy
Thanks for the comment, zhaoy. I have changed the code to work with knitr. I don't know why the links to the figures are not working directly. However, if you right-click and open the image in a new tab, the images seem to be there. - SGH

2 Answers

1
votes

I don't know why the horizontal line in geom_segment is "moving around" from plot to plot rather than spanning xmean to xqtl. However, I was able to get the line in the correct location by taking its endpoints from the stats data frame instead of from the mean and quantile computed directly in the loop. You just have to create the stats data frame before the loop, rather than after it, so that you can use it inside the loop.

  stats <- ldply(statlist, data.frame)

  for (k in seq(1,7)){
    ...

    y <- ggplot(data, aes(x=V1)) + 
        ...
        geom_segment(aes(x=stats[k,1], xend=stats[k,2], y=y_intercept, yend=y_intercept), 
                 colour="red", linetype = "solid", 
                 arrow = arrow(length = unit(0.2, "cm"), ends = "both", type = "closed")) +
        ...

  pltlist[[k]] <- y
  statlist[[k]] <- list(mean=xmean, quantile=xqtl) 
  }

Hopefully, someone else will be able to explain the anomalous behavior, but at least this seems to fix the problem.
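
One possible explanation, offered as an assumption rather than a confirmed diagnosis: aes() stores expressions and ggplot2 evaluates them only when the plot is printed, so xmean, xqtl and y_intercept would be looked up after the loop has finished and would hold the values from the last iteration. Under that assumption, a minimal sketch of an alternative is to draw the segment with annotate(), which takes plain values at the time the layer is created. The names draws and y_seg, the loop length of 3 and the sample size are illustrative only and not part of the original code.

```r
library(ggplot2)
library(grid)  # arrow() and unit()

set.seed(1)  # reproducible draws for this sketch
pltlist  <- list()
statlist <- list()

for (k in seq_len(3)) {
  # Stand-in for the per-iteration data source in the question
  draws <- data.frame(V1 = rbeta(10000, 1.6, 14 + k))
  xmean <- mean(draws$V1)
  xqtl  <- quantile(draws$V1, probs = 0.995, names = FALSE)
  y_seg <- 0.9 * max(density(draws$V1)$y)

  pltlist[[k]] <- ggplot(draws, aes(x = V1)) +
    geom_density(colour = "darkgreen", fill = "green", alpha = 0.3) +
    # annotate() receives plain values, so the current xmean, xqtl and
    # y_seg are stored in the layer now rather than looked up at print time
    annotate("segment", x = xmean, xend = xqtl, y = y_seg, yend = y_seg,
             colour = "red",
             arrow = arrow(length = unit(0.2, "cm"), ends = "both",
                           type = "closed"))

  statlist[[k]] <- data.frame(mean = xmean, quantile = xqtl)
}

# One row of summary statistics per plot
stats <- do.call(rbind, statlist)
```

Printing pltlist[[1]] after the loop should then show the segment between that iteration's own mean and quantile, independent of what the loop variables hold afterwards.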

For the figure caption, you can add a fig.cap argument to the chunk where you plot the figures. Note that this gives every figure the same caption and causes the figures and tables to be output in separate groups rather than interleaved:

<<PrintPlots, warning=FALSE, results='asis', echo=FALSE, cache=TRUE, fig.cap="Caption", fig.height=3.5>>=
for (k in seq(1,7)){
  print(pltlist[[k]])
  print(xtable(stats[k,], caption="Summary Statistics", digits=6))
}
@
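
To get a different caption per figure, one option that may work is to pass a character vector to fig.cap, since knitr is understood to pick the caption by plot index when a chunk produces several plots. The sketch below only builds such a vector. The name captions and the caption wording are hypothetical, and the chunk header in the comment is an assumption about how it would be wired in, not something tested against the asker's setup.

```r
# Hypothetical caption vector with one entry per plot in the loop;
# the second shape parameter mirrors rbeta(100000, 1.6, 14 + k) above.
captions <- sprintf("Beta(1.6, %d) density", 14 + seq_len(7))

# In the .Rnw file the chunk header would then reference the vector, e.g.:
# <<PrintPlots, results='asis', echo=FALSE, fig.height=3.5, fig.cap=captions>>=
```

The vector has to be defined in an earlier chunk, because chunk options are evaluated before the chunk body runs.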
0
votes

You might want to use R Markdown with knitr, which is easier than combining LaTeX and R (as zhaoy also suggested).

You might also want to check out the ReporteRs package. I think it is actually easier to use than knitr. It cannot generate PDFs itself, but you can use pandoc to convert its output into PDFs.