I would like to superimpose, on each lattice histogram panel, an additional histogram (which will be the same in each panel). I want the overlaid histogram to have solid borders but an empty fill (col), to allow comparison with the underlying histograms.

That is, the end result will be a series of panels, each with a different colored histogram, and each with the same extra outline histogram on top of the colored histogram.

Here's something that I tried, but it just produces empty panels:

foo.df <- data.frame(x=rnorm(40), categ=c(rep("A", 20), rep("B", 20)))
bar.df <- data.frame(x=rnorm(20))
histogram(~ x | categ, data=foo.df,
          panel=function(...){histogram(...);
                              histogram(~ x, data=bar.df, col=NULL)})

(My guess is that I need to use panel.superpose, but this function is somewhat confusing. Sarkar's book doesn't explain how to use it, and the R help page has no examples. I'm finding it difficult to make sense of the panel.superpose help page without already having a basic understanding. I've found a very small number of examples on the web, but I have been unable to figure out which aspects of those examples apply to my case. This answer is surely relevant, but I don't understand its use of panel.groups, and the example overlays three different groups from a single dataframe, whereas I want to repeatedly overlay the same data on multiple panels that also have different data.)

I illustrate another partial solution here, where I ask a question designed to help answer this one. If I learn how to provide a complete solution to this question before anyone else answers, I'll post it as an answer. - Mars

1 Answer

I continued working on this problem, and came up with an answer. I had been on the right track but got several crucial details wrong. Comments in the code below spell out important points.

library(lattice)

# Main data, which will be displayed as solid histograms, different in each panel:
foo.df <- data.frame(y=rnorm(40), cat=c(rep("A", 20), rep("B", 20)))
# Comparison data: This will be displayed as an outline histogram in each panel:
bar.df <- data.frame(y=rnorm(30)-2)

# Define some vectors that we'll use in the histogram call.
# These have to be adjusted for the data by trial and error.
# Usually, panel.histogram will figure out reasonable default values for these.
# However, the two calls to panel.histogram below may figure out different values,
# producing pairs of histograms that aren't comparable.
bks <- seq(-5,3,0.5)  # breaks that define the bar bins
yl <- c(0,50)         # y-axis limits (percent of total)

# The key is to coordinate breaks in the two panel.histogram calls below.
# The first one inherits the breaks from the top-level call through '...' .
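# Using "..." in the second call generates an error (presumably because "..." already
# carries x), so I specify parameters explicitly.
# It's not necessary to specify type="percent" at the top level, since that's the default
# when numeric, equally spaced breaks are given, but it is necessary to specify it in the
# second panel.histogram call, whose own default is type="density".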
histogram(~ y | cat, data=foo.df, ylim=yl, breaks=bks, type="percent", border="cyan",
          panel=function(...){panel.histogram(...)
                              panel.histogram(x=bar.df$y, col="transparent",
                                              type="percent", breaks=bks)})

# col="transparent" is what makes the second set of bars into outlines.
# In the first set of bars, I set the border color to be the same as the value of col
# (cyan by default) rather than using border="transparent" because otherwise a filled
# bar with the same number of points as an outline bar will be slightly smaller.
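
A possible way around the "..." error mentioned in the comments, offered as a sketch rather than a definitive fix: if the panel function names x explicitly, x is no longer part of "...", so the remaining arguments (breaks, type) can be forwarded to both panel.histogram calls. This assumes border is set inside the panel function instead of at the top level, since it would otherwise be forwarded to the outline histogram too. The set.seed(1) call and the object name p are illustrative additions, not part of the code above.

```r
library(lattice)

set.seed(1)  # illustrative; keeps the random data inside the range of bks
foo.df <- data.frame(y = rnorm(40), cat = rep(c("A", "B"), each = 20))
bar.df <- data.frame(y = rnorm(30) - 2)
bks <- seq(-5, 3, 0.5)

# Naming x in the panel function removes it from '...', so '...'
# (which carries breaks and type) can be passed to both calls
# without supplying x twice.
p <- histogram(~ y | cat, data = foo.df, ylim = c(0, 50),
               breaks = bks, type = "percent",
               panel = function(x, ...) {
                 # filled histogram of this panel's own data
                 panel.histogram(x, border = "cyan", ...)
                 # outline histogram of the comparison data
                 panel.histogram(bar.df$y, col = "transparent",
                                 border = "black", ...)
               })
print(p)
```

Because breaks and type reach both calls through "...", they only need to be stated once, at the top level.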

Figure: two panels, each containing the same outline histogram superimposed on a different cyan histogram.
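
The trial-and-error step for bks could instead be derived from the pooled data, so that every point in both data sets is guaranteed to fall inside the bins. A minimal sketch, assuming the same bin width of 0.5; binwidth and rng are names introduced here for illustration only:

```r
set.seed(1)  # illustrative data with the same shape as in the answer
foo.df <- data.frame(y = rnorm(40), cat = rep(c("A", "B"), each = 20))
bar.df <- data.frame(y = rnorm(30) - 2)

binwidth <- 0.5
rng <- range(foo.df$y, bar.df$y)  # pooled range of both data sets

# Round the limits outward to multiples of binwidth so that the
# breaks span all of the data.
bks <- seq(floor(rng[1] / binwidth) * binwidth,
           ceiling(rng[2] / binwidth) * binwidth,
           by = binwidth)
```

The resulting bks can then be passed as breaks=bks exactly as before; ylim still has to be chosen by eye.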