I am using hexbin() to bin data into hexagon objects, and ggplot() to plot the results. I notice that, sometimes, the binning data frame contains a different number of hexagons than the plot that results from plotting that same binning data frame. Below is an example.

library(hexbin)
library(ggplot2)

set.seed(1)
data <- data.frame(A=rnorm(100), B=rnorm(100), C=rnorm(100), D=rnorm(100), E=rnorm(100))
maxVal = max(abs(data))
maxRange = c(-1*maxVal, maxVal)

x = data[,c("A")]
y = data[,c("E")]
h <- hexbin(x=x, y=y, xbins=5, shape=1, IDs=TRUE, xbnds=maxRange, ybnds=maxRange)
hexdf <- data.frame (hcell2xy (h),  hexID = h@cell, counts = h@count)

# Both objects below indicate there are 17 hexagons
# hexdf
# table(h@cID)

# However, plotting only shows 16 hexagons
ggplot(hexdf, aes(x = x, y = y, fill = counts, hexID = hexID)) +
  geom_hex(stat = "identity") +
  scale_x_continuous(limits = maxRange) +
  scale_y_continuous(limits = maxRange)

In this example, the hexdf data frame contains 17 hexagons, but the plot produced by ggplot(hexdf) shows only 16, as shown below.

ggplot(hexdf) shows 16 hexagons

Note: The syntax above may seem cumbersome, but this is a minimal working example (MWE) for a more complex goal, and I am intentionally keeping some components so that any solution extends to that goal. For instance, I need maxRange to be computed from the original data frame called data (which contains the additional columns "B", "C", and "D"). That said, parts of my syntax may be unnecessarily cumbersome and may be causing the problem, so I am happy to change them.

Any ideas what might be causing this discrepancy and how to fix it? Thank you!

1 Answer

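The last hexagon is missing because it lies (partly) outside the limits you set. Limits passed to scale_x_continuous() and scale_y_continuous() discard data that fall outside them, so that hexagon is dropped before the plot is drawn. It is included if you widen the limits, for example: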

ggplot(hexdf, aes(x = x, y = y, fill = counts, hexID = hexID)) + 
  geom_hex(stat = "identity") + 
  scale_x_continuous(limits = maxRange * 1.5) + 
  scale_y_continuous(limits = maxRange * 1.5)

Alternatively, use coord_cartesian(), which zooms the plot without discarding any data:

ggplot(hexdf, aes(x = x, y = y, fill = counts, hexID = hexID)) +
  geom_hex(stat = "identity") +
  coord_cartesian(xlim = maxRange, ylim = maxRange)

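To see which hexagon is affected, one option is to compare the hexagon centres returned by hcell2xy() with the limits. The sketch below rebuilds hexdf from the question so it runs on its own; the variable name outside is introduced here for illustration, and the check assumes that the scales filter on the centre coordinates stored in the x and y columns.

```r
library(hexbin)

# Rebuild the question's binning so the sketch is self-contained.
set.seed(1)
data <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100),
                   D = rnorm(100), E = rnorm(100))
maxVal <- max(abs(data))
maxRange <- c(-maxVal, maxVal)

h <- hexbin(x = data$A, y = data$E, xbins = 5, shape = 1, IDs = TRUE,
            xbnds = maxRange, ybnds = maxRange)
hexdf <- data.frame(hcell2xy(h), hexID = h@cell, counts = h@count)

# Flag hexagons whose centre lies outside the limits given to the scales
# (assumption: filtering happens on these centre coordinates).
outside <- hexdf$x < maxRange[1] | hexdf$x > maxRange[2] |
           hexdf$y < maxRange[1] | hexdf$y > maxRange[2]

# Rows listed here are candidates for being dropped by the scale limits.
hexdf[outside, ]
```

If the explanation above holds, the rows printed should correspond to the hexagon missing from the first plot. Centres can fall outside xbnds or ybnds because the hexagonal lattice does not have to end exactly on the bounds.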