I'm trying to fit a GAM in R and I'm getting a strange error message.
I have counts, each taken from a known volume of water sampled, and I want to correct the counts for that volume. I'm trying to fit a smooth function of the counts against depth that accounts for the differences in volume sampled.
test <- structure(list(depth = c(2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5,
37.5, 42.5, 47.5, 52.5, 57.5, 62.5, 67.5, 72.5, 77.5, 82.5, 87.5,
92.5, 97.5), count = c(53323, 665, 1090, 491, 540, 514, 612,
775, 601, 497, 295, 348, 357, 294, 292, 968, 455, 148, 155, 101
), vol = c(2119.92, 111.76, 156.64, 71.28, 77.44, 73.92, 62.48,
78.32, 74.8, 81.84, 53.68, 80.96, 80.08, 79.2, 79.2, 77.44, 77.44,
84.48, 73.04, 59.84)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -20L), .Names = c("depth", "count", "vol"
))
library(mgcv)  # assuming gam() here is mgcv::gam
gam(count ~ s(depth) + offset(vol), data = test, family = "poisson")
Error in if (pdev - old.pdev > div.thresh) { : missing value where TRUE/FALSE needed
Any idea why this is not working? If I drop the offset, or if I set family = "gaussian", the function runs as one would expect.
Edit: I find that
gam(count ~ s(depth) + offset(log(vol)), data = test, family = "poisson")
does run. I think I read that the offset variable should be log-transformed for models like this, so maybe this version is actually working correctly.
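
A quick sanity check on why the log might matter, assuming the default log link for the Poisson family: an offset is added to the linear predictor as-is, so offset(log(vol)) gives log(E[count]) = s(depth) + log(vol), i.e. E[count] = vol * exp(s(depth)), which is a rate per unit volume. With offset(vol) the multiplier would instead be exp(vol), and for the first sample that overflows. The sketch below uses base R only and copies the first three volumes from the data above; it illustrates the arithmetic and is not a confirmed diagnosis of the mgcv error.

```r
# First three sampled volumes, copied from the `test` data above.
vol <- c(2119.92, 111.76, 156.64)

# With a log link, offset(vol) implies the expected count is
# multiplied by exp(vol). For vol = 2119.92 this overflows.
mult_raw <- exp(vol)
is.finite(mult_raw)  # FALSE for the first element

# offset(log(vol)) implies a multiplier of exp(log(vol)) = vol,
# so the smooth describes counts per unit volume.
mult_log <- exp(log(vol))
all.equal(mult_log, vol)
```

If the non-finite multiplier is what ends up in the deviance calculation, that would plausibly explain the "missing value where TRUE/FALSE needed" message, though that link is an assumption.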