3
votes

My data frame looks like:

head(bush_status)
distance  status count
       0 endemic   844
       1  exotic     8
       5  native     3
      10 endemic     5
      15 endemic     4
      20 endemic     3

The count data are not normally distributed. I'm trying to fit a generalized additive model to my data in two ways so that I can use anova to see whether the p-value supports m2.

m1 <- gam(count ~ s(distance) + status, data=bush_status, family="nb")
m2 <- gam(count ~ s(distance, by=status) + status, data=bush_status, family="nb")

m1 works fine, but m2 sends the error message:

"Error in smoothCon(split$smooth.spec[[i]], data, knots, absorb.cons, 
scale.penalty = scale.penalty,  : 
  Can't find by variable"

This is pretty beyond me so if anyone could offer any advice that would be much appreciated!

1
m1 doesn't work fine using your example data. It returns "Error in smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) : A term has fewer unique covariate combinations than specified maximum degrees of freedom". - neilfws
Is status a factor variable? Please provide dput(bush_status). - Roland
@Roland status is a character variable - does it need to be a factor? dput() gives a very long output but last part reads: Names = c("distance", "status", "count"), row.names = c(NA, -702L), class = "data.frame") - Fbj9506
Yes, the variable passed to by must be a factor variable in mgcv. - Roland
@Roland it works! Thank you! - Fbj9506

1 Answer

13
votes

From your comments it became clear that you passed a character variable to the by argument of the smoother s(). You must pass a factor variable there. This has been a frequent gotcha for me too, and I consider it a design flaw (because base R regression functions deal with character variables just fine).
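As a minimal sketch of the fix, the character column can be converted with factor() before fitting. The data below are simulated stand-ins (the real bush_status was not posted in full), so the distances, counts, and negative binomial parameters are assumptions made only so the example runs on its own.

```r
library(mgcv)

set.seed(1)

# Simulated stand-in for bush_status (hypothetical values, not the asker's data).
# stringsAsFactors = FALSE keeps status as a character column, as in the question.
bush_status <- expand.grid(
  distance = seq(0, 100, by = 5),
  status   = c("endemic", "exotic", "native"),
  rep      = 1:5,
  stringsAsFactors = FALSE
)
bush_status$count <- rnbinom(
  nrow(bush_status),
  mu   = exp(3 - 0.02 * bush_status$distance),
  size = 2
)

# The fix: convert the by variable to a factor before calling gam()
bush_status$status <- factor(bush_status$status)

# One common smooth of distance, with status as a parametric term
m1 <- gam(count ~ s(distance) + status,
          data = bush_status, family = nb())

# One smooth of distance per level of status
m2 <- gam(count ~ s(distance, by = status) + status,
          data = bush_status, family = nb())

# Compare the two fits; mgcv documents these p-values as approximate
anova(m1, m2, test = "Chisq")
```

With a factor by variable, m2 contains one smooth of distance per level of status, and keeping status as a parametric term (as in the question) lets each level have its own mean.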