I'm trying to fit a negative binomial glm for two different conditions to my data. first, some toy data :
value times variable
1 82.21236 0.0000000 B
2 130.69185 0.0000000 A
3 159.10491 1.3131313 B
4 136.94357 0.6060606 A
5 192.22455 3.1313131 B
6 149.96539 3.1313131 A
7 115.91152 4.5454545 B
8 95.26077 4.2424242 A
9 73.79734 6.2626263 B
10 71.43359 6.1616162 A
11 106.83029 7.4747475 B
12 134.01414 7.0707071 A
13 44.66716 8.6868687 B
14 57.47017 8.6868687 A
15 41.02301 9.8989899 B
16 42.47003 9.4949495 A
17 66.26286 0.0000000 B
18 122.70818 0.0000000 A
19 187.01966 1.6161616 B
20 199.92595 1.6161616 A
21 138.26999 2.9292929 B
22 94.63155 3.2323232 A
23 149.99105 4.5454545 B
24 121.49791 4.1414141 A
25 107.17931 5.6565657 B
26 91.04130 5.7575758 A
27 84.03087 7.7777778 B
28 62.17754 7.6767677 A
29 52.81123 8.9898990 B
30 72.61422 7.5757576 A
31 52.33281 10.0000000 B
32 39.60495 9.6969697 A
(the libraries : library(ggplot2) library(MASS) )
My aim is to retrieve the model that is fit to my data for downstream analysis, no just to visualize it, so i first went for the glm.nb function of the MASS package, but it fails to fit the data and I don't know why, especially since the same method succeeds with ggplot. here is the code i've used so far :
ans = glm.nb(data = data, formula = value~splines::bs(times,Boundary.knots = c(0,10), knots = c(3), degree = 3, intercept = F ):variable)
data$glm_nb = predict(ans) #make the glm model and predict the new values
p=ggplot(data, aes(x=times, y=value, group=variable)) + #plot it
facet_grid(.~variable)+theme_bw() +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(x="times",y="Value")+
geom_point(size=2,alpha = 0.2) + theme_bw(base_size = 22)+
stat_smooth(method = "glm.nb", formula = y~splines::bs(x,Boundary.knots = c(0,10), knots = c(3), degree = 3, intercept = F), color = "green", size = 0.3)+ #ggplot with the same model
geom_line(aes(x = times, y = glm_nb), color = "blue")
both models, the ggplot and the other ones, give warnings that my x values are no integers, but ggplot still succeeds to fit the data.
What is especially wierd is that when I try to use a GLM, it actually works ! (same code, just swapped glm.nb for glm). I tried to look up the source code to see what does geom_smooth exactly do, but i couldn't find the precise line where it calculates its models. Any ideas ?