I use the glm.nb() function from the R MASS package to estimate the parameters of a negative binomial regression model. How can I calculate the predicted probability (the probability mass function) for new data? Which R function can I use?

My dataset is shown below. y follows a negative binomial distribution and x is a covariate. I use glm.nb(y ~ x, data=data) to estimate the model parameters. Given a new x and y, how can I calculate the predicted probability?

Is there a way to calculate it using Java?

y     x
91    1.000000
79    1.000000
86    1.000000
32    1.000000
41    1.000000
29    0.890609
44    1.000000
42    1.000000
31    0.734058
35    1.000000

When you say new data point, do you mean a new x and y? (It would be helpful to show a reproducible example.) - David Robinson

1 Answer

Let's say you set up your data like this:

library(MASS)  # provides glm.nb()

set.seed(1)
x = seq(-2, 8, .01)
y = rnbinom(length(x), mu=exp(x), size=10)
fit = glm.nb(y ~ x)

and you have a new point: you want to find the probability of y=100 given x=5.

You can get the predicted value of y from x using predict (with type="response" to tell it you want it after the inverse of the link function has been applied):

predicted.y = predict(fit, newdata=data.frame(x=5), type="response")
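
To make the role of type="response" concrete, here is a minimal sketch that compares the two prediction scales. It assumes the simulated data from above and the default log link of glm.nb(); the names new_point, eta and mu_hat are introduced here only for illustration.

```r
library(MASS)  # provides glm.nb()

set.seed(1)
x <- seq(-2, 8, .01)
y <- rnbinom(length(x), mu = exp(x), size = 10)
fit <- glm.nb(y ~ x)

new_point <- data.frame(x = 5)

# Linear predictor on the log scale: b0 + b1 * x
eta <- predict(fit, newdata = new_point, type = "link")

# Fitted mean on the scale of y
mu_hat <- predict(fit, newdata = new_point, type = "response")

# With a log link, the two agree after applying exp()
all.equal(unname(exp(eta)), unname(mu_hat))

# The same mean computed directly from the estimated coefficients
unname(exp(coef(fit)[1] + coef(fit)[2] * 5))
```

If the model were fitted with a different link, the exp() step would have to be replaced by that link's inverse, which is what type="response" handles automatically.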

Then you could find out the probability with:

dnbinom(100, mu=predicted.y, size=fit$theta)

(This is using fit$theta, the maximum likelihood estimate of the "size" parameter of the negative binomial).
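
For reference, with mean mu and size theta the probability that dnbinom returns is Gamma(y + theta) / (Gamma(theta) * y!) * (theta / (theta + mu))^theta * (mu / (theta + mu))^y. The sketch below writes this out by hand and checks it against dnbinom. The helper name nb_pmf and the values of mu and theta are illustrative assumptions, not estimates from the model above.

```r
# Negative binomial pmf with mean mu and size (dispersion) theta,
# computed on the log scale to avoid overflow for large counts
nb_pmf <- function(y, mu, theta) {
    log_p <- lgamma(y + theta) - lgamma(theta) - lgamma(y + 1) +
        theta * log(theta / (theta + mu)) +
        y * log(mu / (theta + mu))
    exp(log_p)
}

# Illustrative values only, not taken from a fitted model
mu <- 150
theta <- 10

nb_pmf(100, mu, theta)
dnbinom(100, mu = mu, size = theta)

# The hand-written version should agree with dnbinom over a range of counts
all.equal(nb_pmf(0:500, mu, theta), dnbinom(0:500, mu = mu, size = theta))
```

Because the expression needs only a log-gamma function, it is also one possible route to doing the calculation outside R (for example in Java, given a log-gamma implementation), once the fitted coefficients and theta have been exported.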

So in one function:

prob = function(newx, newy, fit) {
    dnbinom(newy, mu=predict(fit, newdata=data.frame(x=newx), type="response"), size=fit$theta)
}
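
A short usage sketch for this function, assuming the simulated data and fit from the start of the answer. Note that prob builds its newdata with a column called x, so it only works for models whose single predictor has that name.

```r
library(MASS)  # provides glm.nb()

set.seed(1)
x <- seq(-2, 8, .01)
y <- rnbinom(length(x), mu = exp(x), size = 10)
fit <- glm.nb(y ~ x)

prob <- function(newx, newy, fit) {
    mu <- predict(fit, newdata = data.frame(x = newx), type = "response")
    dnbinom(newy, mu = mu, size = fit$theta)
}

# Probability of observing y = 100 when x = 5
prob(5, 100, fit)

# Several (x, y) pairs at once, evaluated element by element
prob(c(4, 5, 6), c(50, 100, 400), fit)

# Whole predictive distribution at x = 5; the probabilities should sum to about 1
p <- prob(5, 0:2000, fit)
sum(p)
```

These probabilities plug in the point estimates of the coefficients and theta, so they do not reflect the uncertainty in those estimates.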