2
votes

I'm trying estimate parameters n and p from Binomial Distribution by Maximum Likelihood in R.

I'm using the function optim from stats package, but there is an error.

That is my code:

xi = rbinom(100, 20, 0.5) # Sample
n = length(xi) # Sample size

# Log-Likelihood
lnlike <- function(theta){
log(prod(choose(theta[1],xi))) + sum(xi*log(theta[2])) + 
(n*theta[1] - sum(xi))*log(1-theta[2])
}

# Optimizing 
optim(theta <- c(10,.3), lnlike, hessian=TRUE)

Error in optim(theta <- c(10, 0.3), lnlike, hessian = TRUE) : function cannot be evaluated at initial parameters

Anyone done this? Which function used?

2
What is the error?duffymo
I don't know how to get to the answer, but I also don't see an error... Could you post it?? Your question should be more focused around how to fix the error over how to get to the solution. (Getting to the solution might be just fixing the error though..)Cayce K
@ZheyuanLi This is an illustrative example for what I need.andre
@ZheyuanLi Thanks for remind me and for the answer.andre

2 Answers

8
votes

tl;dr you're going to get a likelihood of zero (and thus a negative-infinite log-likelihood) if the response variable is greater than the binomial N (which is the theoretical maximum value of the response). In most practical problems, N is taken as known and just the probability is estimated. If you do want to estimate N, you need to (1) constrain it to be >= the largest value in the sample; (2) do something special to optimize over a parameter that must be discrete (this is an advanced/tricky problem).

First part of this answer shows debugging strategies for identifying the problem, second illustrates a strategy for optimizing N and p simultaneously (by brute force over a reasonable range of N).

Setup:

set.seed(101)
n <- 100
xi <- rbinom(n, size=20, prob=0.5) # Sample

Log-likelihood function:

lnlike <- function(theta){
    log(prod(choose(theta[1],xi))) + sum(xi*log(theta[2])) + 
       (n*theta[1] - sum(xi))*log(1-theta[2])
}

Let's break this down.

theta <- c(10,0.3)  ## starting values
lnlike(c(10,0.3))  ## -Inf

OK, the log-likelihood is -Inf at the starting value. Not surprising that optim() can't work with that.

Let's work through the terms.

log(prod(choose(theta[1],xi))) ## -Inf

OK, we're already in trouble on the first term.

prod(choose(theta[1],xi)) ## 0

The product is zero ... why?

choose(theta[1],xi)
##  [1] 120 210  10   0   0  10 120 210   0   0  45 210   1   0

Lots of zeros. Why? What are the values of xi that are problematic?

## [1]  7  6  9 12 11  9  7  6

Aha! We're OK for 7, 6, 9 ... but in trouble with 12.

badvals <- (choose(theta[1],xi)==0)
all(badvals==(xi>10))  ## TRUE

If you really want to do this, you can do it by brute-force enumeration over reasonable values of n ...

## likelihood function
llik2 <- function(p,n) {
    -sum(dbinom(xi,prob=p,size=n,log=TRUE))
}
## possible N values (15 to 50)
nvec <- max(xi):50
Lvec <- numeric(length(nvec))
for (i in 1:length(nvec)) {
    ## optim() wants method="Brent"/lower/upper for 1-D optimization
    Lvec[i] <- optim(par=0.5,fn=llik2,n=nvec[i],method="Brent",
                     lower=0.001,upper=0.999)$val
}
nvec[which.min(Lvec)]  ## 20
par(las=1,bty="l")
plot(nvec,Lvec,type="b")

enter image description here

3
votes

Why you get into trouble?

If you do lnlike(c(10, 0.3)), you get -Inf. That's why your error message is complaining lnlike, rather than optim.

Often, n is known, and only p needs be estimated. In this situation, either moment estimator or maximum likelihood estimator is in closed form, and no numerical optimization is needed. So, it is really weird to estimate n.

If you do want to estimate, you have to be aware that it is constrained. Check

range(xi)  ## 5  15

You observations have range [5, 15], therefore, it is required that n >= 15. How can you pass an initial value 10? The searching direction for n, should be from a large starting value, and then gradually searching downward till it reaches max(xi). So, you might try 30 as the initial value for n.

Additionally, you don't need to define lnlike in the current way. Do this:

lnlike <- function(theta, x) -sum(dbinom(x, size = theta[1], prob = theta[2], log = TRUE))
  • optim is often used for minimization (though it can do maximization). I have put a minus sign in the function to get negative log likelihood. In this way, you are minimizing lnlike w.r.t. theta.
  • You should also pass your observations xi as additional argument to lnlike, rather than taking it from global environment.

Naive try with optim:

In my comment, I already said that I don't believe using optim to estimate n will work, because n must be integers while optim is used for continuous variables. These errors and warnings shall convince you.

optim(c(30,.3), fn = lnlike, x = xi, hessian = TRUE)

Error in optim(c(30, 0.3), fn = lnlike, x = xi, hessian = TRUE) : 
  non-finite finite-difference value [1]
In addition: There were 15 or more warnings (use warnings() to see the 
first 15

> warnings()
Warning messages:
1: In dbinom(x, size = theta[1], prob = theta[2], log = TRUE) : NaNs produced
2: In dbinom(x, size = theta[1], prob = theta[2], log = TRUE) : NaNs produced
3: In dbinom(x, size = theta[1], prob = theta[2], log = TRUE) : NaNs produced
4: In dbinom(x, size = theta[1], prob = theta[2], log = TRUE) : NaNs produced
5: In dbinom(x, size = theta[1], prob = theta[2], log = TRUE) : NaNs produced

Solution?

Ben has provided you a way. Instead of letting optim to estimate n, we manually do a grid search for n. For each candidate n, we perform a univariate optimization w.r.t. p. (Oops, in fact, there is no need to do numerical optimization here.) In this way, you are getting a profile likelihood of n. Then, we find n on the grid to minimize this profile likelihood.

Ben has provided you full details, and I shall not repeat that. Nice (and swift) work, Ben!