I want to obtain the maximum likelihood parameters (MLE) for a cumulative normal curve fitted to some proportion data by direct minimization of the negative log likelihood (without using glm). For some initial values introduced to optim() there is not problem:
x <- c(-0.250, -0.056, 0.137, 0.331, 0.525, 0.719, 0.912, 1.100, 1.300)
k <- c(0, 0, 5, 11, 12, 12, 12, 12, 12)
n <- c(12, 12, 12, 12, 12, 12, 12, 12, 12)
nll <- function(p) {
phi <- pnorm(x, p[1], p[2])
-sum(k * log(phi) + (n - k) * log(1 - phi))
}
para<- optim(c(0.5, 0.1), nll)$par
xseq <- seq(-.5, 1.5, len = 100)
yseq <- pnorm(xseq, para[1],para[2])
curve <- data.frame(xseq, yseq)
dat <- data.frame(x, k, n)
library(ggplot2)
ggplot(dat,aes(x = x, y = k / n)) +
geom_point()+
geom_line(data = curve, aes(x = xseq, y = yseq))
But, if I use initial values that are actually closer to the MLE parameters
para<- optim(c(0.1, 0.1), nll)$par
I obtained the following error:
Error in optim(c(0.1, 0.1), nll) : function cannot be evaluated at initial parameters
It seems that the error is caused by some infinities in the evaluation of the negative log likelihood. I found that if I increase the precision using the log.p
option of pnorm, I don't obtatin the error
nll <- function(p) {
logphi1 <- pnorm(x, p[1], p[2], lower.tail = T, log.p = T)
logphi2 <- pnorm(x, p[1], p[2], lower.tail = F, log.p = T)
-sum(k * logphi1 + (n - k) * logphi2)
}
para<- optim(c(0.1, 0.1), nll)$par
but the problem is that in addition to pnorm
I also want to fit curves that are a + b * pnorm
with a
and b
constants and in those cases I cannot used log.p
to increase the precision.