Maximum Likelihood Estimation for three-parameter Weibull distribution in r

Question

I want to estimate the scale, shape and threshold parameters of a 3p Weibull distribution.

What I've done so far is the following:

Refering to this post, Fitting a 3 parameter Weibull distribution in R

I've used the functions

    EPS = sqrt(.Machine$double.eps) # "epsilon" for very small numbers

llik.weibull <- function(shape, scale, thres, x)
{ 
  sum(dweibull(x - thres, shape, scale, log=T))
}

thetahat.weibull <- function(x)
{ 
  if(any(x <= 0)) stop("x values must be positive")

  toptim <- function(theta) -llik.weibull(theta[1], theta[2], theta[3], x)

  mu = mean(log(x))
  sigma2 = var(log(x))
  shape.guess = 1.2 / sqrt(sigma2)
  scale.guess = exp(mu + (0.572 / shape.guess))
  thres.guess = 1

  res = nlminb(c(shape.guess, scale.guess, thres.guess), toptim, lower=EPS)

  c(shape=res$par[1], scale=res$par[2], thres=res$par[3])
}

to "pre-estimate" my Weibull parameters, such that I can use them as initial values for the argument "start" in the "fitdistr" function of the MASS-Package.

You might ask why I want to estimate the parameters twice... reason is that I need the variance-covariance-matrix of the estimates which is also estimated by the fitdistr function.

EXAMPLE:

    set.seed(1)

    thres <- 450
    dat <- rweibull(1000, 2.78, 750) + thres

pre_mle <- thetahat.weibull(dat)

    my_wb <- function(x, shape, scale, thres) { 
        dweibull(x - thres, shape, scale)
    }

    ml <- fitdistr(dat, densfun = my_wb, start = list(shape = round(pre_mle[1], digits = 0), scale = round(pre_mle[2], digits = 0), 
    thres = round(pre_mle[3], digits = 0)))

    ml

     > ml
       shape        scale        thres   
       2.942548   779.997177   419.996196   (  0.152129) ( 32.194294) ( 28.729323)

     > ml$vcov
                shape       scale       thres
    shape  0.02314322    4.335239   -3.836873
    scale  4.33523868 1036.472551 -889.497580
    thres -3.83687258 -889.497580  825.374029

This works quite well for cases where the shape parameter is above 1. Unfortunately my approach should deal with the cases where the shape parameter could be smaller than 1.

The reason why this is not possible for shape parameters that are smaller than 1 is described here: http://www.weibull.com/hotwire/issue148/hottopics148.htm

in Case 1, All three parameters are unknown the following is said:

"Define the smallest failure time of ti to be tmin. Then when γ → tmin, ln(tmin - γ) → -∞. If β is less than 1, then (β - 1)ln(tmin - γ) goes to +∞ . For a given solution of β, η and γ, we can always find another set of solutions (for example, by making γ closer to tmin) that will give a larger likelihood value. Therefore, there is no MLE solution for β, η and γ."

This makes a lot of sense. For this very reason I want to do it the way they described it on this page.

"In Weibull++, a gradient-based algorithm is used to find the MLE solution for β, η and γ. The upper bound of the range for γ is arbitrarily set to be 0.99 of tmin. Depending on the data set, either a local optimal or 0.99tmin is returned as the MLE solution for γ."

I want to set a feasible interval for gamma (in my code called 'thres') such that the solution is between (0, .99 * tmin).

Does anyone have an idea how to solve this problem?

In the function fitdistr there seems to be no opportunity doing a constrained MLE, constraining one parameter.

Another way to go could be the estimation of the asymptotic variance via the outer product of the score vectors. The score vector could be taken from the above used function thetahat.weibul(x). But calculating the outer product manually (without function) seems to be very time consuming and does not solve the problem of the constrained ML estimation.

Best regards, Tim

Ben Bolker Ben Bolker · Accepted Answer · 2016-07-31T23:15:00

It's not too hard to set up a constrained MLE. I'm going to do this in bbmle::mle2; you could also do it in stats4::mle, but bbmle has some additional features.

The larger issue is that it's theoretically difficult to define the sampling variance of an estimate when it's on the boundary of the allowed space; the theory behind Wald variance estimates breaks down. You can still calculate confidence intervals by likelihood profiling ... or you could bootstrap. I ran into a variety of optimization issues when doing this ... I haven't really thought about wether there are specific reasons

Reformat three-parameter Weibull function for mle2 use (takes x as first argument, takes log as an argument):

dweib3 <- function(x, shape, scale, thres, log=TRUE) { 
    dweibull(x - thres, shape, scale, log=log)
}

Starting function (slightly reformatted):

weib3_start <- function(x) {
   mu <- mean(log(x))
   sigma2 <- var(log(x))
   logshape  <- log(1.2 / sqrt(sigma2))
   logscale <- mu + (0.572 / logshape)
   logthres <- log(0.5*min(x))
   list(logshape = logshape, logsc = logscale, logthres = logthres)
}

Generate data:

set.seed(1)
dat <- data.frame(x=rweibull(1000, 2.78, 750) + 450)

Fit model: I'm fitting the parameters on the log scale for convenience and stability, but you could use boundaries at zero as well.

tmin <- log(0.99*min(dat$x))
library(bbmle)
m1 <- mle2(x~dweib3(exp(logshape),exp(logsc),exp(logthres)),
           data=dat,
           upper=c(logshape=Inf,logsc=Inf,
                   logthres=tmin),
           start=weib3_start(dat$x),
           method="L-BFGS-B")

vcov(m1), which should normally provide a variance-covariance estimate (unless the estimate is on the boundary, which is not the case here) gives NaN values ... not sure why without more digging.

library(emdbook)
tmpf <- function(x,y) m1@minuslogl(logshape=x,
                                         logsc=coef(m1)["logsc"],
                                         logthres=y)
tmpf(1.1,6)
s1 <- curve3d(tmpf,
              xlim=c(1,1.2),ylim=c(5.9,tmin),sys3d="image")
with(s1,contour(x,y,z,add=TRUE))

h <- lme4:::hessian(function(x) do.call(m1@minuslogl,as.list(x)),coef(m1))
vv <- solve(h)
diag(vv) ## [1] 0.002672240 0.001703674 0.004674833
(se <- sqrt(diag(vv))) ## standard errors
## [1] 0.05169371 0.04127558 0.06837275
cov2cor(vv)
##            [,1]       [,2]       [,3]
## [1,]  1.0000000  0.8852090 -0.8778424
## [2,]  0.8852090  1.0000000 -0.9616941
## [3,] -0.8778424 -0.9616941  1.0000000

This is the variance-covariance matrix of the log-scaled variables. If you want to convert to the variance-covariance matrix on the original scale, you need to scale by (x_i)*(x_j) (i.e. by the derivatives of the transformation exp(x)).

outer(exp(coef(m1)),exp(coef(m1))) * vv
##             logshape       logsc    logthres
## logshape  0.02312803    4.332993   -3.834145
## logsc     4.33299307 1035.966372 -888.980794
## logthres -3.83414498 -888.980794  824.831463

I don't know why this doesn't work with numDeriv - would be very careful with variance estimates above. (Maybe too close to boundary for Richardson extrapolation to work?)

library(numDeriv)
hessian()
grad(function(x) do.call(m1@minuslogl,as.list(x)),coef(m1))  ## looks OK
vcov(m1)

The profiles look OK ... (we have to supply std.err because the Hessian isn't invertible)

pp <- profile(m1,std.err=c(0.01,0.01,0.01))
par(las=1,bty="l",mfcol=c(1,3))
plot(pp,show.points=TRUE)

confint(pp)
##              2.5 %   97.5 %
## logshape 0.9899645 1.193571
## logsc    6.5933070 6.755399
## logthres 5.8508827 6.134346

Alternately, we can do this on the original scale ... one possibility would be to use the log-scaling to fit, then refit starting from those parameters on the original scale.

wstart <- as.list(exp(unlist(weib3_start(dat$x))))
names(wstart) <- gsub("log","",names(wstart))
m2 <- mle2(x~dweib3(shape,sc,thres),
           data=dat,
           lower=c(shape=0.001,sc=0.001,thres=0.001),
           upper=c(shape=Inf,sc=Inf,
                   thres=exp(tmin)),
           start=wstart,
           method="L-BFGS-B")
vcov(m2)
##             shape          sc       thres
## shape  0.02312399    4.332057   -3.833264
## sc     4.33205658 1035.743511 -888.770787
## thres -3.83326390 -888.770787  824.633714
all.equal(unname(coef(m2)),unname(exp(coef(m1))),tol=1e-4)

About the same as the values above.

We can fit with a small shape, if we are a little more careful to bound the paraameters, but now we end up on the boundary for the threshold, which will cause lots of problems for the variance calculations.

set.seed(1)
dat <- data.frame(x = rweibull(1000, .53, 365) + 100)
tmin <- log(0.99 * min(dat$x))
m1 <- mle2(x ~ dweib3(exp(logshape), exp(logsc), exp(logthres)),
   lower=c(logshape=-10,logscale=0,logthres=0),
   upper = c(logshape = 20, logsc = 20, logthres = tmin),
   data = dat, 
   start = weib3_start(dat$x), method = "L-BFGS-B")

For censored data, you need to replace dweibull with pweibull; see Errors running Maximum Likelihood Estimation on a three parameter Weibull cdf for some hints.

Maximum Likelihood Estimation for three-parameter Weibull distribution in r

2 Answers