
I would like to perform nonlinear least squares regression in R where I simultaneously minimize the squared residuals of three models (see below). The three models share some of their parameters: in my example, b and d.

Is there a way of doing this with nls(), or with either of the packages minpack.lm or nlsr?

So, ideally, I would like to generate the objective function (the sum of least squares of all models together) and regress all parameters at once: a1, a2, a3, b, c1, c2, c3 and d.

(I am trying to avoid running three independent regressions and then averaging the resulting b and d values.)
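Written out, the objective described above would look roughly like this. The notation is an assumption made for illustration and is not in the original post: y_{k,i} is the i-th observation of dataset k, and n is the number of x values.

```latex
S(a_1, a_2, a_3, b, c_1, c_2, c_3, d)
  = \sum_{k=1}^{3} \sum_{i=1}^{n}
    \left( y_{k,i} - \left( a_k \, b^{\,x_i - c_k} + d \right) \right)^2
```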

my_model <- function(x, a, b, c, d) {
  a * b ^ (x - c) + d
}

# x values
x <- seq(0, 10, 0.2)

# Shared parameters
b <- 2
d <- 10

a1 <- 1
c1 <- 1
y1 <- my_model(x,
               a = a1,
               b = b,
               c = c1,
               d = d) + rnorm(length(x))

a2 <- 2
c2 <- 5
y2 <- my_model(x,
               a = a2,
               b = b,
               c = c2,
               d = d) + rnorm(length(x))

a3 <- -2
c3 <- 3
y3 <- my_model(x,
               a = a3,
               b = b,
               c = c3,
               d = d) + rnorm(length(x))

plot(
  y1 ~ x,
  xlim = range(x),
  ylim = d + c(-50, 50),
  type = 'b',
  col = 'red',
  ylab = 'y'
)
lines(y2 ~ x, type = 'b', col = 'green')
lines(y3 ~ x, type = 'b', col = 'blue')

I am confident about the statistical approach. My question is really about the implementation in R.

2 Answers


Below we run nls (using a slightly modified model) and nlxb (from nlsr), but nlxb stops before convergence. Despite these problems, both give results that visually fit the data well. The difficulties suggest a problem with the model itself, so in the Other section, guided by the nlxb output, we show how to fix it, giving a submodel of the original model that both nls and nlxb fit easily and that also fits the data well. At the end, in the Note section, we provide the data in reproducible form.

nls

Assuming the setup shown reproducibly in the Note at the end, reformulate the problem for the nls plinear algorithm by defining a right-hand-side matrix whose columns multiply the linear parameters a1, a2, a3 and d, respectively. plinear does not require starting values for those parameters, which simplifies the setup. It reports them as .lin1, .lin2, .lin3 and .lin4, respectively.
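To illustrate what "partially linear" means here, the following is a minimal base R sketch of the idea, not of the algorithm nls actually uses: for any fixed value of the nonlinear parameter, the linear parameters follow from ordinary linear least squares, so only the nonlinear parameter needs a search. To keep it short, the sketch assumes the shift c is fixed at 0 and searches b over the interval 1.2 to 4. The helper names design and profile_rss and that interval are choices made for this illustration.

```r
# Data as in the question, with a seed added for reproducibility
set.seed(123)
my_model <- function(x, a, b, c, d) a * b^(x - c) + d
x  <- seq(0, 10, 0.2)
y1 <- my_model(x, a = 1,  b = 2, c = 1, d = 10) + rnorm(length(x))
y2 <- my_model(x, a = 2,  b = 2, c = 5, d = 10) + rnorm(length(x))
y3 <- my_model(x, a = -2, b = 2, c = 3, d = 10) + rnorm(length(x))

xx <- c(x, x, x)
yy <- c(y1, y2, y3)
g  <- rep(1:3, each = length(x))

# For a fixed b the model is linear in the remaining parameters:
# one column per group (its multiplier) plus a constant column (d).
design <- function(b) {
  cbind((g == 1) * b^xx, (g == 2) * b^xx, (g == 3) * b^xx, 1)
}

# Residual sum of squares after solving for the linear parameters
profile_rss <- function(b) sum(lm.fit(design(b), yy)$residuals^2)

# One-dimensional search over the only nonlinear parameter
opt     <- optimize(profile_rss, interval = c(1.2, 4))
b_hat   <- opt$minimum
lin_hat <- lm.fit(design(b_hat), yy)$coefficients  # three multipliers, then d

b_hat
lin_hat
```

The linear coefficients play the role of .lin1 to .lin4 in the nls output, which is why plinear needs starting values only for the nonlinear parameters.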

To get starting values we used a simpler model with no grouping and a grid search over b from 1 to 10 and c also from 1 to 10, using nls2 from the package of the same name. Even with those starting values nls still produced errors, but with abs in the formula, as shown, it ran to completion.

These difficulties point to a fundamental problem with the model; the Other section discusses how to fix it.

xx <- c(x, x, x)
yy <- c(y1, y2, y3)

# starting values using nls2
library(nls2)
fo0 <- yy ~ cbind(b ^ abs(xx - c), 1)
st0 <- data.frame(b = c(1, 10), c = c(1, 10))
fm0 <- nls2(fo0, start = st0, alg = "plinear-brute")

# run nls using starting values from above
g <- rep(1:3, each = length(x))   
fo <- yy ~ cbind((g==1) * b ^ abs(xx - c[g]), 
                 (g==2) * b ^ abs(xx - c[g]),  
                 (g==3) * b ^ abs(xx - c[g]), 
                 1) 
st <- with(as.list(coef(fm0)), list(b = b, c = c(c, c, c)))
fm <- nls(fo, start = st, alg = "plinear")

plot(yy ~ xx, col = g)
for(i in unique(g)) lines(predict(fm) ~ xx, col = i, subset = g == i)

fm

giving:

Nonlinear regression model
  model: yy ~ cbind((g == 1) * b^abs(xx - c[g]), (g == 2) * b^abs(xx -     c[g]), (g == 3) * b^abs(xx - c[g]), 1)
   data: parent.frame()
     b     c1     c2     c3  .lin1  .lin2  .lin3  .lin4 
 1.997  0.424  1.622  1.074  0.680  0.196 -0.532  9.922 
 residual sum-of-squares: 133

Number of iterations to convergence: 5 
Achieved convergence tolerance: 5.47e-06

(continued after plot)

[Plot: the data with the fitted nls curves, colored by group]

nlsr

With nlsr it would be done like this. No grid search for starting values was needed, nor was abs. The b and d values are similar to those of the nls solution, but the other coefficients differ. Visually both solutions seem to fit the data.

On the other hand, the JSingval column shows that the Jacobian is rank deficient, which caused nlxb to stop without producing SE values, so convergence is in doubt (although the result may be sufficient given that the plot, not shown, looks like a good fit). We discuss how to fix this in the Other section.

library(nlsr)

g1 <- g == 1; g2 <- g == 2; g3 <- g == 3
fo2 <- yy ~ g1 * (a1 * b ^ (xx - c1) + d) + 
            g2 * (a2 * b ^ (xx - c2) + d) + 
            g3 * (a3 * b ^ (xx - c3) + d)
st2 <- list(a1 = 1, a2 = 1, a3 = 1, b = 1, c1 = 1, c2 = 1, c3 = 1, d = 1)
fm2 <- nlxb(fo2, start = st2)
fm2

giving:

vn: [1] "yy" "g1" "a1" "b"  "xx" "c1" "d"  "g2" "a2" "c2" "g3" "a3" "c3"
no weights
nlsr object: x 
residual sumsquares =  133.45  on  153 observations
    after  16    Jacobian and  22 function evaluations
  name            coeff          SE       tstat      pval      gradient    JSingval   
a1               3.19575            NA         NA         NA    9.68e-10        4097  
a2               0.64157            NA         NA         NA   8.914e-11       662.5  
a3              -1.03096            NA         NA         NA  -1.002e-09       234.9  
b                1.99713            NA         NA         NA   -2.28e-08       72.57  
c1               2.66146            NA         NA         NA   -2.14e-09       10.25  
c2               3.33564            NA         NA         NA  -3.955e-11   1.585e-13  
c3                2.0297            NA         NA         NA  -7.144e-10   1.292e-13  
d                9.92363            NA         NA         NA  -2.603e-12   3.271e-14  
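The pattern in the JSingval column, five sizeable values followed by three that are essentially zero, can be reproduced with base R alone. The sketch below is an illustration under stated assumptions rather than what nlxb does internally: it builds the analytic Jacobian of the stacked model at the question's true parameter values (not at the fitted ones) and inspects its singular values. The names J, sv, rel and n_tiny are made up for this example.

```r
x <- seq(0, 10, 0.2)
n <- length(x)

# True parameter values from the question
a <- c(1, 2, -2); cc <- c(1, 5, 3); b <- 2

# Jacobian of a_k * b^(x - c_k) + d for the three stacked datasets
J <- matrix(0, nrow = 3 * n, ncol = 8,
            dimnames = list(NULL, c("a1", "a2", "a3", "b",
                                    "c1", "c2", "c3", "d")))
for (k in 1:3) {
  rows <- (k - 1) * n + seq_len(n)
  e <- b^(x - cc[k])                                # b^(x - c_k)
  J[rows, paste0("a", k)] <- e                      # derivative w.r.t. a_k
  J[rows, "b"]            <- a[k] * (x - cc[k]) * e / b
  J[rows, paste0("c", k)] <- -a[k] * log(b) * e     # derivative w.r.t. c_k
  J[rows, "d"]            <- 1
}

sv     <- svd(J)$d          # singular values, largest first
rel    <- sv / sv[1]        # relative to the largest
n_tiny <- sum(rel < 1e-10)  # numerically zero singular values

signif(rel, 3)
n_tiny
```

Each c_k column is an exact multiple of the matching a_k column (the factor is -a_k * log(b)), so the Jacobian loses one rank per dataset.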

We can calculate SEs using nls2 as a second stage, but this still does not address the underlying problem that the singular values point to.

summary(nls2(fo2, start = coef(fm2), algorithm = "brute-force"))

giving:

Formula: yy ~ g1 * (a1 * b^(xx - c1) + d) + g2 * (a2 * b^(xx - c2) + d) + 
    g3 * (a3 * b^(xx - c3) + d)

Parameters:
    Estimate Std. Error t value Pr(>|t|)    
a1  3.20e+00   5.38e+05     0.0        1    
a2  6.42e-01   3.55e+05     0.0        1    
a3 -1.03e+00   3.16e+05     0.0        1    
b   2.00e+00   2.49e-03   803.4   <2e-16 ***
c1  2.66e+00   9.42e-02    28.2   <2e-16 ***
c2  3.34e+00   2.43e+05     0.0        1    
c3  2.03e+00   8.00e+05     0.0        1    
d   9.92e+00   4.42e+05     0.0        1    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.959 on 145 degrees of freedom

Number of iterations to convergence: 8 
Achieved convergence tolerance: NA

Other

When nls has trouble fitting a model, it often means that something is wrong with the model itself. Experimenting a bit, guided by the JSingval column in the nlxb output above, which suggests that the c parameters or d might be the problem, we find that if we fix all c parameters at 0 the model is easy to fit given sufficiently good starting values, and it still gives a low residual sum of squares.

library(nls2)

fo3 <- yy ~ cbind((g==1) * b ^ xx, (g==2) * b ^ xx, (g==3) * b ^ xx, 1) 
st3 <-  coef(fm0)["b"]
fm3 <- nls(fo3, start = st3, alg = "plinear")

giving:

Nonlinear regression model
  model: yy ~ cbind((g == 1) * b^xx, (g == 2) * b^xx, (g == 3) * b^xx,     1)
   data: parent.frame()
      b   .lin1   .lin2   .lin3   .lin4 
 1.9971  0.5071  0.0639 -0.2532  9.9236 
 residual sum-of-squares: 133

Number of iterations to convergence: 4 
Achieved convergence tolerance: 1.67e-09

which the following anova indicates is comparable to fm from above despite having 3 fewer parameters:

anova(fm3, fm)

giving:

Analysis of Variance Table

Model 1: yy ~ cbind((g == 1) * b^xx, (g == 2) * b^xx, (g == 3) * b^xx, 1)
Model 2: yy ~ cbind((g == 1) * b^abs(xx - c[g]), (g == 2) * b^abs(xx - c[g]), (g == 3) * b^abs(xx - c[g]), 1)
  Res.Df Res.Sum Sq Df Sum Sq F value Pr(>F)
1    148        134                         
2    145        133  3  0.385    0.14   0.94
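As a rough check on how the F value and p-value in this table arise, the arithmetic can be redone by hand from the printed figures. This is only a sketch: the inputs are the rounded values shown in the output above, so the result agrees with the table only to about two digits, and the variable names are made up for this example.

```r
rss_diff <- 0.385  # "Sum Sq": drop in RSS from the smaller to the larger model
df_diff  <- 3      # number of extra parameters in the larger model
rss_full <- 133    # rounded residual sum of squares of the larger model
df_full  <- 145    # its residual degrees of freedom

# Extra-sum-of-squares F statistic and its upper-tail probability
F_stat <- (rss_diff / df_diff) / (rss_full / df_full)
p_val  <- pf(F_stat, df_diff, df_full, lower.tail = FALSE)

round(c(F = F_stat, p = p_val), 2)
```

A p-value this large means the three extra c parameters do not measurably improve the fit.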

We can redo fm3 using nlxb like this:

fo4 <- yy ~ g1 * (a1 * b ^ xx + d) + 
            g2 * (a2 * b ^ xx + d) + 
            g3 * (a3 * b ^ xx + d)
st4 <- list(a1 = 1, a2 = 1, a3 = 1, b = 1, d = 1)
fm4 <- nlxb(fo4, start = st4)
fm4

giving:

nlsr object: x 
residual sumsquares =  133.45  on  153 observations
    after  24    Jacobian and  33 function evaluations
  name            coeff          SE       tstat      pval      gradient    JSingval   
a1              0.507053      0.005515      91.94  1.83e-132   8.274e-08        5880  
a2             0.0638554     0.0008735      73.11  4.774e-118    1.26e-08        2053  
a3             -0.253225      0.002737     -92.54  7.154e-133  -4.181e-08        2053  
b                1.99713      0.002294      870.6  2.073e-276   -2.55e-07       147.5  
d                9.92363       0.09256      107.2  3.367e-142  -1.219e-11       10.26  
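One way to see why dropping c costs nothing is the identity a * b^(x - c) = (a * b^(-c)) * b^x: for a given b, only the product a * b^(-c) affects the fitted values, so a and c cannot be estimated separately. The short base R check below illustrates this with the question's model. The alternative (a, c) pairs are made-up values chosen so that the product stays at 0.5.

```r
my_model <- function(x, a, b, c, d) a * b^(x - c) + d
x <- seq(0, 10, 0.2)
b <- 2; d <- 10

# Three different (a, c) pairs with the same a * b^(-c) = 0.5
pred_orig  <- my_model(x, a = 1,   b = b, c = 1, d = d)
pred_shift <- my_model(x, a = 0.5, b = b, c = 0, d = d)
pred_other <- my_model(x, a = 4,   b = b, c = 3, d = d)

# All three give the same curve
same <- isTRUE(all.equal(pred_orig, pred_shift)) &&
        isTRUE(all.equal(pred_orig, pred_other))
same

# Multipliers implied by the question's true parameters
implied <- c(1, 2, -2) * b^(-c(1, 5, 3))
implied
```

The implied multipliers 0.5, 0.0625 and -0.25 are close to the fitted values 0.5071, 0.0639 and -0.2532 reported above.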

Note

The assumed input below is the same as in the question except we additionally set the seed to make it reproducible.

set.seed(123)

my_model <- function(x, a, b, c, d) a * b ^ (x - c) + d

x <- seq(0, 10, 0.2)

b <- 2; d <- 10 # shared

a1 <- 1; c1 <- 1
y1 <- my_model(x, a = a1, b = b, c = c1, d = d) + rnorm(length(x))

a2 <- 2; c2 <- 5
y2 <- my_model(x, a = a2, b = b, c = c2, d = d) + rnorm(length(x))

a3 <- -2; c3 <- 3
y3 <- my_model(x, a = a3, b = b, c = c3, d = d) + rnorm(length(x))
1
votes

I'm not sure this is really the best way, but you could minimize the sum of the squared residuals using optim().

#start values
params <- c(a1=1, a2=1, a3=1, b=1, c1=1, c2=1, c3=1,d=1)
# minimize total sum of squares of residuals
fun <- function(p) {
  sum(
    (y1-my_model(x, p["a1"], p["b"], p["c1"], p["d"]))^2 + 
    (y2-my_model(x, p["a2"], p["b"], p["c2"], p["d"]))^2 +
    (y3-my_model(x, p["a3"], p["b"], p["c3"], p["d"]))^2
  )
}
out <- optim(params, fun, method="BFGS")
out$par
#         a1         a2         a3          b         c1         c2         c3 
#  0.8807542  1.0241804 -2.8805848  1.9974615  0.7998103  4.0030597  3.5184600 
#          d 
#  9.8764917 

We can then add the fitted curves on top of the existing plot:

curve(my_model(x, out$par["a1"], out$par["b"], out$par["c1"], out$par["d"]), col = "red", add = TRUE)
curve(my_model(x, out$par["a2"], out$par["b"], out$par["c2"], out$par["d"]), col = "green", add = TRUE)
curve(my_model(x, out$par["a3"], out$par["b"], out$par["c3"], out$par["d"]), col = "blue", add = TRUE)
