1
vote

Suppose f(k) = exp(k/200) - 1 and we want to minimize ( f(a) + f(b) + f(c) + f(d) - pi )^2. One solution is a = 6, b = 75, c = 89, d = 226, for which the sum of squares is ~8e-17.

sumsq <- function(theta, n=200) {
  f <- function(k) exp(k/n) - 1
  (f(theta[1]) + f(theta[2]) + f(theta[3]) + f(theta[4]) - pi)^2
}
theta <- optim(par=c(10, 90, 70, 300), fn=sumsq)
# theta$par = 62.97, 106.89, 78.64, 189.82
# theta$value = 6.32e-10
# sumsq(c(6,75,89,226)) = 8.20e-17

Comparing the sums of squares, the solution a = 6, b = 75, c = 89, d = 226 is clearly better than the one optim gave. I would like to know how to make R's optimization more accurate. I have also tried the nlm() function, without success.

The value of pi used is 3.1415926535897931; I don't think the accuracy of pi is the reason the optim function isn't producing an optimal solution.

I'm not really concerned about this particular problem, but about how to improve the accuracy of R optimization techniques in general - nathanesau
R thinks those numbers are not different (all.equal(theta$value, sumsq(c(6,75,89,226)))) and that both are equal to zero (all.equal(theta$value, 0)). Can you scale the function to make the differences more obvious to a numerical solver? - MrFlick
I would guess that the problem has several solutions. I suspect that if you take different starting conditions you'll find a different minimum. This would be characteristic for numerical optimization problems and not related to R. - RHertel
But that's strange - I could brute-force the solution over the natural numbers, 226^4 iterations, and arrive at the result (a trivial search). Surely there must be a way to improve the accuracy that optim achieves? - nathanesau
Basically you are trying to find solutions of f(a)+f(b)+f(c)+f(d) = pi. I'm not a mathematician, but I think it's safe to assume that there are infinitely many combinations (a,b,c,d) that fulfill this condition. In view of this, finding the one combination you are looking for would be just a coincidence. - RHertel
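A quick experiment supports these comments: the minimum is not unique, and different starting points land on different minimizers, all with essentially zero residual. A sketch (the starting points are arbitrary choices):

```r
# Same objective as in the question.
sumsq <- function(theta, n = 200) {
  f <- function(k) exp(k / n) - 1
  (f(theta[1]) + f(theta[2]) + f(theta[3]) + f(theta[4]) - pi)^2
}

# Run optim (BFGS) from several starting points; each run converges to
# a near-zero residual, but the minimizers differ.
starts <- list(c(10, 90, 70, 300), c(1, 1, 1, 1), c(100, 100, 100, 100))
fits <- lapply(starts, function(s) optim(par = s, fn = sumsq, method = "BFGS"))
sapply(fits, `[[`, "value")   # all tiny
t(sapply(fits, `[[`, "par"))  # but the parameter vectors are different
```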

3 Answers

2
votes

As the commenters say, this isn't a problem with the accuracy of optim, but rather that the algorithm optim uses may not be suitable for your particular problem. There are many optimization packages and interfaces available in R; I have had good results using the rgenoud package to improve maximum likelihood-based parameter estimates obtained with the fitdistrplus package (which I believe uses optim by default).

The other question, of course, is whether the problem you are posing actually has a global minimum that is distinguishable from other local minima within the numerical tolerance you can specify/R can detect. 6.32e-10 and 8.20e-17 are both pretty small and far beyond the numerical tolerances I consider acceptable in my work... but I don't know about your field.
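As a concrete illustration of the tolerance point: the default Nelder-Mead stopping tolerance (reltol, roughly 1.5e-8) is likely what stops the original run at 6.32e-10. A sketch of tightening it via optim's documented control list (the specific tolerance and iteration cap here are arbitrary choices):

```r
# Same objective as in the question.
sumsq <- function(theta, n = 200) {
  f <- function(k) exp(k / n) - 1
  (f(theta[1]) + f(theta[2]) + f(theta[3]) + f(theta[4]) - pi)^2
}

# reltol and maxit are documented optim() control parameters; a smaller
# reltol lets Nelder-Mead keep shrinking the simplex past the default stop.
fit <- optim(par = c(10, 90, 70, 300), fn = sumsq,
             control = list(reltol = 1e-15, maxit = 10000))
fit$value   # far smaller than the 6.32e-10 from the default settings
```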

2
votes

This is not a well-posed minimization problem: there are infinitely many solutions. One of them is

a=b=c=d=200*log(1+pi/4)

which numerically is approximately

115.92829021682383

If you plug these numbers in, the residual sumsq is zero to within numerical accuracy.
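This is easy to check numerically, since each term satisfies f(a) = exp(log(1 + pi/4)) - 1 = pi/4 (a sketch using the question's sumsq):

```r
# Same objective as in the question.
sumsq <- function(theta, n = 200) {
  f <- function(k) exp(k / n) - 1
  (f(theta[1]) + f(theta[2]) + f(theta[3]) + f(theta[4]) - pi)^2
}

# With a = 200*log(1 + pi/4), each f(a) equals pi/4, so the four terms sum to pi.
a <- 200 * log(1 + pi / 4)
a                 # 115.92829...
sumsq(rep(a, 4))  # zero up to floating-point rounding
```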

The problem would probably be far more complex if one imposed, e.g., the restriction that only natural or only integer numbers are allowed. In that case your combination (and permutations thereof) might be the best, but at the moment I wouldn't know how to verify this. Minimization in the presence of such a constraint is a qualitatively different problem, which might be interesting for mathematicians. In any case, the usual numerical optimization algorithms won't let you introduce such a constraint.
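That said, for this particular objective the integer-constrained variant doesn't require 226^4 brute force: fixing a <= b <= c, one can solve f(d) = pi - f(a) - f(b) - f(c) for d in closed form and check whether rounding d to the nearest integer gives a tiny residual. A sketch (the search range 1:100 and the 1e-15 tolerance are arbitrary choices):

```r
n <- 200
f <- function(k) exp(k / n) - 1
sumsq4 <- function(a, b, c, d) (f(a) + f(b) + f(c) + f(d) - pi)^2

hits <- list()
for (a in 1:100) for (b in a:100) for (c in b:100) {
  r <- pi - f(a) - f(b) - f(c)   # required value of f(d)
  if (r <= 0) next
  d <- round(n * log(1 + r))     # nearest integer to the exact real solution
  if (d >= 1 && sumsq4(a, b, c, d) < 1e-15)
    hits[[length(hits) + 1]] <- c(a, b, c, d)
}
# hits includes c(6, 75, 89, 226), the combination from the question
```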

1
vote

I used the "BFGS" method:

sumsq <- function(theta, n=200) {
  f <- function(k) exp(k/n) - 1
  (f(theta[1]) + f(theta[2]) + f(theta[3]) + f(theta[4]) - pi)^2
}
theta <- optim(par=c(10, 90, 70, 300), fn=sumsq, method="BFGS")

Look at the result:

> theta
$par
[1]  -2.629695  71.159586  52.952260 246.174513

$value
[1] 4.009243e-22

$counts
function gradient 
      19        8 

$convergence
[1] 0
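Note that this BFGS run drove one component negative (-2.63). If the k's are meant to be nonnegative, a variant worth trying is method = "L-BFGS-B", the one optim method that accepts box constraints via the documented lower/upper arguments (a sketch):

```r
# Same objective as in the question.
sumsq <- function(theta, n = 200) {
  f <- function(k) exp(k / n) - 1
  (f(theta[1]) + f(theta[2]) + f(theta[3]) + f(theta[4]) - pi)^2
}

# lower = 0 constrains every parameter to be nonnegative.
theta <- optim(par = c(10, 90, 70, 300), fn = sumsq,
               method = "L-BFGS-B", lower = 0)
theta$par    # all components now >= 0
theta$value  # still essentially zero
```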