Optimization in R - Efficient Computation of Objective and Gradient

Question

I need to optimize a set of variables with respect to an objective function. I have the analytical gradient of the function and would like to use it in the optimization routine. The objective and gradient share some computations, and I would like to define the functions as efficiently as possible. The example below demonstrates the issue.

Let f_obj, f_grad and f_common be functions for the objective, gradient and common computations, respectively. The optimization is over the vector x. The code below looks for a root of the polynomial y^3 - 3*y^2 + 6*y + 1 by minimizing its square, where y is a function of c(x[1], x[2]). Note that f_common is called in both f_obj and f_grad. In my actual problem the common computation is much longer, so I'm looking for a way to define f_obj and f_grad so that the number of calls to f_common is minimized.

f_common <- function(x) x[1]^3*x[2]^3 - x[2]

f_obj <- function(x) {
  y <- f_common(x)
  return ( (y^3 - 3*y^2 + 6*y + 1)^2 )
}

f_grad <- function(x) {
  y <- f_common(x)
  return ( 2 * (y^3 - 3*y^2 + 6*y + 1) * (3*y^2 - 6*y + 6)* c(3*x[1]^2*x[2]^3, 3*x[1]^3*x[2]^2 - 1) )
}

optim(par = c(100,100), fn = f_obj, gr = f_grad, method = "BFGS")

UPDATE

I found that the package nloptr allows the objective function and its gradient to be supplied together as a list. Is there a way to set up other optimizers (optim, optimx, nlminb, etc.) in a similar manner?

Thanks.

What do you want? The code works. If you want to minimize the calls to f_common, you should not call it and instead hard-code y within each function. — Kota Mori
Hardcoding the common computations in both functions would be equivalent to calling f_common (in terms of execution time). I'm looking for a way to eliminate redundant computations. With my code, if the optimization routine computes f_obj at a point x and then finds its gradient at that point, it calls f_common twice, but y only needs to be computed once. — user3294195
Maybe you could use a package like memoise to cache the results of f_common. — sgibb

dww · Accepted Answer · 2016-03-27T20:53:15

Store the value of the common function in a global variable so that it is available to the subsequent function call, as in the following:

f_common <- function(x) x[1]^3*x[2]^3 - x[2]

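f_obj <- function(x) {
  y <<- f_common(x)   # <<- assigns to y in the enclosing environment (here, the global one)
  return ( (y^3 - 3*y^2 + 6*y + 1)^2 )
}

f_grad <- function(x) {
  # Reuses the y stored by the most recent f_obj call.
  # Assumption: the optimizer evaluated f_obj at this same x just before.
  return ( 2 * (y^3 - 3*y^2 + 6*y + 1) * (3*y^2 - 6*y + 6)* c(3*x[1]^2*x[2]^3, 3*x[1]^3*x[2]^2 - 1) )
}

y <- 0   # initialise the shared variable at top level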

optim(par = c(100,100), fn = f_obj, gr = f_grad, method = "BFGS")

A couple of notes are worth adding about this solution.

1) Strictly speaking, the <<- operator does not assign to a global variable. It searches the enclosing environments of the function (the environment in which the function was defined, not the one it was called from) for an existing variable of that name and assigns there; if none is found, it creates the variable in the global environment. Because f_obj is defined at top level here, y ends up in the global environment. This works fine here. It is also possible to target the global environment explicitly using the assign() function, but there is no need for that here.

2) Global variables are normally not recommended, because they can cause unexpected side effects if the same variable name is used elsewhere. To avoid this, I would suggest a variable name such as global.f_common that is unlikely to be used anywhere else. I used the name y in the example only to be consistent with the nomenclature in the original question. This is one of the rare occasions where giving a variable scope outside of its function may be justified, because it is difficult to achieve the desired behaviour another way. Just use caution and a unique name, as suggested above.
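
The same idea can also be sketched without a global variable, by keeping the cached value in the environment of a closure. This is only an illustration under stated assumptions, not part of the original answer: make_cached_fns, get_y and calls are hypothetical names, and the cache is refreshed whenever x differs from the point at which it was last filled, so the gradient does not rely on f_obj having been called first.

```r
# Hypothetical helper (not from the original answer): returns an objective and
# a gradient that share one cached evaluation of f_common.
make_cached_fns <- function(f_common) {
  last_x  <- NULL   # point at which the cache was filled
  last_y  <- NULL   # cached value of f_common(last_x)
  n_calls <- 0      # number of real f_common evaluations

  get_y <- function(x) {
    # Recompute only when x differs from the cached point
    if (is.null(last_x) || !identical(x, last_x)) {
      # <<- updates the variables defined in make_cached_fns(), not globals
      last_y  <<- f_common(x)
      last_x  <<- x
      n_calls <<- n_calls + 1
    }
    last_y
  }

  list(
    fn = function(x) {
      y <- get_y(x)
      (y^3 - 3*y^2 + 6*y + 1)^2
    },
    gr = function(x) {
      y <- get_y(x)
      2 * (y^3 - 3*y^2 + 6*y + 1) * (3*y^2 - 6*y + 6) *
        c(3*x[1]^2*x[2]^3, 3*x[1]^3*x[2]^2 - 1)
    },
    calls = function() n_calls   # for inspecting how often f_common ran
  )
}

f_common <- function(x) x[1]^3*x[2]^3 - x[2]
fns <- make_cached_fns(f_common)

res <- optim(par = c(100, 100), fn = fns$fn, gr = fns$gr, method = "BFGS")
```

Comparing fns$calls() with the evaluation counts in res$counts gives a rough idea of how many f_common evaluations were saved. The identical() check costs a little per call, which should be negligible when the common computation is expensive.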
