The GSL library documentation on the multidimensional minimization algorithms reads:
You must provide a parametric function of n variables for the minimizers to operate on. You may also need to provide a routine which calculates the gradient of the function and a third routine which calculates both the function value and the gradient together.
The example provided defines these functions as follows (I omitted problem-specific details, replaced by ...):
The function f itself:
double
my_f (const gsl_vector *v, void *params)
{
...
return rv;
}
The gradient of f, df = (df/dx, df/dy):
void
my_df (const gsl_vector *v, void *params, gsl_vector *df)
{
...
gsl_vector_set(df, ...);
gsl_vector_set(df, ...);
}
And finally, the third function, which computes both f and df together:
void
my_fdf (const gsl_vector *x, void *params, double *f, gsl_vector *df)
{
*f = my_f(x, params);
my_df(x, params, df);
}
Pointers to these three functions are members of a struct of type gsl_multimin_function_fdf, which is eventually passed to the minimizer.
There are several cases in which, once the function value has been calculated, its derivative can be calculated more easily. For example, let f(x,y) = exp(x * g(y)), where g(y) may be expensive to compute. It is then convenient to compute simply df/dx = g(y) f(x,y), using g(y) = log(f)/x (for x != 0).
Now, as far as I can tell from the example, the minimizer requires the function and its derivative to be defined independently, while the third definition looks like a dummy wrapper.
Is it possible to define these functions in a way such that the function and its derivative can actually be calculated within the same scope?
Edit:
In the documentation, regarding fdf, it is stated:
This function provides an optimization of the separate functions for f(x) and g(x)
—it is always faster to compute the function and its derivative at the same time.
Yet, I'm not certain how. Scanning through the header, I found three macros defined, one for each of these three functions:
#define GSL_MULTIMIN_FN_EVAL_F(F,x) (*((F)->f))(x,(F)->params)
#define GSL_MULTIMIN_FN_EVAL_DF(F,x,g) (*((F)->df))(x,(F)->params,(g))
#define GSL_MULTIMIN_FN_EVAL_F_DF(F,x,y,g) (*((F)->fdf))(x,(F)->params,(y),(g))
Which of these is invoked seems to depend on the optimization algorithm used. Could someone confirm this, please? Back to my original question: does this imply that the library user has to check the source to find out which method to use in order to take advantage of computing the function value and its gradient together?
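To make the macros concrete, here is a minimal sketch of how they dispatch through the function pointers. The vec and function_fdf types are stand-ins (assumptions) that only mirror the members the macros touch, so that the snippet builds without GSL; toy_f, toy_df, toy_fdf and call_counts are hypothetical names. The counters kept in params record which callback each macro invokes.

```c
#include <stddef.h>

/* Stand-in for gsl_vector (assumption), so the sketch compiles without GSL. */
typedef struct { size_t size; double *data; } vec;

/* Stand-in mirroring the members that the macros dereference. */
typedef struct {
    double (*f)(const vec *x, void *params);
    void (*df)(const vec *x, void *params, vec *g);
    void (*fdf)(const vec *x, void *params, double *f, vec *g);
    size_t n;
    void *params;
} function_fdf;

/* The three macros as quoted from the header. */
#define GSL_MULTIMIN_FN_EVAL_F(F,x) (*((F)->f))(x,(F)->params)
#define GSL_MULTIMIN_FN_EVAL_DF(F,x,g) (*((F)->df))(x,(F)->params,(g))
#define GSL_MULTIMIN_FN_EVAL_F_DF(F,x,y,g) (*((F)->fdf))(x,(F)->params,(y),(g))

/* Hypothetical params payload: one counter per callback. */
typedef struct { int f_calls, df_calls, fdf_calls; } call_counts;

/* Toy objective: f(x) = sum of x_i^2. */
static double toy_f(const vec *x, void *params)
{
    ((call_counts *)params)->f_calls++;
    double sum = 0.0;
    for (size_t i = 0; i < x->size; i++)
        sum += x->data[i] * x->data[i];
    return sum;
}

/* Gradient of the toy objective: g_i = 2 * x_i. */
static void toy_df(const vec *x, void *params, vec *g)
{
    ((call_counts *)params)->df_calls++;
    for (size_t i = 0; i < x->size; i++)
        g->data[i] = 2.0 * x->data[i];
}

/* Value and gradient in a single pass, without calling toy_f or toy_df. */
static void toy_fdf(const vec *x, void *params, double *f, vec *g)
{
    ((call_counts *)params)->fdf_calls++;
    double sum = 0.0;
    for (size_t i = 0; i < x->size; i++) {
        sum += x->data[i] * x->data[i];
        g->data[i] = 2.0 * x->data[i];
    }
    *f = sum;
}
```

With a function_fdf whose params points to a zeroed call_counts, invoking GSL_MULTIMIN_FN_EVAL_F_DF increments only fdf_calls. The same counting trick could presumably be applied to real callbacks to observe which ones a given minimizer uses, without reading its source.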