To understand how scipy's `curve_fit` tries to solve a problem, start by reading about non-linear optimization methods such as the Levenberg-Marquardt (LM) method (see https://en.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm). This is the method `curve_fit` uses when no bounds on parameter values are specified, and it will give most of the answers you seek.
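A minimal sketch (the exponential model and the noisy data are hypothetical, used only for illustration): calling `curve_fit` without a `bounds` argument means the LM solver is used.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    # a simple exponential decay, used purely as an illustration
    return a * np.exp(-b * x)

xdata = np.linspace(0, 4, 50)
ydata = model(xdata, 2.5, 1.3) + 0.05 * np.random.normal(size=xdata.size)

# no `bounds` argument, so curve_fit uses method='lm' (MINPACK's leastsq)
popt, pcov = curve_fit(model, xdata, ydata, p0=(2.0, 1.0))
```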
This method (and the other solvers in `scipy.optimize`) starts with initial values and tries to refine them in order to minimize the resulting residual array in the least-squares sense. To be clear, the method expects a residual array and does the squaring and summing itself.
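That residual-array interface is easiest to see with `leastsq` directly. A sketch, reusing the hypothetical `xdata` and `ydata` from the snippet above:

```python
import numpy as np
from scipy.optimize import leastsq

def residual(params, x, y):
    a, b = params
    return y - a * np.exp(-b * x)   # an array of residuals, not a scalar sum

best, ier = leastsq(residual, x0=(2.0, 1.0), args=(xdata, ydata))
```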
To decide which direction to move the parameter values, it uses the first derivative of the residual with respect to the variables, i.e. the Jacobian. It is very common not to have an analytic form for this, in which case numerical derivatives are calculated by taking very small steps (often around machine precision) in the parameter values.
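A rough sketch of that idea (not the solver's actual code): a forward-difference Jacobian for a residual function that, for simplicity, takes only the parameter array. The step size `eps` here is illustrative; the real step choice is solver-dependent.

```python
import numpy as np

def numerical_jacobian(residual, params, eps=1.49e-8):
    # eps is illustrative; the actual step choice is solver-dependent
    params = np.asarray(params, dtype=float)
    r0 = residual(params)
    jac = np.empty((r0.size, params.size))
    for j in range(params.size):
        step = np.zeros_like(params)
        step[j] = eps * max(1.0, abs(params[j]))
        jac[:, j] = (residual(params + step) - r0) / step[j]
    return jac
```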
With these derivatives and the last step size, the LM method decides how big a step to take toward the minimal residual. It is a common (and often good) approximation to think of the residual as a quadratic function of each parameter. If the value is far from the bottom, taking a step that is linear in the derivative (i.e., just following the local slope) works pretty well. But near the solution, it is helpful to use the second derivative (aka Hessian, curvature, or in some sense "history"). LM combines these two approaches to find the solution quickly. At each step it first calculates the derivatives (and, after the first step, the second derivative), then calculates the step to take for all parameters. It repeats this process until the solution (chi-square) no longer improves by more than a tolerance. In scipy's `leastsq` (often used by `curve_fit`), the step size used to calculate numerical derivatives and the tolerances for stopping the fit can be specified, as shown below.
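For example, reusing the hypothetical `model`, `xdata`, and `ydata` from the first sketch: with `method='lm'`, extra keyword arguments to `curve_fit` are forwarded to `leastsq`, so the derivative step and the stopping tolerances can be set explicitly (the values here are arbitrary).

```python
from scipy.optimize import curve_fit

popt, pcov = curve_fit(model, xdata, ydata, p0=(2.0, 1.0), method='lm',
                       epsfcn=1e-10,  # step used for numerical derivatives
                       ftol=1e-10,    # relative tolerance on the sum of squares
                       xtol=1e-10,    # relative tolerance on parameter values
                       gtol=0.0)      # orthogonality tolerance
```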
Typically, when using `curve_fit`, `leastsq`, or `least_squares`, you do not need to worry about any of these details of how the solution is found. Well, except that supplying analytic derivatives, if you have them, can improve both the speed and the accuracy of the process. Your "model function" should just take the parameters passed in and calculate the predicted model for the data. Changing the values passed in will confuse the algorithm -- it is running your function with values it knows in order to find the solution.
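If you do have analytic derivatives, one way to use them is `curve_fit`'s `jac` argument, which expects the partial derivatives of the model with respect to the parameters. A sketch with the same hypothetical exponential model and data as above:

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(-b * x)

def model_jac(x, a, b):
    # partial derivatives of the model with respect to a and b,
    # returned as an array of shape (len(x), 2)
    da = np.exp(-b * x)
    db = -a * x * np.exp(-b * x)
    return np.column_stack((da, db))

popt, pcov = curve_fit(model, xdata, ydata, p0=(2.0, 1.0), jac=model_jac)
```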
Other questions:
If you don't know the standard errors in your data (`y`), don't worry about it too much. The absolute value of the reported chi-square may not be close to `ndata - nvarys` (as expected for a good fit), but the value relative to other fits should be fine.
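For instance, a rough way to compare two candidate fits to the same data (here `model_a`/`model_b` and `popt_a`/`popt_b` are hypothetical names for two fitted models and their best-fit parameters):

```python
import numpy as np

def chisq(model_func, popt, x, y):
    resid = y - model_func(x, *popt)
    return np.sum(resid**2)

# reduced chi-square: the lower value indicates the better fit to the same data
red_a = chisq(model_a, popt_a, xdata, ydata) / (len(ydata) - len(popt_a))
red_b = chisq(model_b, popt_b, xdata, ydata) / (len(ydata) - len(popt_b))
```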
One of the features of the LM method (at least in the MINPACK -> `leastsq` -> `curve_fit` implementation) is that it reports the final covariance matrix, which can be used to determine uncertainties in and correlations between the fitted parameters. Normally the reported values are those that would increase chi-square by 1 (i.e., 1-sigma standard errors). Because not having good uncertainties on the data is so common, scipy's `curve_fit` scales this covariance to give values as if chi-square did equal `ndata - nvarys`. That is, the standard errors in the parameters will be pretty good if you believe the fit is good (and `sqrt(chi-square / (ndata - nvarys))` can be used to estimate the error in the data).
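A short sketch of pulling those numbers out of the covariance matrix returned by a `curve_fit` call such as the ones above (by default, `absolute_sigma=False`, the covariance is rescaled as described here; passing `absolute_sigma=True` with known data uncertainties skips that rescaling):

```python
import numpy as np

perr = np.sqrt(np.diag(pcov))          # 1-sigma standard errors of the parameters
corr = pcov / np.outer(perr, perr)     # correlation matrix of the parameters
```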