2
votes

I have the following data points that I would like to curve fit:

import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit

t = np.array([15474.6, 15475.6, 15476.6, 15477.6, 15478.6, 15479.6, 15480.6,
              15481.6, 15482.6, 15483.6, 15484.6, 15485.6, 15486.6, 15487.6,
              15488.6, 15489.6, 15490.6, 15491.6, 15492.6, 15493.6, 15494.6,
              15495.6, 15496.6, 15497.6, 15498.6, 15499.6, 15500.6, 15501.6,
              15502.6, 15503.6, 15504.6, 15505.6, 15506.6, 15507.6, 15508.6,
              15509.6, 15510.6, 15511.6, 15512.6, 15513.6])

v = np.array([4.082, 4.133, 4.136, 4.138, 4.139, 4.14, 4.141, 4.142, 4.143,
              4.144, 4.144, 4.145, 4.145, 4.147, 4.146, 4.147, 4.148, 4.148,
              4.149, 4.149, 4.149, 4.15, 4.15, 4.15, 4.151, 4.151, 4.152,
              4.152, 4.152, 4.153, 4.153, 4.153, 4.153, 4.154, 4.154, 4.154,
              4.154, 4.154, 4.155, 4.155])

The exponential function that I want to fit to the data is:

function

The Python function representing the above formula and the associated curve fit with the data is detailed below:

def func(t, a, b, alpha):
    return a - b * np.exp(-alpha * t)


# scale vector to start at zero otherwise exponent is too large
t_scale = t - t[0]

# initial guess for curve fit coefficients
a0 = v[-1]
b0 = v[0]
alpha0 = 1/t_scale[-1]

# coefficients and curve fit for curve
popt4, pcov4 = curve_fit(func, t_scale, v, p0=(a0, b0, alpha0))

a, b, alpha = popt4
v_fit = func(t_scale, a, b, alpha)

ss_res = np.sum((v - v_fit) ** 2)       # residual sum of squares
ss_tot = np.sum((v - np.mean(v)) ** 2)  # total sum of squares
r2 = 1 - (ss_res / ss_tot)              # R squared fit, R^2

The data compared to the curve fit is plotted below. The parameters and R squared value are also provided.

figure

a0 = 4.1550   b0 = 4.0820   alpha0 = 0.0256
a = 4.1490    b = 0.0645    alpha = 0.9246
R² = 0.8473

Is it possible to get a better fit with the data using the approach outlined above or do I need to use a different form of the exponential equation?

I'm also not sure what to use for the initial values (a0, b0, alpha0). In the example, I chose points from the data but that may not be the best method. Any suggestions on what to use for the initial guess for the curve fit coefficients?

3
I would try 1. not fitting through the peak and 2. adding weights to some of the points that will drag the fit down. You can do with the sigma kwarg for curve_fit – dan_g
@user3014097 I was not aware of the sigma keyword. How do I apply it to my example? Also, starting the curve fit at the peak (t[1:] and v[1:]) works well but how much will that affect the popt values? – wigging
The equation plot is flat on the right side, while the data shows a clear upward slope. I do not think this one equation will fit that part of the data well. – James Phillips

3 Answers

5
votes

To me this looks like something that is better fit by multiple components, rather than a single exponential.

def func(t, a, b, c, d, e):
    return a*np.exp(-t/b) + c*np.exp(-t/d) + e


# scale vector to start at zero otherwise exponent is too large
t_scale = t - t[0]

# initial guess for curve fit coefficients
guess = [1, 1, 1, 1, 0]

# coefficients and curve fit for curve
popt, pcov = curve_fit(func, t_scale, v, p0=guess)

v_fit = func(t_scale, *popt)

enter image description here

3
votes

The best single 3-parameter equation I could find, R-squared = 0.9952, was an x-shifted power function:

y = pow((a + x), b) + Offset

with parameters:

a = -1.5474599569484271E+04
b =  6.3963649365056151E-03
Offset =  3.1303674911990789E+00

model

2
votes

If you remove the first data point, you will get a much better fit.

Using lmfit (https://lmfit.github.io/lmfit-py), which provides a higher-level and easier to use interface to curve-fitting, your fitting script would look like this:

import matplotlib.pyplot as plt
import numpy as np
from lmfit import Model

t = np.array([15474.6, 15475.6, 15476.6, 15477.6, 15478.6, 15479.6, 15480.6,
              15481.6, 15482.6, 15483.6, 15484.6, 15485.6, 15486.6, 15487.6,
              15488.6, 15489.6, 15490.6, 15491.6, 15492.6, 15493.6, 15494.6,
              15495.6, 15496.6, 15497.6, 15498.6, 15499.6, 15500.6, 15501.6,
              15502.6, 15503.6, 15504.6, 15505.6, 15506.6, 15507.6, 15508.6,
              15509.6, 15510.6, 15511.6, 15512.6, 15513.6])

v = np.array([4.082, 4.133, 4.136, 4.138, 4.139, 4.14, 4.141, 4.142, 4.143,
              4.144, 4.144, 4.145, 4.145, 4.147, 4.146, 4.147, 4.148, 4.148,
              4.149, 4.149, 4.149, 4.15, 4.15, 4.15, 4.151, 4.151, 4.152,
              4.152, 4.152, 4.153, 4.153, 4.153, 4.153, 4.154, 4.154, 4.154,
              4.154, 4.154, 4.155, 4.155])

def func(t, a, b, alpha):
    return a + b * np.exp(-alpha * t)

# remove first data point, take offset from t
tx = t[1:] - t[0]
vx = v[1:]

# turn your model function into a Model
amodel = Model(func)
# create parameters with initial values.  Note that parameters
# are named from the arguments of your model function.
params = amodel.make_params(a=v[0], b=0, alpha=1.0/(t[-1]-t[0]))

# fit the data to the model with the parameters
result = amodel.fit(vx, params, t=tx)

# print the fit statistics and resulting parameters
print(result.fit_report())

# plot data and fit
plt.plot(t, v, 'o', label='data')
plt.plot(t, result.eval(result.params, t=(t-t[0])), '--', label='fit')
plt.legend()
plt.show()

This will print out these results

[[Model]]
    Model(func)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 44
    # data points      = 39
    # variables        = 3
    chi-square         = 1.1389e-05
    reduced chi-square = 3.1635e-07
    Akaike info crit   = -580.811568
    Bayesian info crit = -575.820883
[[Variables]]
    a:      4.15668660 +/- 5.0662e-04 (0.01%) (init = 4.082)
    b:     -0.02312772 +/- 4.1930e-04 (1.81%) (init = 0)
    alpha:  0.06004740 +/- 0.00360126 (6.00%) (init = 0.02564103)
[[Correlations]] (unreported correlations are < 0.100)
    C(a, alpha) = -0.945
    C(a, b)     = -0.682
    C(b, alpha) =  0.465

and show this plot for the fit:

enter image description here