7
votes

I have some data points and would like to find a function that fits them. I think a cumulative Gaussian (sigmoid) function would fit, but I don't know how to implement that.

This is what I have right now:

import numpy as np
import pylab
from scipy.optimize import curve_fit

def sigmoid(x, a, b):
    y = 1 / (1 + np.exp(-b*(x-a)))
    return y

xdata = np.array([400, 600, 800, 1000, 1200, 1400, 1600])
ydata = np.array([0, 0, 0.13, 0.35, 0.75, 0.89, 0.91])
         
popt, pcov = curve_fit(sigmoid, xdata, ydata)
print(popt)

x = np.linspace(-1, 2000, 50)
y = sigmoid(x, *popt)

pylab.plot(xdata, ydata, 'o', label='data')
pylab.plot(x,y, label='fit')
pylab.ylim(0, 1.05)
pylab.legend(loc='best')
pylab.show()

But I get the following warning:

.../scipy/optimize/minpack.py:779: OptimizeWarning: Covariance of the parameters could not be estimated
  category=OptimizeWarning)

Can anyone help? I'm also open to other approaches. I just need some curve that fits this data.

2
If it might be of some use, I got an excellent fit to all data points using a scaled Weibull cumulative distribution "y = Scale * (1.0 - exp(-(x/b)^a))" with R-Squared = 0.9978 and RMSE = 0.01423 using the parameters a = 6.4852359831229389E+00, b = 1.1063972566493285E+03, and Scale = 9.0659231615116531E-01. - James Phillips
The link for the scipy documentation of this distribution with associated fitting details was deleted from my comment, so I am unable to assist in using scipy to fit your data, which is how I derived those parameter values. I do not know how you can reproduce the fitting results I posted without the deleted link. - James Phillips
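
As a rough check of the comment above, the quoted formula can be evaluated with NumPy alone. This is only a sketch: the function name `scaled_weibull_cdf` is made up for illustration, the data comes from the question, and the parameter values are copied from the comment rather than refitted.

```python
import numpy as np

def scaled_weibull_cdf(x, a, b, scale):
    """Scaled Weibull CDF as quoted in the comment: scale * (1 - exp(-(x/b)**a))."""
    return scale * (1.0 - np.exp(-(x / b) ** a))

# Data from the question
xdata = np.array([400, 600, 800, 1000, 1200, 1400, 1600], dtype=float)
ydata = np.array([0, 0, 0.13, 0.35, 0.75, 0.89, 0.91])

# Parameter values copied from the comment (not refitted here)
a = 6.4852359831229389
b = 1.1063972566493285e+03
scale = 9.0659231615116531e-01

# Root-mean-square error over all seven points
residuals = scaled_weibull_cdf(xdata, a, b, scale) - ydata
rmse = np.sqrt(np.mean(residuals ** 2))
print(rmse)
```

The same function could be handed to curve_fit, with a reasonable p0, to refit the parameters instead of reusing the quoted ones.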

2 Answers

9
votes

You could set some reasonable bounds for the parameters, for example:

def fsigmoid(x, a, b):
    return 1.0 / (1.0 + np.exp(-a*(x-b)))

popt, pcov = curve_fit(fsigmoid, xdata, ydata, method='dogbox', bounds=([0., 600.],[0.01, 1200.]))

This gives the output

[7.27380294e-03 1.07431197e+03]

and the curve looks like this:

[image: plot of the fitted curve]

I removed the first point at (400, 0) because it contributes little to the fit. You could add it back, though the result won't change much.

UPDATE

Note that bounds are given as ([low_a, low_b], [high_a, high_b]), so I asked for the scale a to be within [0, 0.01] and the location b to be within [600, 1200].
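
For completeness, here is a self-contained sketch of the bounded fit. Two things differ from the snippet above and are my assumptions: it keeps all seven points from the question (including (400, 0)), and it passes an explicit p0 at the midpoint of the bounds instead of relying on the default starting point.

```python
import numpy as np
from scipy.optimize import curve_fit

def fsigmoid(x, a, b):
    # a: scale (steepness), b: location (x of the midpoint)
    return 1.0 / (1.0 + np.exp(-a * (x - b)))

# All seven points from the question
xdata = np.array([400, 600, 800, 1000, 1200, 1400, 1600], dtype=float)
ydata = np.array([0, 0, 0.13, 0.35, 0.75, 0.89, 0.91])

# bounds = ([low_a, low_b], [high_a, high_b])
lower = [0.0, 600.0]
upper = [0.01, 1200.0]

# Assumed starting point: the midpoint of each bound interval
popt, pcov = curve_fit(fsigmoid, xdata, ydata, p0=[0.005, 900.0],
                       bounds=(lower, upper), method='dogbox')
a_fit, b_fit = popt
print(popt)
```

If a fitted parameter lands exactly on one of its bounds, that usually suggests the bounds are too tight and worth widening.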

5
votes

You may have noticed the resulting fit is completely incorrect. Try passing some decent initial parameters to curve_fit, with the p0 argument:

popt, pcov = curve_fit(sigmoid, xdata, ydata, p0=[1000, 0.001])

This should give a much better fit, and probably no warning either.

(The default starting parameters are [1, 1]; that is too far from the actual parameters to obtain a good fit.)
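
If you would rather not hand-pick p0, one option is to estimate it from the data. The heuristic below is my own assumption, not part of curve_fit: it takes the location guess from the x whose y is closest to 0.5, and the steepness guess from the fact that a logistic curve has slope b/4 at its midpoint.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, a, b):
    # a: location (x of the midpoint), b: steepness
    return 1 / (1 + np.exp(-b * (x - a)))

xdata = np.array([400, 600, 800, 1000, 1200, 1400, 1600], dtype=float)
ydata = np.array([0, 0, 0.13, 0.35, 0.75, 0.89, 0.91])

# Location guess: the x value whose y is closest to the half-way level 0.5
a0 = xdata[np.argmin(np.abs(ydata - 0.5))]

# Steepness guess: 4 * steepest slope between neighbouring data points
b0 = 4 * np.max(np.diff(ydata) / np.diff(xdata))

popt, pcov = curve_fit(sigmoid, xdata, ydata, p0=[a0, b0])

# One-standard-deviation parameter errors, taken from the covariance matrix
perr = np.sqrt(np.diag(pcov))
print(popt, perr)
```

For this data the guesses come out as a0 = 1000 and b0 of about 0.008, close to the hand-picked starting values above.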