
I have a model equation, let's call it eq_m:

(equation image for eq_m, a nonlinear function of x with parameters lambda, mu, and sigma)

I know my dataset follows this model, and I am trying to fit my data to eq_m so that I can use the fitted parameters to predict new data.

However, eq_m is nonlinear, so I used scipy's curve_fit to get the lambda, mu, and sigma parameter values, using the following snippet:

import scipy.optimize as o

opt_parms, parm_cov = o.curve_fit(eq_m, x, y, maxfev=50000)
lamb, mu, sigm = opt_parms

I run this model on various groups of data that are all supposed to follow it. 55 out of 60 groups give me great results, but the remaining 5 are badly overfitted and yield parameters with very large positive values. Is there a way I can regularize the curve fit and penalize high-magnitude parameter values using scipy/numpy or scikit-learn?

My supervisor suggested using conjugate priors, but I have no idea how to do that here.

Can anyone please help me with this? If I have to provide an initial guess to solve this problem, can someone also tell me how to calculate these guesses?


1 Answer


curve_fit does not support regularization; it always uses a least-squares cost function. In order to regularize the fit, you need to write a custom cost function and minimize it with scipy.optimize.minimize.

Let's first translate the curve fitting into a minimization problem:

import numpy as np
import scipy.optimize as o

def eq_m(x, lamb, mu, sigm):  # assumed signature of eq_m
    pass

def cost(params):  # uses the globally defined x and y arrays
    lamb, mu, sigm = params
    model = eq_m(x, lamb, mu, sigm)
    return np.mean((model - y)**2)  # quadratic (least-squares) cost

p0 = [1, 0, 1]  # initial guess for lambda, mu, and sigma
res = o.minimize(cost, p0)
print(res)  # check whether the minimization succeeded
lamb, mu, sigm = res.x

This will hopefully give you results similar to curve_fit. (If it doesn't, it's time to start debugging.)

Now we can play with the cost function to implement regularization:

def cost(params):
    lamb, mu, sigm = params
    model = eq_m(x, lamb, mu, sigm)
    reg = lamb**2 + mu**2 + sigm**2  # simple quadratic penalty: larger parameters -> higher cost
    regweight = 1.0  # relative importance of regularization vs. goodness of fit
    return np.mean((model - y)**2) + regweight * reg
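
You then minimize this regularized cost exactly as before (res = o.minimize(cost, p0)); only the cost function changes.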

There is no strict need to penalize the parameters quadratically. Basically, you can use any penalty, as long as large parameters increase the cost. Results will vary :-)
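
For example, here is a minimal sketch of an L1-style penalty (absolute values instead of squares), reusing the same global x and y and the eq_m stub from above:

def cost_l1(params):
    lamb, mu, sigm = params
    model = eq_m(x, lamb, mu, sigm)
    reg = abs(lamb) + abs(mu) + abs(sigm)  # L1-style penalty on the parameter magnitudes
    regweight = 1.0
    return np.mean((model - y)**2) + regweight * reg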

All this is a very ad-hoc approach that lacks a rigorous theoretical basis. The supervisor's suggestion to use conjugate priors sounds like they expect you to use Bayesian estimation techniques. Although certain priors can be considered equivalent to regularization, the approach is altogether different and can be mathematically rather involved. Instead of a cost function you would define a likelihood function, put priors on the parameters, and combine them using Bayes' rule to get the posterior distribution, which you finally maximize.
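
As a very rough sketch of that last step (not conjugate priors specifically, whose form depends on eq_m), assume Gaussian observation noise and independent zero-mean Gaussian priors on the parameters; maximizing the posterior then amounts to minimizing a negative log-posterior that reduces to least squares plus a quadratic penalty:

def neg_log_posterior(params):
    lamb, mu, sigm = params
    model = eq_m(x, lamb, mu, sigm)
    noise_var = 1.0   # assumed variance of the Gaussian observation noise
    prior_var = 10.0  # assumed variance of the zero-mean Gaussian priors
    nll = np.sum((model - y)**2) / (2 * noise_var)       # negative log-likelihood
    nlp = (lamb**2 + mu**2 + sigm**2) / (2 * prior_var)  # negative log-prior
    return nll + nlp

res = o.minimize(neg_log_posterior, [1, 0, 1])
lamb_map, mu_map, sigm_map = res.x  # MAP estimates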