scikit learn prediction from coef_

Question

I am trying to generate predictions from a fitted model (scikit-learn, a simple linear regression using MultiTaskLasso). I assume coef_ stores the feature weights. With 5 labels and 200 features, it should be a 2D array of shape 5×200. What I did (in Python) is: prediction = np.dot(X_test, coef_.T) + intercept_. But something seems to be wrong. When I switch to scikit-learn's predict(X_test), the result is right. Can anyone tell me what I did wrong?

The only difference is this step: when I use predict, the result is right; when I use my own code, it is wrong.

Can you provide an error traceback? What you are doing is correct if you are using dense matrices. With sparse matrices, this may need to be adapted. — eickenberg
Thanks. I made some modifications and it works now. However, I am not sure what was wrong previously. I guess it is related to NumPy 1D arrays (they always behave as row arrays rather than column arrays). Can you elaborate on sparse matrices? Is there a link I can read? — Kenny

eickenberg · Accepted Answer · 2014-10-12T15:37:07

If predict works, then the linear model's decision_function in sklearn.linear_model works too:

def decision_function(self, X):
    """Decision function of the linear model.

    Parameters
    ----------
    X : {array-like, sparse matrix}, shape = (n_samples, n_features)
        Samples.

    Returns
    -------
    C : array, shape = (n_samples,)
        Returns predicted values.
    """
    X = check_array(X, accept_sparse=['csr', 'csc', 'coo'])
    return safe_sparse_dot(X, self.coef_.T,
                           dense_output=True) + self.intercept_

It does the same thing you propose but handles sparse matrices gracefully. If none of your matrices are sparse, then you should check X_test again.
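To check this on your side, here is a minimal sketch that compares the manual computation with predict. The data is synthetic and the sizes, alpha value, and variable names are assumptions made for illustration, not taken from the question. It also shows the sparse variant via safe_sparse_dot, and a single sample reshaped to 2D, which is one place where 1D arrays can cause confusion.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import MultiTaskLasso
from sklearn.utils.extmath import safe_sparse_dot

# Synthetic data (assumed sizes): 200 features, 5 labels.
rng = np.random.RandomState(0)
n_samples, n_features, n_tasks = 60, 200, 5
X = rng.randn(n_samples, n_features)
W = rng.randn(n_features, n_tasks)
Y = np.dot(X, W) + 0.01 * rng.randn(n_samples, n_tasks)

X_train, Y_train, X_test = X[:40], Y[:40], X[40:]
model = MultiTaskLasso(alpha=0.1).fit(X_train, Y_train)

# coef_ is (n_tasks, n_features); intercept_ is (n_tasks,).
print(model.coef_.shape, model.intercept_.shape)

# Dense case: the formula from the question.
manual = np.dot(X_test, model.coef_.T) + model.intercept_
print(np.allclose(manual, model.predict(X_test)))

# Sparse case: same formula, using the helper the answer quotes.
X_test_sparse = sparse.csr_matrix(X_test)
manual_sparse = safe_sparse_dot(X_test_sparse, model.coef_.T,
                                dense_output=True) + model.intercept_
print(np.allclose(manual_sparse, manual))

# Single sample: reshape the 1D row to (1, n_features) before use.
x_one = X_test[0].reshape(1, -1)
manual_one = np.dot(x_one, model.coef_.T) + model.intercept_
print(manual_one.shape)
```

If the first comparison prints False on your real data, the shapes of X_test, coef_, and intercept_ are the first things worth printing.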
