I have a problem adding my own features to sklearn.linear_model.LogisticRegression. Here is some example code:
from sklearn.linear_model import LogisticRegression, LinearRegression
import numpy as np
# The numbers are the class tags
resultsNER = np.array([1,2,3,4,5])
# According to resultsNER, every row belongs to a different class and so should have different features,
# but written this way every row has the same features
xNER = np.array([[1.,0.,0.,0.,-1.,1.],
[1.,0.,1.,0.,0.,1.],
[1.,1.,1.,1.,1.,1.],
[0.,0.,0.,0.,0.,0.],
[1.,1.,1.,0.,0.,0.]])
# Assign resultsNER to y
y = resultsNER
# Create the logistic regression model
logit = LogisticRegression(C=1.0)
# Fit the model
logit.fit(xNER,y)
# A test vector to check which class will be predicted
xPP = np.array([1.,1.,1.,0.,0.,1.])
#linear = LinearRegression()
#linear.fit(x, y)
print("expected: ", y)
# predict expects a 2-D array with one row per sample
print("predicted:", logit.predict(xPP.reshape(1, -1)))
print("decision: ", logit.decision_function(xNER))
print(logit.coef_)
#print(linear.predict(x))
print("params: ", logit.get_params(deep=True))
The code above is straightforward. I have some classes, which I labeled 1, 2, 3, 4, 5 (resultsNER); they correspond to classes such as "date", "person", "organization", etc. For each class I write custom features that return true or false, encoded here as one and zero. Example: if the token matches "(S|s)unday", it belongs to the date class. Mathematically this is clear: for a given token, I evaluate the features of each class. Then I check which class has the maximum weighted sum of its features (that is why the features return numbers rather than booleans) and pick it. In other words, I use the argmax function. Of course, each feature has its own alpha coefficient in the sum. This is multiclass classification, so I need to know how to add multiclass features to sklearn.LogisticRegression.
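A minimal sketch of the scoring rule described above, in plain Python. The feature functions, the class names used as keys, and the alpha values are all hypothetical and chosen by hand only to illustrate the argmax step; nothing here is learned from data.

```python
import re

# Hypothetical per-class feature functions: each returns 1.0 or 0.0 for a token.
FEATURES = {
    "date": [lambda t: float(re.fullmatch(r"[Ss]unday", t) is not None)],
    "person": [lambda t: float(t.istitle()),
               lambda t: float(t in {"Alice", "Bob"})],
    "organization": [lambda t: float(t.isupper())],
}

# Hand-picked alpha coefficients, one per feature (assumed, not fitted).
ALPHAS = {
    "date": [2.0],
    "person": [0.5, 1.5],
    "organization": [1.0],
}

def score(token, label):
    """Weighted sum of the features that belong to one class."""
    return sum(a * f(token) for a, f in zip(ALPHAS[label], FEATURES[label]))

def classify(token):
    """Pick the class whose weighted feature sum is largest (argmax)."""
    return max(FEATURES, key=lambda label: score(token, label))

for token in ["Sunday", "Alice", "IBM"]:
    print(token, "->", classify(token))
```

In this sketch "Sunday" also fires the capitalization feature of the person class, but the date score is larger, so the argmax still selects the date class.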
I need two things: the alpha coefficients, and a way to add my own features to logistic regression. Most important for me is how to add my own feature functions for each class to sklearn.LogisticRegression.
I know I can compute the coefficients by gradient descent. But I think that when I call fit(x, y), LogisticRegression uses some algorithm to compute the coefficients, which I can then read from the .coef_ attribute.
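As a hedged illustration of what the fitted coefficients look like, the sketch below refits the model on the data from the question and inspects coef_ and intercept_. It assumes a reasonably recent scikit-learn and Python 3. The manual score computation only shows how the coefficients relate to the predicted class; it is not a description of the solver that fit() runs internally.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same toy data as in the question: five samples, one per class.
y = np.array([1, 2, 3, 4, 5])
X = np.array([[1., 0., 0., 0., -1., 1.],
              [1., 0., 1., 0., 0., 1.],
              [1., 1., 1., 1., 1., 1.],
              [0., 0., 0., 0., 0., 0.],
              [1., 1., 1., 0., 0., 0.]])

logit = LogisticRegression(C=1.0).fit(X, y)

# One row of coefficients per class, one column per feature.
print(logit.coef_.shape)       # (5, 6)
print(logit.intercept_.shape)  # (5,)

# Per-class linear scores: weighted sum of the features plus an intercept.
scores = X @ logit.coef_.T + logit.intercept_

# The predicted class is the one with the largest score (argmax).
manual = logit.classes_[np.argmax(scores, axis=1)]
predicted = logit.predict(X)
print(manual)
print(predicted)
```

Each row of coef_ plays the role of the alpha coefficients for one class, applied to the same shared feature columns.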
So in the end, my main question is how to add custom features for the different classes, in my example classes 1, 2, 3, 4, 5 (resultsNER).
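One possible way to approach this, sketched under assumptions: scikit-learn estimators take a single feature matrix whose columns are shared by all classes, so each custom feature function can become one column, and fitting then learns one weight per (class, feature) pair. The feature functions, tokens, and labels below are made up for illustration and are not from my real data.

```python
import re
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical feature functions; each was written with one class in mind.
def is_weekday(token):       # aimed at the "date" class
    pattern = r"(?i)(mon|tues|wednes|thurs|fri|satur|sun)day"
    return re.fullmatch(pattern, token) is not None

def is_capitalized(token):   # aimed at "person" and "organization"
    return token[:1].isupper()

def has_org_suffix(token):   # aimed at the "organization" class
    return token.endswith(("Inc", "Corp", "Ltd"))

FEATURES = [is_weekday, is_capitalized, has_org_suffix]

def featurize(tokens):
    """Return an (n_tokens, n_features) matrix of 0.0 / 1.0 values."""
    return np.array([[float(f(t)) for f in FEATURES] for t in tokens])

# Made-up training tokens and labels (1 = date, 2 = person, 3 = organization).
tokens = ["Sunday", "monday", "Alice", "Bob", "AcmeCorp", "GlobexInc"]
labels = np.array([1, 1, 2, 2, 3, 3])

X = featurize(tokens)
clf = LogisticRegression(C=1.0).fit(X, labels)

print(X)
print(clf.coef_.shape)  # one row of weights per class, one column per feature
print(clf.predict(featurize(["friday"])))
```

With this layout a feature intended for one class is still visible to every class; the fitted weights in coef_ decide how much it counts for each of them.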