
I have trained a Multinomial Naive Bayes classifier using [scikit-learn][1]. I can now classify test data by calling predict.

But if I want to run this every night as a script, I clearly need a classifier that is already trained. What I'd like to do is take the classifier's coefficients and most informative words and use them to classify new data.

Is it possible to develop my own method for classification this way? Or should I simply retrain the scikit-learn classifier nightly?

EDIT: One thing it seems I can do is save my trained classifier and reuse it.

However, with logistic regression you can take the coefficients and apply them to new data. Is there anything similar for NB?

I don't understand why you wouldn't just want to save and load the classifier as necessary. - aplassard

What do you mean by "take the coefficients and use these on new data"? All ML algorithms are designed to be trained and then used to predict outputs on new, unseen data. - Artem Sobolev

1 Answer


Do you mean [sklearn] (scikit-learn), and are you using Python? If so, note that [get_params(deep=True)] and [set_params(**params)] only read and write the estimator's constructor hyperparameters (for MultinomialNB, values such as alpha and fit_prior), not what the model learned during fit(). The learned state lives in fitted attributes such as class_log_prior_ and feature_log_prob_, so the simplest way to reuse a trained model is to save the whole fitted estimator.

Therefore, a possible procedure could be:

Training stage:

1) Train the model

2) Save the fitted model (together with the fitted vectorizer, if you use one) into a binary file (e.g. by using pickle.dump())

Prediction stage:

1) Load the fitted model from the binary file (e.g. by using pickle.load())

2) Classify new data by using the predict() function
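
As a minimal sketch of the save/load workflow, assuming scikit-learn with a bag-of-words CountVectorizer (the toy texts, labels, helper names and file name below are made up for illustration, not taken from the question). It pickles the whole fitted vectorizer and model, because get_params() returns only constructor hyperparameters and not the learned values. The last helper shows the Naive Bayes analogue of logistic regression coefficients: scoring new data by hand from feature_log_prob_ and class_log_prior_.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training data, made up for illustration only.
train_texts = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "project meeting notes attached",
]
train_labels = ["spam", "spam", "ham", "ham"]


def train_and_save(path):
    """Training stage: fit once, then pickle the fitted vectorizer and model."""
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(train_texts)
    model = MultinomialNB().fit(X, train_labels)
    with open(path, "wb") as fh:
        pickle.dump((vectorizer, model), fh)


def load_and_predict(path, texts):
    """Prediction stage (e.g. the nightly script): load and classify, no retraining."""
    with open(path, "rb") as fh:
        vectorizer, model = pickle.load(fh)
    return vectorizer, model, model.predict(vectorizer.transform(texts))


def manual_predict(vectorizer, model, texts):
    """Classify by hand from the learned log-probabilities."""
    X = vectorizer.transform(texts)
    # Joint log-likelihood: counts @ log P(word | class) + log P(class)
    scores = X @ model.feature_log_prob_.T + model.class_log_prior_
    return model.classes_[np.argmax(np.asarray(scores), axis=1)]


# Hypothetical file location; a real script would use a fixed path.
model_path = os.path.join(tempfile.mkdtemp(), "nb_model.pkl")
train_and_save(model_path)

new_texts = ["buy pills now", "notes for monday meeting"]
vectorizer, model, predicted = load_and_predict(model_path, new_texts)
manual = manual_predict(vectorizer, model, new_texts)
```

Two caveats: only unpickle files you created yourself, and load them with the same scikit-learn version that wrote them, since pickles are not guaranteed to be compatible across versions.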

Hope that helps.