How do you usually get precision, recall and f-measure from a model created in Vowpal Wabbit on a classification problem?
Are there any available scripts or programs that are commonly used for this with vw's output?
As a minimal example, take the following data in playtennis.txt:
2 | sunny 85 85 false
2 | sunny 80 90 true
1 | overcast 83 78 false
1 | rain 70 96 false
1 | rain 68 80 false
2 | rain 65 70 true
1 | overcast 64 65 true
2 | sunny 72 95 false
1 | sunny 69 70 false
1 | rain 75 80 false
1 | sunny 75 70 true
1 | overcast 72 90 true
1 | overcast 81 75 false
2 | rain 71 80 true
I create the model with:
vw playtennis.txt --oaa 2 -f playtennis.model --loss_function logistic
Then, I get predictions and raw predictions of the trained model on the training data itself with:
vw -t -i playtennis.model playtennis.txt -p playtennis.predict -r playtennis.rawp
Going from here, what scripts or programs do you usually use to get precision, recall and f-measure, given the training data in playtennis.txt and the predictions on the training data in playtennis.predict?
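For concreteness, the per-class computation being asked about can be sketched in Python as follows. This is only a minimal sketch, not an established vw tool: the function names (read_gold, read_pred, prf_per_class) are made up for illustration, and it assumes that the gold label is the first token before the `|` on each data line and that each line of the prediction file starts with the predicted class, written either as `2` or as `2.000000`.

```python
from collections import Counter

def read_gold(lines):
    # Assumption: the label is the first token before '|' in each VW data line.
    return [int(float(l.split("|", 1)[0].split()[0])) for l in lines if l.strip()]

def read_pred(lines):
    # Assumption: each prediction line starts with the predicted class id.
    return [int(float(l.split()[0])) for l in lines if l.strip()]

def prf_per_class(gold, pred):
    """Return {class: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true class but was missed
    scores = {}
    for c in sorted(set(gold) | set(pred)):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[c] = (prec, rec, f1)
    return scores
```

With the files above, this would be called as prf_per_class(read_gold(open("playtennis.txt")), read_pred(open("playtennis.predict"))).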
Also, if this were a multi-label classification problem (each instance can have more than one target label, which vw can also handle), would your proposed scripts or programs be able to process it?
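To make the multi-label case concrete: one common way to score it is micro-averaging, where true positives, false positives and false negatives are summed over all instances before computing the ratios. The sketch below assumes the gold and predicted labels of each instance have already been parsed into Python sets; how they would be read from vw's files is deliberately left out, and micro_prf is a hypothetical helper name.

```python
def micro_prf(gold_sets, pred_sets):
    """Micro-averaged (precision, recall, f1) over per-instance label sets."""
    tp = fp = fn = 0
    for g, p in zip(gold_sets, pred_sets):
        tp += len(g & p)  # labels predicted and correct
        fp += len(p - g)  # labels predicted but not in the gold set
        fn += len(g - p)  # gold labels that were not predicted
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1
```

For example, micro_prf([{1, 2}, {3}], [{1}, {3, 4}]) counts 2 true positives, 1 false positive and 1 false negative.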