I know Apache OpenNLP uses a MaxEnt model for its NER tagger. But which features does Apache OpenNLP use by default when running its named entity recognition (NER) models? And how can we add or customize features in OpenNLP (Java implementation)?
1 Answer
Apache OpenNLP's name finder lets you define its features in an XML feature generator descriptor. The default XML is this:
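The descriptor itself did not survive in this post, so what follows is a reconstruction from the OpenNLP 1.x manual as I recall it, not a copy of the file shipped with any particular release. Treat the element names and window sizes as assumptions to check against the documentation for your version; newer releases use a different descriptor syntax.

```xml
<generators>
  <cache>
    <generators>
      <!-- token class (capitalization, digits, ...) for 2 tokens either side -->
      <window prevLength="2" nextLength="2">
        <tokenclass/>
      </window>
      <!-- the token text itself for 2 tokens either side -->
      <window prevLength="2" nextLength="2">
        <token/>
      </window>
      <!-- outcome prior feature -->
      <definition/>
      <!-- outcome previously assigned to the same token -->
      <prevmap/>
      <!-- token and token-class bigrams -->
      <bigram/>
      <!-- marks the first token of a sentence -->
      <sentence begin="true" end="false"/>
    </generators>
  </cache>
</generators>
```

A customized descriptor would typically start from a file like this and add or remove generator elements inside the inner generators block.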
If you want to customize it, use the -featuregen option when you train the model:
$ opennlp TokenNameFinderTrainer -featuregen your-features-definition.xml -model my-model.bin ...
You don't need to pass your customized feature XML file again when you run TokenNameFinder, because the trained model file already stores your feature definition.