2
votes

Does anybody know if there are any libraries for named entity recognition that are language independent?

Thanks

1
An off-the-shelf tool that supports all of the world's languages certainly does not exist. Are you looking for something that you can train on your own data, or a tool that handles a bunch of languages? If the latter, which languages? - Fred Foo
I'd like to develop a tool for my own language, Macedonian. I know that there can't be a library that is fully language independent, but maybe there are some that would be of a little help to me :) - vikifor
You can take any of the good ones and retrain it on Macedonian tagged data. This is quite commonly done with Stanford's CRF-NER, and although the result will be far from perfect, it's often good enough as a baseline. - Fred Foo

1 Answer

2
votes

I doubt it.

In theory you can use purely supervised learning techniques if you have a large annotated corpus. However, if you can't use language-dependent rules, heuristics, or features, and you are looking for high precision and recall, the corpus would have to be mammoth. I'd dare say that there is probably not enough annotated data for any given human language for such a task.
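
To make "no language-dependent features" concrete, here is a minimal sketch of the kind of per-token features a supervised tagger could still use without gazetteers or hand-written rules. The function names (`word_shape`, `token_features`) and the feature set are my own assumptions for illustration, not taken from any particular library, and the output would still have to be fed to a sequence model trained on annotated data.

```python
import unicodedata


def word_shape(token):
    """Collapse a token into a coarse shape using Unicode categories only.

    Uppercase letters become 'X', other letters 'x', decimal digits 'd';
    everything else is kept as-is. Runs of the same symbol are merged.
    """
    shape = []
    for ch in token:
        cat = unicodedata.category(ch)
        if cat == "Lu":
            sym = "X"
        elif cat.startswith("L"):
            sym = "x"
        elif cat == "Nd":
            sym = "d"
        else:
            sym = ch
        # Merge consecutive identical symbols so length does not matter.
        if not shape or shape[-1] != sym:
            shape.append(sym)
    return "".join(shape)


def token_features(tokens, i):
    """Hypothetical feature dict for tokens[i]; no language-specific resources."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "shape": word_shape(tok),
        "prefix3": tok[:3],
        "suffix3": tok[-3:],
        "is_first": i == 0,
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


if __name__ == "__main__":
    sentence = ["Живеам", "во", "Скопје"]
    for idx in range(len(sentence)):
        print(token_features(sentence, idx))
```

Features like these carry over between scripts because they rely only on Unicode character classes and surface strings, which is also why they are weak on their own: the tagger has to learn from the annotated corpus everything that rules or name lists would otherwise supply.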