I am using Stanford NER 3.6.0 to identify person names. I have no problem generating an XML file from either an input text file or an input XML file.
The problem is reading the XML file that NER returns.
The two errors I am getting now are:
1. Name cannot begin with the ' ' character, hexadecimal value 0xA0.
2. Unexpected XML declaration. The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it.
I'm generating the XML output by running the JAR file from the command prompt.
Command line:
java -mx1000m -cp "D:/Downloads/Projects/Installations/stanford-ner-2015-12-09/stanford-ner.jar;D:/Downloads/Projects/Installations/stanford-ner-2015-12-09/lib/*" edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier "D:/Downloads/Projects/Installations/stanford-ner-2015-12-09/classifiers/english.conll.4class.distsim.crf.ser.gz" -outputFormat inlineXML -textFile "C:\Users\Freeware Sys\AppData\Local\Temp\References (2)_in.txt" > "C:\Users\Freeware Sys\AppData\Local\Temp\References (2)_ner.xml" -inputEncoding "UTF-8" -outputEncoding "UTF-8"
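For context, here is a minimal sketch of the kind of reading step that would avoid both errors. It assumes the inlineXML output is the original text with entities wrapped in tags such as <PERSON>...</PERSON>, and that it is not necessarily a well-formed XML document. The function name extract_entities and the sample string are hypothetical, made up for illustration. The tags are matched with a regular expression instead of being passed to an XML parser.

```python
import re

# Matches one inline entity tag, e.g. <PERSON>John Smith</PERSON>.
# The four labels are the ones the CoNLL 4-class model emits.
ENTITY_RE = re.compile(
    r"<(PERSON|ORGANIZATION|LOCATION|MISC)>(.*?)</\1>", re.DOTALL
)


def extract_entities(text, label="PERSON"):
    """Return the text of every inline entity carrying the given label."""
    results = []
    for match in ENTITY_RE.finditer(text):
        if match.group(1) != label:
            continue
        # Turn non-breaking spaces (U+00A0) into plain spaces,
        # then collapse any run of whitespace into a single space.
        cleaned = " ".join(match.group(2).replace("\u00a0", " ").split())
        results.append(cleaned)
    return results


# Hypothetical sample: a stray XML declaration and a non-breaking space,
# the two things the parser errors complain about.
sample = (
    '<?xml version="1.0"?> Written by <PERSON>John\u00a0Smith</PERSON> '
    "at <ORGANIZATION>Stanford</ORGANIZATION>."
)
print(extract_entities(sample))
```

Because nothing is parsed as XML here, a misplaced XML declaration or a 0xA0 character in the text does not stop the names from being read. What I would still like to know is whether the file can be made loadable by a regular XML parser.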
Any help would be much appreciated.
Thanks.