I'm new to NLP and am trying to build an NER model with spaCy. I trained my own NER model for the ORG entity, following https://spacy.io/usage/training#ner. The training set has 100 examples, which look like this:
TRAIN_DATA = [
("2003 -2005 Pergo Inc. Software Analyst\Database Administrator", {"entities": [(11, 20, "ORG")]}),
("PROFESSIONAL EXPERIENCE Client: WPS Health Solutions, Madison, WI Mar17 - Till Date Role: RPA Developer", {"entities": [(32, 52, "ORG")]}),
("Client: National Institutes of Health (NIH/NIAMS), Bethesda, MD Jan15 - Feb17 Role: RPA Developer", {"entities": [(8, 37, "ORG")]}),
("Client: Wells Fargo, Fremont, CA July14 - Dec14 Role: .Net/SharePoint Developer", {"entities": [(8, 19, "ORG")]}),
]
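Since the offsets are written by hand, it is probably worth checking that each (start, end) pair covers exactly the intended text, because an end offset that is off by one would label a truncated name. A minimal sketch of such a check in plain Python, using two of the examples above (the helper name covered_spans and the variable SAMPLE are just illustrative, not part of spaCy):

```python
# Two examples copied from TRAIN_DATA; the full list could be passed instead.
SAMPLE = [
    ("PROFESSIONAL EXPERIENCE Client: WPS Health Solutions, Madison, WI Mar17 - Till Date Role: RPA Developer",
     {"entities": [(32, 52, "ORG")]}),
    ("Client: Wells Fargo, Fremont, CA July14 - Dec14 Role: .Net/SharePoint Developer",
     {"entities": [(8, 19, "ORG")]}),
]


def covered_spans(train_data):
    """Return (label, covered_text) for every annotated span."""
    result = []
    for text, annotations in train_data:
        for start, end, label in annotations["entities"]:
            # Slice the raw text with the annotated character offsets.
            result.append((label, text[start:end]))
    return result


for label, covered in covered_spans(SAMPLE):
    # Each printed string should be the complete organization name.
    print(label, repr(covered))
```

Any span whose printed text is cut short, or includes extra characters, points to an offset that needs correcting before training.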
Now I test sentences with my trained model. If I use a sentence from the training data, I get the correct company name:
doc = nlp('Client: Ananth Technologies Limited, Hyderabad, India Feb11- July12 Role: QA Automation Tester')
print("Organization", [(ent.text, ent.label_) for ent in doc.ents])
Organization [(u'Ananth Technologies Limited', u'ORG')]
But when I pass a new sentence, the name is only partially detected:
doc = nlp('Client: MOUNTAIN HIGH HOME BUILDERS, Loveland, CO Application Engineer 8/03-5/10')
print("Organization", [(ent.text, ent.label_) for ent in doc.ents])
Organization [(u'MOUNTAIN HIGH', u'ORG')]
As I gradually increase the training data, accuracy improves, but at the same time the model starts predicting the wrong words as ORG. My training sentences all look different from each other: the date, designation, location, etc. appear in different positions rather than in a fixed order, as you can see in TRAIN_DATA above. I'm stuck here, and my question is: am I on the right track?
Can anyone suggest ideas for improving my model?
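To make "accuracy increased but wrong words are predicted" measurable, one option is to compare predicted spans against hand-labelled spans on sentences that were not used for training. The following is only a sketch under that assumption: it uses exact-match scoring in plain Python, the helper names span_of and precision_recall are made up for illustration, and the predicted span is typed in from the output shown earlier rather than produced by the model.

```python
def span_of(text, phrase, label):
    """Build a (start, end, label) tuple from the first occurrence of phrase."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def precision_recall(gold, predicted):
    """Exact-match precision and recall over (start, end, label) tuples."""
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    return precision, recall


text = "Client: MOUNTAIN HIGH HOME BUILDERS, Loveland, CO Application Engineer 8/03-5/10"
# What a human annotator would label in this sentence.
gold = [span_of(text, "MOUNTAIN HIGH HOME BUILDERS", "ORG")]
# What the model returned for this sentence in the run shown earlier.
predicted = [span_of(text, "MOUNTAIN HIGH", "ORG")]

print(precision_recall(gold, predicted))
```

With exact matching, a partial detection counts as both a false positive and a miss, so tracking these two numbers separately on held-out sentences would show whether more data is reducing missed names, adding wrong ones, or both.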
Thanks