I followed this spaCy tutorial to train on a custom dataset. My dataset is a gazetteer, so I built my training data as follows.

TRAIN_DATA = [
("Where is Abbess",{"entities":[(9, 15,"GPE")]}),
("Where is Abbey Pass",{"entities":[(9, 19,"LOC")]}),
("Where is Abbot",{"entities":[(9, 14,"GPE")]}),
("Where is Abners Head",{"entities":[(9, 20,"LOC")]}),
("Where is Acheron Flat",{"entities":[(9, 21,"LOC")]}),
("Where is Acheron River",{"entities":[(9, 22,"LOC")]})
]

I used 'en_core_web_sm' for the training, not a blank model.

from pathlib import Path

model = 'en_core_web_sm'
output_dir = Path(path)  # path is defined elsewhere in the script
n_iter = 20

After training for 20 epochs, I ran a prediction with the trained model. This is the output I get.

test_text = "Seven people, including teenagers, have been taken to hospital after their car crashed in the mid-Canterbury town of Rakaia."

Seven people, including teenagers 0 33 GPE
the mid-Canterbury town of Rakaia. 90 125 GPE

I also ran the original 'en_core_web_sm' on the same test_text. The output is the following.

Seven 0 5 CARDINAL
mid-Canterbury 94 108 DATE
Rakaia 117 123 GPE

Can someone please point out the mistakes I am making when training spaCy?

1 Answer

The poor results are caused by a problem called catastrophic forgetting. You can get more information here.

tl;dr

As you continue training your en_core_web_sm model on the new examples alone, it forgets what it previously learnt.

To make sure the old learnings are not forgotten, you also need to feed the model examples of the other entity types during retraining. This keeps the model from skewing itself towards the new data and predicting everything as the entity type being trained.
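One way to get such examples without hand-labelling them is sometimes called pseudo-rehearsal: run the original model over some ordinary sentences before you update it, keep its predictions as annotations, and mix them into your training data. The following is only a minimal sketch of that idea, not an official recipe. The helper names make_revision_data and mix_training_data and the revision_texts list are hypothetical; the output uses the same tuple format as your TRAIN_DATA.

```python
import random


def make_revision_data(nlp, texts):
    """Label raw texts with the model's current predictions.

    nlp is expected to be a loaded spaCy pipeline (for example en_core_web_sm)
    that has NOT yet been updated on the new examples.
    """
    revision = []
    for doc in nlp.pipe(texts):
        # Keep each predicted entity as (start_char, end_char, label).
        spans = [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
        revision.append((doc.text, {"entities": spans}))
    return revision


def mix_training_data(new_data, revision_data, seed=0):
    """Combine new and revision examples and shuffle them together."""
    mixed = list(new_data) + list(revision_data)
    random.Random(seed).shuffle(mixed)
    return mixed


# Usage sketch (needs spaCy and the en_core_web_sm package installed):
# import spacy
# nlp = spacy.load("en_core_web_sm")
# revision_texts = [...]  # hypothetical: ordinary sentences, e.g. news text
# revision_data = make_revision_data(nlp, revision_texts)  # before training
# train_data = mix_training_data(TRAIN_DATA, revision_data)
```

How many revision examples to add per new example is a judgment call; more of them pulls the model harder towards its old behaviour. It is worth comparing predictions on a sentence like your test_text before and after training to check.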

You can read about possible solutions here.