I have a script that replaces a word with a synonym using NLTK and WordNet. As far as I can tell, the most effective way to find a synonym is by lemmatizing, but that removes conjugation from the process.

For example, say I want to replace "bored" with "drilled"...

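from nltk.corpus import wordnet as wn  # needs the corpus: nltk.download('wordnet')

word = 'bored'
syns = []
# Collect every lemma name from every synset that contains the word
for synset in wn.synsets(word):
    for name in synset.lemma_names():
        syns.append(name)

set(syns)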

Output:

{'blase', 'bore', 'bored', 'drill', 'tire', 'world-weary'}

I can use some POS filtering to make sure I only return verbs, but they won't be conjugated appropriately. I can get "bore", "drill" and "tire" ... how do I get "bored", "drilled" and "tired"? Or, if I do nouns, what if I want "bores", "drills" or "tires"?

(I will be going over these manually, so meaning is not an issue right now.)

I'm not sure there's an easy way to do this with NLTK. You could use LemmInflect (my project) to inflect the synonym based on the Penn Treebank tag of the original word. - bivouac0

1 Answer

This is a task for surface realisation. After lemmatization and finding an appropriate synonym, you can inflect the lemma using, e.g., SimpleNLG or another surface realiser of your choice. What you need to do is check the inflection type (e.g. third person singular, past tense) of the original word and restore it on the synonym using the functions of the surface realisation module.
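
To make the check-and-restore idea concrete, here is a minimal sketch in plain Python. It is only a toy: it covers regular English suffixes (-s, -ed, -ing), ignores irregular forms and consonant doubling (e.g. "stop" -> "stopped"), and does not use SimpleNLG or any other library. The function names `detect_inflection`, `inflect` and `transfer_inflection` are made up for this illustration; in practice a real surface realiser or inflection library would take the place of `inflect`.

```python
VOWELS = "aeiou"


def _ends_in_consonant_y(lemma):
    """True for lemmas like 'carry' or 'try', False for 'play'."""
    return len(lemma) >= 2 and lemma.endswith("y") and lemma[-2] not in VOWELS


def detect_inflection(word, lemma):
    """Guess which regular suffix turns `lemma` into `word` (hypothetical helper)."""
    if word == lemma:
        return "base"
    if word.endswith("ing"):
        return "ing"
    if word.endswith("ed"):
        return "ed"
    if word.endswith("s"):
        return "s"
    return "base"  # irregular or unknown form: fall back to the bare lemma


def inflect(lemma, feature):
    """Apply a regular English suffix to `lemma` (toy rules, assumption-laden)."""
    if feature == "base":
        return lemma
    if feature == "s":
        if lemma.endswith(("s", "x", "z", "ch", "sh")):
            return lemma + "es"
        if _ends_in_consonant_y(lemma):
            return lemma[:-1] + "ies"
        return lemma + "s"
    if feature == "ed":
        if lemma.endswith("e"):
            return lemma + "d"
        if _ends_in_consonant_y(lemma):
            return lemma[:-1] + "ied"
        return lemma + "ed"
    if feature == "ing":
        if lemma.endswith("e") and not lemma.endswith("ee"):
            return lemma[:-1] + "ing"
        return lemma + "ing"
    raise ValueError(f"unknown feature: {feature!r}")


def transfer_inflection(word, lemma, synonym_lemma):
    """Read the inflection off the original word and restore it on the synonym."""
    return inflect(synonym_lemma, detect_inflection(word, lemma))


for synonym in ("drill", "tire"):
    print(transfer_inflection("bored", "bore", synonym))  # past form
    print(transfer_inflection("bores", "bore", synonym))  # -s form
```

With the lemma "bore" and the synonyms from the question, this yields "drilled", "drills", "tired" and "tires". Anything irregular (e.g. "ran" -> "sprinted") needs a proper morphological resource rather than suffix rules.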