Using Wordnet Synsets from Python for Italian Language

Question

I'm starting to program with NLTK in Python for Natural Italian Language processing. I've seen some simple examples of the WordNet Library that has a nice set of SynSet that permits you to navigate from a word (for example: "dog") to his synonyms and his antonyms, his hyponyms and hypernyms and so on...

My question is: If I start with an italian word (for example:"cane" - that means "dog") is there a way to navigate between synonyms, antonyms, hyponyms... for the italian word as you do for the english one? Or... There is an Equivalent to WordNet for the Italian Language ?

Thanks in advance

alexis alexis · Accepted Answer · 2017-05-11T13:44:53

You are in luck. The nltk provides an interface to the Open Multilingual Wordnet, which does indeed include Italian among the languages it describes. Just add an argument specifying the desired language to the usual wordnet functions, e.g.:

>>> cane_lemmas = wn.lemmas("cane", lang="ita")
>>> print(cane_lemmas)
[Lemma('dog.n.01.cane'), Lemma('cramp.n.02.cane'), Lemma('hammer.n.01.cane'),
 Lemma('bad_person.n.01.cane'), Lemma('incompetent.n.01.cane')]

The synsets have English names, because they are integrated with the English wordnet. But you can navigate the web of meanings and extract the Italian lemmas for any synset you want:

>>> hypernyms = cane_lemmas[0].synset().hypernyms()
>>> print(hypernyms)
[Synset('canine.n.02'), Synset('domestic_animal.n.01')]
>>> print(hypernyms[1].lemmas(lang="ita"))
[Lemma('domestic_animal.n.01.animale_addomesticato'), 
 Lemma('domestic_animal.n.01.animale_domestico')]

Or since you mentioned "cattiva_persona" in the comments:

>>> wn.lemmas("bad_person")[0].synset().lemmas(lang="ita")
[Lemma('bad_person.n.01.cane'), Lemma('bad_person.n.01.cattivo')]

I went from the English lemma to the language-independent synset to the Italian lemmas.

Using Wordnet Synsets from Python for Italian Language

2 Answers