0 votes

I want to compare two words with a similarity score. I used WordNet from nltk.corpus.

import nltk
from nltk.corpus import wordnet

nltk.download('wordnet')
w1 = wordnet.synset('price.n.01')
w2 = wordnet.synset('amount.n.01')
print(w1.wup_similarity(w2))

I get a similarity score, but this only works between nouns. What I need is to compare a noun with an adjective or another part of speech.

For example, I need to compare a word like "expensive" (an adjective) with "price" (a noun).
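To illustrate the problem, here is a rough sketch of what happens in WordNet across parts of speech (the synset name expensive.a.01 is my guess; check wordnet.synsets("expensive") on your install):

from nltk.corpus import wordnet

adj = wordnet.synset('expensive.a.01')    # adjective synset
noun = wordnet.synset('price.n.01')       # noun synset

# Adjectives have no hypernym hierarchy, so depending on the NLTK version this
# returns None (or raises) instead of a usable score
print(adj.wup_similarity(noun))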

I would prefer a library with a pre-trained model, because I need one that works with any word in any domain.

What about word embeddings?


1 Answer

1 vote

I think you could try to find the word similarity with GloVe pre-trained embeddings. They are rich in information and were trained on very large corpora (the standard vectors on Wikipedia plus Gigaword, or on Common Crawl). Your words do have to be in its vocabulary (i.e. the words it was trained on), but that vocabulary is very large and will cover almost every common English word. With GloVe embeddings, the cosine similarity between two word vectors gives a good measure of how close they are in sense or meaning, i.e.

from scipy import spatial

# Toy vectors standing in for two word embeddings
dataSetI = [3, 45, 7, 2]
dataSetII = [2, 54, 13, 15]

# Cosine similarity = 1 - cosine distance
result = 1 - spatial.distance.cosine(dataSetI, dataSetII)
print(result)
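If you don't want to handle the raw GloVe text files yourself, one option (a rough sketch; the gensim package and the "glove-wiki-gigaword-100" model name are my own choices, not part of the snippet above) is gensim's downloader:

import gensim.downloader as api

# The first call downloads the pre-trained GloVe vectors
glove = api.load("glove-wiki-gigaword-100")

# Cosine similarity between the two word vectors, regardless of part of speech
print(glove.similarity("expensive", "price"))
print(glove.similarity("expensive", "amount"))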

If you wish to use word2vec word vectors instead, you can use spaCy as well.
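For example, a minimal sketch (it assumes the en_core_web_md model, which ships with word vectors, is installed via python -m spacy download en_core_web_md):

import spacy

nlp = spacy.load("en_core_web_md")   # a model that includes word vectors

doc = nlp("expensive price")
# Token.similarity returns the cosine similarity of the two token vectors
print(doc[0].similarity(doc[1]))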