Is there a way in Python 2.7, using NLTK, to get just the word and not the extra formatting that includes "Synset", the parentheses, and the "n.01" suffix?
For instance, if I do
from nltk.corpus import wordnet as wn
wn.synsets('dog')
my results look like:
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
How can I instead get a list like this?
dog
frump
cad
frank
pawl
andiron
chase
Is there a way to do this using NLTK, or do I have to use regular expressions? Can I use regular expressions within a Python script?
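For what it's worth, here is a rough sketch of the two approaches I am asking about, plain string splitting and a regular expression via Python's built-in re module. It works on the name strings copied from the output above rather than on real Synset objects, so it runs without NLTK. The helper names (head_word, head_word_re, unique_in_order) are made up for illustration. As an assumption on my part: with NLTK itself, the name string should be available from each synset as synset.name() in NLTK 3, or as the attribute synset.name in older 2.x releases.

```python
import re

# Synset names copied from the output above. With NLTK these would come
# from the Synset objects themselves (assumed: synset.name() in NLTK 3).
synset_names = ['dog.n.01', 'frump.n.01', 'dog.n.03', 'cad.n.01',
                'frank.n.02', 'pawl.n.01', 'andiron.n.01', 'chase.v.01']

def head_word(synset_name):
    """Return the word part of a name such as 'dog.n.01'."""
    # Split from the right so only the POS tag and sense number are dropped.
    return synset_name.rsplit('.', 2)[0]

def head_word_re(synset_name):
    """Same result as head_word, using a regular expression."""
    match = re.match(r"(.+)\.[a-z]\.\d+$", synset_name)
    return match.group(1) if match else synset_name

def unique_in_order(items):
    """Drop repeats (e.g. the second 'dog') while keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

words = unique_in_order(head_word(name) for name in synset_names)
print(words)
```

Removing duplicates is what turns the eight synsets into the seven words listed above, since 'dog' appears twice.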
frank and chase should not be part of the desired output? - Braj
frank is a synonym/shorthand for frankfurter, which is a synonym for hot dog or dog. Similarly, dog as a verb means to chase. - aelfric5578