1
votes

The Google Cloud Natural Language API can be used to analyse text and return a syntactic parse tree in which each word is labeled with a part-of-speech tag.

Is there a way to determine whether a noun is plural or not?

If Google Cloud NL is able to work out the lemma then perhaps the information is there but not returned through the API?

2 Answers

3
votes

Update

With the NL API's GA launch, the annotateText endpoint now returns a number key inside each token's partOfSpeech, indicating whether the word is singular, plural, or dual. For the sentence "There are some cats here," the API returns the following token data for 'cats' (note that number is PLURAL):

{
      "text": {
        "content": "cats",
        "beginOffset": -1
      },
      "partOfSpeech": {
        "tag": "NOUN",
        "aspect": "ASPECT_UNKNOWN",
        "case": "CASE_UNKNOWN",
        "form": "FORM_UNKNOWN",
        "gender": "GENDER_UNKNOWN",
        "mood": "MOOD_UNKNOWN",
        "number": "PLURAL",
        "person": "PERSON_UNKNOWN",
        "proper": "PROPER_UNKNOWN",
        "reciprocity": "RECIPROCITY_UNKNOWN",
        "tense": "TENSE_UNKNOWN",
        "voice": "VOICE_UNKNOWN"
      },
      "dependencyEdge": {
        "headTokenIndex": 1,
        "label": "DOBJ"
      },
      "lemma": "cat"
}
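
As a rough illustration of how that field might be consumed, here is a minimal Python sketch that works on the already-parsed JSON of an annotateText response. The helper name `plural_nouns` is hypothetical, and the `sample` data is a hand-written, trimmed stand-in for a real response (the "cats" token follows the output above; the "here" token is made up for contrast). It assumes the tokens sit under a top-level `tokens` list.

```python
import json

def plural_nouns(response):
    """Return the text of every token tagged as a plural noun.

    `response` is the parsed JSON body of an annotateText call
    (hypothetical helper, not part of any client library).
    """
    result = []
    for token in response.get("tokens", []):
        pos = token.get("partOfSpeech", {})
        # Only nouns whose morphological number is PLURAL qualify.
        if pos.get("tag") == "NOUN" and pos.get("number") == "PLURAL":
            result.append(token["text"]["content"])
    return result

# Hand-written stand-in for a real response, trimmed to the relevant fields.
sample = json.loads("""
{
  "tokens": [
    {
      "text": {"content": "cats", "beginOffset": -1},
      "partOfSpeech": {"tag": "NOUN", "number": "PLURAL"},
      "lemma": "cat"
    },
    {
      "text": {"content": "here", "beginOffset": -1},
      "partOfSpeech": {"tag": "ADV", "number": "NUMBER_UNKNOWN"},
      "lemma": "here"
    }
  ]
}
""")

print(plural_nouns(sample))
```

Checking the tag as well as the number keeps plural-marked tokens of other word classes out of the result.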

See the full documentation here.

1
votes

Thanks for trying out the NL API.

Right now there isn't a clean way to detect plurals other than to note that the word as written differs from its lemma and guess whether it's plural (in English, perhaps it ends in an -s).

However, we plan to release a much better way of detecting morphological information like plurality, so stay tuned.
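
The lemma-comparison guess described above could be sketched roughly as follows. The function name `looks_plural` is hypothetical, and this is only an English-specific heuristic: it should be applied only to tokens already tagged as nouns, and it misses irregular plurals ("children") and invariant ones ("sheep").

```python
def looks_plural(word, lemma):
    """Guess whether an English noun is plural from its surface form and lemma.

    Heuristic only: a word that differs from its lemma and ends in "s"
    is treated as plural. Irregular and invariant plurals are missed.
    """
    surface = word.lower()
    base = lemma.lower()
    # Identical forms: singular, or an invariant plural we cannot detect.
    if surface == base:
        return False
    return surface.endswith("s")

print(looks_plural("cats", "cat"))
print(looks_plural("cat", "cat"))
```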