
I have an intent with specific utterances like "what were you dreaming when you slept yesterday". The entities labeled there are what, dreaming, slept, and yesterday.

However, even when I test the exact phrase, while it gives full confidence to the intent, it does not pull out all the entities, just the "what" part.

The entities are correctly labeled, so I don't understand why it can't extract them. Is there some best practice that I am missing here?

Edit:

I should add that I have several thousand utterances split between just a handful of intents, since the domain has a lot of overlap. I wonder if that could be an issue -- an utterance that is simpler (fewer entities) interfering with an utterance that is more complex (more entities) in the same intent?

Edit 2:

I've used the LUIS API to set up all of the utterances, and the website to train and test them.

I am using just a few intents with simple meanings like "fact", "opinion", and "explain".

There are only 15 entities, which are simple groups like "emotion" (hurt, love, fear), "individual" (father, mother, sister), and "food" (hamburger, sandwich).

In each of those intents, there are a couple of thousand sample utterances.
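Since the utterances were added through the LUIS API, one thing worth double-checking is the character offsets in the labels. A minimal sketch of building the JSON body for the authoring "add example" call (the entity names below are hypothetical stand-ins, and the schema/field names are my assumption of the v2 authoring format, so verify against your own payloads):

```python
def make_labeled_example(text, intent, entity_words):
    """Build the JSON body for POST /apps/{appId}/versions/{version}/example.

    entity_words maps an entity name to the exact word labeled in `text`.
    Offsets here assume LUIS treats endCharIndex as inclusive.
    """
    labels = []
    for entity_name, word in entity_words.items():
        start = text.index(word)  # character offset of the labeled word
        labels.append({
            "entityName": entity_name,
            "startCharIndex": start,
            "endCharIndex": start + len(word) - 1,
        })
    return {"text": text, "intentName": intent, "entityLabels": labels}

# Hypothetical labeling of the utterance from the question:
example = make_labeled_example(
    "what were you dreaming when you slept yesterday",
    "fact",
    {"intent": "what", "emotion": "dreaming",
     "activity": "slept", "time": "yesterday"},
)
```

If an offset is off by even one character (e.g. an exclusive end index where LUIS expects inclusive), the label can silently fail to stick, which would look exactly like entities not being extracted.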

Here is an image of some utterances in their raw form from the LUIS website: [image: utterances in raw form]

Here is an image of the same utterances using the "Entities View" from the LUIS website: [image: utterances in Entities View]

As you can see, the entities are somewhat complex in their positioning within the utterances. Also, since there are so many utterances, I don't know whether LUIS tries to pick the simplest interpretation and ignores the more complex ones (perhaps that could be why it isn't extracting all of the entities?).

Ignore that one of the entities is named "intent"; it was just a convenient name for that entity.

Here is a direct example using one of these exact utterances. It gets the intent right, but extracts only a single entity: [image: test result showing only one extracted entity]
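One way to compare the website's test pane with what callers actually see is to query the published endpoint directly. A hedged sketch of building the query URL (region, app ID, and key are placeholders; the URL shape follows the LUIS v2 runtime API, so adjust if you are on v3):

```python
from urllib.parse import urlencode

def build_query_url(region, app_id, key, utterance):
    """Build a LUIS v2 runtime query URL for a single utterance."""
    params = urlencode({
        "subscription-key": key,
        "verbose": "true",  # include all intents and entities in the response
        "q": utterance,
    })
    return (f"https://{region}.api.cognitive.microsoft.com"
            f"/luis/v2.0/apps/{app_id}?{params}")

url = build_query_url("westus", "<app-id>", "<key>",
                      "what were you dreaming when you slept yesterday")
```

The JSON response lists every extracted entity under `entities`, which makes it easy to diff against the labels you expect.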

Did you train your app after you labeled the entities in your example utterances? — Fei Han
Yes, it is trained. — g01d
@user9862995 Could you give more details on how you're creating your entities so that we can try to repro your issue and, in turn, work towards a solution? — Zeryth
@Zeryth I have added some more info to the question, but I'm not sure what you mean by how I am creating the entities. I've tried to answer as best as I could. — g01d
@user9862995 Thank you. And to clarify, I meant to ask: were you creating just simple entities? A list entity? Regex? etc. — Zeryth

2 Answers


@user9862995

I was able to extract all entities by doing the following:

  1. Created an “opinion” intent
  2. Added the 4 utterances that show up in your screenshot
  3. Labeled the entities in the utterances (all of the “simple” entity type)
  4. Trained the model
  5. Tested with the utterance “you want to ride shotgun in this old truck or not”

Results: all 4 entities in the utterance are extracted

[image: test result showing all 4 entities extracted]

So, given the information you've provided, the only difference I can see right now is the number of utterances you've added. It is difficult to identify exactly why the entities are under-detected in your case; for example, a word may have been labeled as multiple different entities, confusing LUIS, or there may be other causes.

Would you be able to share the JSON of your model?

If not, then the best answer given the information at hand is to follow the best practices outlined in the documentation -- specifically, building iteratively and testing as you go, rather than adding thousands of utterances at once. Several conflicting labels could be causing the mixed behavior you see in the output!

Also, be sure to visit the "Review endpoint utterances" section of your LUIS app and take advantage of LUIS's active learning to improve the quality of the results you receive. With active learning,

LUIS examines all the endpoint utterances and selects utterances that it is unsure of. These utterances you can label/train/publish to identify the utterances more accurately.


MS LUIS works in the following sequential order; hope this gives you some ideas.

  • First, it recognizes all applicable entities from the user input
  • Second, it filters the intents, retaining only those that have matching entities
  • Third, it further filters the intents based on other matching words (not entities) from the user input
  • Finally, it assigns a confidence level to each remaining intent and returns the output

Too many utterances may cause the problems you are facing. Use patterns in line with your intents; that will decrease the number of utterances required while improving accuracy. You need to be a bit creative here.