This chart shows the average intent confidence score of all utterances in the test session, grouped by ranges.
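As a rough illustration of how such a grouping could be computed outside the tool, the sketch below counts confidence scores per range. The function name `bucket_by_confidence` and the range width of 0.2 are assumptions made for this example, not part of the product.

```python
from collections import Counter


def bucket_by_confidence(scores, bucket_size_pct=20):
    """Count confidence scores (0.0 to 1.0) per range.

    bucket_size_pct is the width of each range in percent; it is a
    hypothetical parameter chosen for this sketch.
    """
    n_buckets = 100 // bucket_size_pct
    counts = Counter()
    for score in scores:
        # Work in whole percent to avoid floating-point edge cases at
        # range boundaries (e.g. 0.6 / 0.2 is not exactly 3.0).
        pct = round(score * 100)
        # Clamp so that a score of exactly 1.0 lands in the top range.
        index = min(pct // bucket_size_pct, n_buckets - 1)
        counts[index] += 1
    return {
        f"{i * bucket_size_pct / 100:.1f}-{(i + 1) * bucket_size_pct / 100:.1f}": counts.get(i, 0)
        for i in range(n_buckets)
    }
```

Each key is a confidence range and each value is the number of utterances whose score falls into it; a boundary value such as 0.6 is counted in the higher range.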
Alerts
The lower the confidence, the higher the failure probability. Depending on the NLU engine in question, a confidence score of 0.4 to 0.6 is the minimum the test set should show. Utterances with a low confidence score are likely either to have no recognized intent at all or to be classified with an unexpected intent. Every utterance below the confidence score threshold should be investigated.
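The following is a minimal sketch of how exported test results could be screened against a threshold. The record layout (`utterance`, `expected_intent`, `predicted_intent`, `confidence`), the function name `find_low_confidence`, and the default threshold of 0.5 are assumptions for illustration; choose the threshold that suits your NLU engine.

```python
def find_low_confidence(results, threshold=0.5):
    """Return results below the threshold, lowest confidence first.

    Each result is a dict with the hypothetical keys 'utterance',
    'expected_intent', 'predicted_intent' (None if no intent was
    recognized) and 'confidence'.
    """
    flagged = []
    for result in results:
        if result["confidence"] >= threshold:
            continue
        # Label why the utterance deserves a closer look.
        if result["predicted_intent"] is None:
            reason = "no intent recognized"
        elif result["predicted_intent"] != result["expected_intent"]:
            reason = "unexpected intent"
        else:
            reason = "expected intent, low confidence"
        flagged.append({**result, "reason": reason})
    return sorted(flagged, key=lambda r: r["confidence"])
```

Sorting by confidence puts the weakest utterances first, which is usually where investigation should start.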
Actions
- An utterance is too generic to be resolved:
  - Check if the utterance is a valid test case at all, and if it actually should resolve to a specific intent
  - Train your NLU engine with additional variations for this utterance for a specific intent, and remove variations for this utterance from other training data
- An utterance is too specific to be resolved:
  - Train your NLU engine with additional variations for this utterance for a specific intent