Mismatch Probability Risks
This section contains charts visualizing the risk that some intents will be mismatched: the NLU engine predicts the correct intent, but with a confidence score very close to that of another intent. In real life, a chatbot in this situation often responds with something like "I am not sure what you mean. Do you mean X or Y?" (in IBM Watson, this is called disambiguation).
Clicking in the radar chart shows the list of intents with the confidence scores predicted by the NLU engine. This only works if the NLU engine actually returns a list of alternate intents.
There are also charts showing the similarity of two intents, based on the alternate intent lists returned for all user examples.
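The mismatch risk described above can be illustrated with a small calculation. The following is a minimal sketch, not the tool's actual formula: it assumes the NLU engine returns a confidence score per intent, and it treats a prediction as risky when the gap between the top two scores falls below a threshold. The function names, the intent names, and the threshold of 0.1 are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def mismatch_margin(scores: Dict[str, float]) -> Tuple[str, Optional[str], float]:
    """Return (top intent, runner-up intent, confidence gap between them)."""
    # Rank intents from highest to lowest confidence.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    top_intent, top_conf = ranked[0]
    if len(ranked) < 2:
        # No alternate intents returned, so there is nothing to compare against.
        return top_intent, None, top_conf
    runner_up, runner_conf = ranked[1]
    return top_intent, runner_up, top_conf - runner_conf


def is_at_risk(scores: Dict[str, float], threshold: float = 0.1) -> bool:
    """Flag a prediction whose top two confidences are closer than `threshold`.

    The threshold is an assumed value chosen for illustration only.
    """
    _, runner_up, gap = mismatch_margin(scores)
    return runner_up is not None and gap < threshold


# Made-up scores for one user example (hypothetical intent names).
example = {"order_pizza": 0.48, "order_drink": 0.45, "greeting": 0.07}
top, second, gap = mismatch_margin(example)
print(f"{top} vs {second}: gap {gap:.2f}, at risk: {is_at_risk(example)}")
```

A small gap means the engine could easily have picked the runner-up instead, which is the situation where a chatbot would typically ask the user to disambiguate.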
Read here to know more: Intent Mismatch Probability
Confidence Deviation Risks
The confidence deviation measures the spread of the predicted confidence scores across all the user examples of an intent. It is calculated as the standard deviation of the confidence scores.
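As a rough illustration of this calculation, the sketch below groups confidence scores by intent and takes the standard deviation of each group. It assumes the population standard deviation (`statistics.pstdev`); the source does not say whether the population or sample variant is used. The function name and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import pstdev
from typing import Dict, Iterable, Tuple


def confidence_deviation(results: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Map each intent to the standard deviation of its confidence scores.

    `results` holds one (intent, confidence) pair per user example.
    Assumption: population standard deviation; an intent with a single
    example therefore gets a deviation of 0.0.
    """
    by_intent = defaultdict(list)
    for intent, confidence in results:
        by_intent[intent].append(confidence)
    return {intent: pstdev(confs) for intent, confs in by_intent.items()}


# Made-up test results (hypothetical intents and scores).
results = [("greeting", 0.9), ("greeting", 0.7), ("goodbye", 0.8)]
for intent, deviation in confidence_deviation(results).items():
    print(f"{intent}: {deviation:.3f}")
```

A high deviation suggests that the engine is confident about some user examples of the intent and unsure about others.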
Read here to know more: Intent Confidence Deviation
This histogram shows the number of utterances per predicted intent.
Read here to know more: Intent Utterance Distribution
You can download the test results as CSV, JSON, or Excel for further processing. The list contains:
predicted intent and confidence score
extracted entities and confidence score
expected intent and entities
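For further processing, a downloaded CSV could be summarized along these lines. This is only a sketch: the column names (`utterance`, `expected_intent`, `predicted_intent`, `confidence`) are assumptions, so check the header row of your actual export and adjust them. The sample rows are made up.

```python
import csv
import io
from collections import Counter

# Stand-in for a downloaded file; the column names are hypothetical.
SAMPLE_CSV = """utterance,expected_intent,predicted_intent,confidence
hello there,greeting,greeting,0.93
bye for now,goodbye,goodbye,0.88
see you,goodbye,greeting,0.51
"""


def summarize(csv_file):
    """Return (utterance count per predicted intent, list of wrong predictions)."""
    rows = list(csv.DictReader(csv_file))
    # Same counts that the utterance distribution histogram is built from.
    per_intent = Counter(row["predicted_intent"] for row in rows)
    # Rows where the prediction differs from the expected intent.
    wrong = [row for row in rows if row["predicted_intent"] != row["expected_intent"]]
    return per_intent, wrong


per_intent, wrong = summarize(io.StringIO(SAMPLE_CSV))
print(dict(per_intent))
for row in wrong:
    print(f'{row["utterance"]!r}: expected {row["expected_intent"]}, '
          f'got {row["predicted_intent"]} ({row["confidence"]})')
```

To run this against a real export, replace `io.StringIO(SAMPLE_CSV)` with an opened file, for example `open("results.csv", newline="")`, where the file name is again a placeholder.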