As a general rule of thumb, never use training data for testing: it is no challenge for an NLP engine to correctly predict the intent of a user example it already knows. The whole purpose of NLP training is to make correct predictions for user examples the engine has never seen before.
That’s why it is recommended to always strictly separate the data you use for training your NLU engine from the data you use for testing.
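A strict separation can be as simple as shuffling the collected user examples once and splitting them into two disjoint sets. The following Python sketch (the `train_test_split` helper is hypothetical and not part of Botium) illustrates the idea:

```python
import random

def train_test_split(examples, test_ratio=0.2, seed=42):
    """Shuffle user examples and split them into disjoint train/test sets."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

examples = [f"user example {i}" for i in range(10)]
train, test = train_test_split(examples)
# The two sets never share an example, so the test set only contains
# user examples the NLP engine has not been trained on.
assert not set(train) & set(test)
```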
Annotate Existing Test Cases with NLP Asserters
If you are already using Botium for conversational flow testing, you can annotate the BotiumScript test case with NLP asserters so Botium knows the expected outcome and can compare with the predictions.
Here is an example test case from Botium, involving a chatbot from the tourism domain:
In BotiumScript, this test case looks like:
T01_Travel_Berlin_Vienna

#me
I want to travel from Berlin to Vienna.

#bot
I'm happy to hear it. And where are you now?

#me
in Munich

#bot
So you are in Munich, and want to travel from Berlin to Vienna?
You can annotate the expected NLP intent by editing the bot conversation step and adding the NLP Intent Asserter and the NLP Intent Confidence Asserter.
The annotated test case then looks like this:
And again, in BotiumScript:
T01_Travel_Berlin_Vienna

#me
I want to travel from Berlin to Vienna.

#bot
I'm happy to hear it. And where are you now?
INTENT travel

#me
in Munich.

#bot
So you are in Munich, and want to travel from Berlin to Vienna?
INTENT travel
ENTITY_VALUES Berlin|Vienna|Munich
The benefit of annotating existing conversational test cases is that you can re-use existing test data. The drawback is that the analytics results will be distorted for multi-step conversations: a Botium test case exits as soon as the first asserter fails, and all following conversation steps are ignored.
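The distortion can be illustrated with a small Python sketch (the per-step pass/fail values are hypothetical): evaluating every user example independently counts all failures, while a conversational test case stops at the first failed asserter and never reaches the remaining steps:

```python
# Hypothetical per-step intent assertion results for one multi-step test case:
# True = asserter passed, False = asserter failed.
steps = [True, False, True, True]

# Per-utterance evaluation scores every step independently.
independent_failures = steps.count(False)  # one failure out of four steps

# A conversational test case exits at the first failed asserter,
# so the trailing steps never contribute to the analytics at all.
evaluated = []
for ok in steps:
    evaluated.append(ok)
    if not ok:
        break

assert independent_failures == 1
assert len(evaluated) == 2  # only two of four steps were ever evaluated
```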
Recommended: Add User Examples to Utterances Lists
Botium works best for simple question/answer conversations: a question (“user example”) is sent to the NLP engine, and the response is processed by Botium. Register a new test set in Botium and add a new utterance list for each NLP intent you want to be resolved - name it exactly like the intent. Then add user examples you want to test.
In BotiumScript this is a flat text file:
travel
I want to travel from Berlin to Vienna
go to vienna, from berlin
book a flight from berlin
book a ticket to vienna
As a final step, you have to tell Botium that this test set is only for question/answer conversations. In the Configuration menu, click Scripting and enable the Expand Utterances to Conversations as well as the Use Utterance Name as NLU Intent options.
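Once each utterance list is named after its expected intent, computing intent accuracy becomes a simple comparison. Here is a Python sketch with hypothetical prediction results; it illustrates the evaluation logic, not Botium's actual reporting code:

```python
# Expected intent for each user example is the name of its utterance list.
utterance_lists = {
    "travel": [
        "I want to travel from Berlin to Vienna",
        "go to vienna, from berlin",
        "book a flight from berlin",
        "book a ticket to vienna",
    ],
}

# Hypothetical predictions returned by the NLP engine under test.
predicted = {
    "I want to travel from Berlin to Vienna": "travel",
    "go to vienna, from berlin": "travel",
    "book a flight from berlin": "booking",  # a misprediction
    "book a ticket to vienna": "travel",
}

correct = total = 0
for intent, examples in utterance_lists.items():
    for example in examples:
        total += 1
        if predicted[example] == intent:
            correct += 1

accuracy = correct / total  # 3 of 4 examples resolved to the expected intent
assert accuracy == 0.75
```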
Pro Tip: Use Paraphraser to Quickly Generate New User Examples
Botium includes a paraphraser to quickly generate new user examples based on given ones. After adding a handful of user examples to the utterance list, click on the Paraphrase it! button to get a couple of suggestions for additional user examples and select the ones you want to use.
The result is a large number of similar user examples to use for testing.
Keep Test Dataset in Git Repository
Instead of adding your test datasets to the internal Botium repository, you can (and should!) use a Git repository for your test data and establish a process for continuous improvements - see Best Practice: Test Case Development.
Download Training Data into Botium (optional)
Besides adding the test dataset to Botium, you should also add the training dataset - if not already done with one of the previous steps. This way you can use the tools included in Botium for data augmentation, such as the paraphraser, the translator and the humanificator.
Use the Test Case Wizard to download the training data from your NLP engine into Botium.
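To illustrate the kind of data augmentation such tools perform, here is a toy Python sketch that generates paraphrase-like variants via synonym substitution. It is an illustration of the concept only, not Botium's paraphraser, and the synonym table is made up:

```python
# Toy synonym table for demonstration purposes.
SYNONYMS = {
    "travel": ["go", "fly"],
    "book": ["reserve", "order"],
}

def augment(example):
    """Yield variants of a user example with one word swapped for a synonym."""
    words = example.split()
    for i, word in enumerate(words):
        for alt in SYNONYMS.get(word, []):
            yield " ".join(words[:i] + [alt] + words[i + 1:])

variants = list(augment("book a ticket to vienna"))
# Each variant differs from the original in exactly one word.
assert variants == ["reserve a ticket to vienna", "order a ticket to vienna"]
```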