This is the step that takes the most effort: the machine learning algorithms behind NLP (and most state-of-the-art NLP engines are based on some kind of machine learning) are only as good as the data they have been trained on. It is a question of both quality and quantity.
Botium has tools to help gather and augment datasets for training and testing.
Note: Although, from a technical perspective, it doesn’t make a lot of sense to use training data for testing, this is usually the first step in Botium Coach:
- It can be done with a few clicks in Botium Box
- It gives you first insights into how the NLP engine is performing on the data it has been trained on
- It reveals any flaws within the training data itself
**Don’t underestimate the importance of clean training data for the real-life performance of your NLP engine!**
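Some flaws in training data can be spotted before any NLP engine is involved. The following is a minimal sketch, not part of Botium, that assumes the training data is available as a mapping from intent name to example utterances. The sample data, the function names and the `min_examples` threshold are hypothetical and only illustrate two common checks: the same phrase listed under more than one intent (a quality problem) and intents with very few examples (a quantity problem).

```python
from collections import defaultdict

# Hypothetical training data: intent name -> example utterances.
training_data = {
    "greeting": ["hi", "hello", "good morning"],
    "goodbye": ["bye", "see you", "Hello"],
    "order_status": ["where is my order"],
}


def normalize(utterance):
    """Lowercase and collapse whitespace so near-identical phrases compare equal."""
    return " ".join(utterance.lower().split())


def find_conflicts(data):
    """Return {utterance: sorted intent names} for phrases listed under 2+ intents."""
    seen = defaultdict(set)
    for intent, utterances in data.items():
        for utterance in utterances:
            seen[normalize(utterance)].add(intent)
    return {u: sorted(intents) for u, intents in seen.items() if len(intents) > 1}


def find_sparse_intents(data, min_examples=3):
    """Return intents with fewer than min_examples distinct utterances."""
    return sorted(
        intent
        for intent, utterances in data.items()
        if len({normalize(u) for u in utterances}) < min_examples
    )


conflicts = find_conflicts(training_data)
sparse = find_sparse_intents(training_data)
print("Conflicting utterances:", conflicts)
print("Sparse intents:", sparse)
```

In this sample, "hello" is reported because it appears under both `greeting` and `goodbye`, and `order_status` is reported because it has a single example. A suitable threshold depends on the NLP engine and the use case.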
- Option 1: Use Training Data for Testing
  Instead of annotating the test cases manually, use the Test Data Wizard included in Botium Box to download the conversation model of an NLP provider and convert it to BotiumScript test cases. Botium Coach can use them instantly.
- Option 2: Use Included Botium Datasets
  Botium Box comes with batteries included: it ships with datasets you can use out of the box for testing and training your NLP engine.
- Option 3 - The Ideal Scenario: Bring your own data
- Advanced Challenges
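When you bring your own data, it has to end up as BotiumScript test cases, just like the output of the Test Data Wizard. The sketch below shows roughly what such a conversion could look like for a single intent. It is an illustration under assumptions, not Botium's own implementation: the helper names and the `UTT_` naming scheme are hypothetical, and the file layout (an utterance list whose first line is its name, and a convo with `#me` / `#bot` sections and an `INTENT` assertion) reflects a common BotiumScript convention that should be checked against the BotiumScript documentation for your version.

```python
def utterance_list_name(intent):
    """Hypothetical naming scheme: UTT_ prefix plus the upper-cased intent name."""
    return "UTT_" + intent.upper()


def to_utterances_text(intent, utterances):
    """Build an utterance list: the list name first, then one user example per line."""
    return "\n".join([utterance_list_name(intent), *utterances]) + "\n"


def to_convo_text(intent):
    """Build a convo that sends each listed utterance and expects the given intent."""
    lines = [
        f"{intent} intent test",       # test case name
        "",
        "#me",
        utterance_list_name(intent),   # reference to the utterance list
        "",
        "#bot",
        f"INTENT {intent}",            # assumed NLP intent assertion
    ]
    return "\n".join(lines) + "\n"


# Hypothetical sample data for one intent.
print(to_utterances_text("greeting", ["hi", "hello", "good morning"]))
print(to_convo_text("greeting"))
```

Keeping the user examples in a separate utterance list means one convo covers every phrasing of an intent, so adding test data only requires appending lines to the list.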