In a recent coaching call, a client of mine had some difficulty capturing names using the @sys.given-name entity. In fact, the name he tried, and which failed, was mine 🙂
He replaced the name with @sys.any and was able to capture the name. So he got curious: why not just use @sys.any everywhere? It is not a good idea, and this article will describe why.
Capturing non-English names
First of all, I believe DialogFlow will soon become smart enough to capture a whole lot more names simply because they now have access to Google's data and expertise in this area.
But at the moment, it is very challenging.
However, the @sys.any wildcard entity comes with its own set of challenges, and you should understand them before using it willy-nilly.
Putting only the @sys.any in your userSays
So when DialogFlow prompts for a name, most people are not going to say "My name is Aravind". They will simply say, "Aravind". You might think it is a good idea to just use @sys.any alone in the userSays. You will run into two possible issues:
Sometimes users will respond with a question or a refusal: "Do you mean my first name?" or "I don't wish to give you that information" or something like that. Since @sys.any matches any text, you have no way to tell these replies apart from a name, and you might instead reply with "Hi $name. Nice to meet you". As you can imagine, starting off a conversation with "Hi I don't wish to give you that information. Nice to meet you" won't make your chatbot look particularly smart.
There is a second challenge. Suppose the intent which captures the name has no input context. Then every other intent which has no input context (i.e. probably most of your intents) is also a candidate for selection. When there are multiple candidates, DialogFlow has to work extra hard to figure out whether the @sys.any intent is the actual winner. And from experience, I can tell you that sometimes it fails.
Parsing the @sys.any on your webhook
So you might choose to take the full string captured by @sys.any and check, in your webhook code, whether it matches any names in your database. When you do this, you will run into a new set of challenges.
Incomplete name database
If you do this, the name might simply not be in your database (unless your database is really exhaustive).
Typos in names
You might have the name in your database, but the user might have made a typo. This means the name is not going to match.
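To make the last two problems concrete, here is a minimal sketch of the kind of exact lookup a webhook might do on the @sys.any string. The KNOWN_NAMES set and the lookup_name helper are hypothetical names I made up for illustration; a real database lookup would look different.

```python
# Hypothetical stand-in for the name database; a real one would be far larger.
KNOWN_NAMES = {"aravind", "maria", "john"}


def lookup_name(raw_text, known_names=KNOWN_NAMES):
    """Return the cleaned-up name if it is in the database, else None."""
    candidate = raw_text.strip().lower()
    if candidate in known_names:
        return candidate.title()
    return None


print(lookup_name(" Aravind "))  # found: exact match after trimming and lowercasing
print(lookup_name("Priya"))      # not found: the database is incomplete
print(lookup_name("Aravnid"))    # not found: a typo defeats exact matching
```

Both failures return None, so the webhook cannot tell "a real name I don't know" from "a misspelled name I do know".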
Nearby and almost matches
You can, finally, try to handle typos by using an algorithm which can do "nearby" and "almost" matches, such as one based on the Levenshtein distance. First, these algorithms are not that straightforward to implement (especially if you customize the scoring rules to handle special cases).
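To give a sense of what this involves, here is a rough sketch of the plain, uncustomized Levenshtein distance together with a closest-match helper. The closest_name function and its max_distance threshold are my own illustrative assumptions, not something DialogFlow provides.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions
    needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def closest_name(raw_text, known_names, max_distance=2):
    """Return the known name nearest to raw_text, or None if nothing is
    within max_distance edits (the threshold is an arbitrary assumption)."""
    candidate = raw_text.strip().lower()
    best_name, best_distance = None, max_distance + 1
    for name in known_names:
        distance = levenshtein(candidate, name.lower())
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name
```

With the names "Aravind" and "Maria" in the list, the typo "Aravnid" is two edits away from "Aravind" and gets matched. But "Mario" is only one edit away from "Maria", so a perfectly valid name is silently turned into a different one. Fixing that is where the custom scoring rules start to creep in.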
But more importantly, by the time you reach this point, you are almost replicating the functionality that DialogFlow is doing for you. So here is a rule of thumb:
Don't build a mini-DialogFlow in your code in the name of improving accuracy
So what is the alternative?
Here is my suggested approach:
- For the intent which prompts for the user name, be sure to add an output context (this makes sure there is a context for the intent which gets the name, and reduces the selection candidate list)
- Create a name entity which can capture as many names as possible. Use the approach shown in the video below to extend the @sys.given-name entity. Let us call the entity @namelist.
- Create one intent which uses the @namelist entity
- Create a fallback intent with the same context, and simply respond with "I couldn't get that name. Can I just call you Guest?" or some variant.
- Important: Later, analyze your Training/History tab and look for this particular fallback intent. Now copy the user's name and add it to @namelist.
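The last step above is manual, but if you collect the missed names in a list, merging them into the entity can be sketched roughly as follows. I am assuming entity entries shaped as a "value" plus a list of "synonyms", which is how I understand DialogFlow's entity export to look; treat that shape, and the add_names helper itself, as assumptions to verify against your own agent export.

```python
import json


def add_names(entries, new_names):
    """Return a new list of entity entries with new_names merged in.

    Entries are assumed to look like {"value": ..., "synonyms": [...]}.
    Names already present (ignoring case) and blank strings are skipped.
    """
    seen = {entry["value"].lower() for entry in entries}
    merged = list(entries)
    for raw in new_names:
        name = raw.strip()
        if name and name.lower() not in seen:
            merged.append({"value": name, "synonyms": [name]})
            seen.add(name.lower())
    return merged


# Existing @namelist entries, plus names copied from the fallback intent's history.
namelist = [{"value": "Aravind", "synonyms": ["Aravind"]}]
missed = ["Priya", "aravind", " Priya ", ""]
print(json.dumps(add_names(namelist, missed), indent=2))
```

Only "Priya" is added here; the duplicate, the differently cased "aravind" and the blank string are ignored.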
Obviously, if you are capturing the name so you can do a lookup, this isn't going to work very well. Also, there are probably going to be some scenarios where calling the user by their name is extremely important.
For scenarios where you wish to use it just to address the user, this approach is probably the most reasonable tradeoff between accuracy, development time and smooth conversation flow.