You probably know that Dialogflow doesn't have built-in support for entities that can be expressed as regular expressions. They might add support for regex-based entity extraction at some point in the future, but until then, you can use the following idea.
Before we go into the idea: the suggested solution is to simply capture the value with the wildcard entity (@sys.any) and check it in your webhook after the intent fires. Needless to say, this isn't ideal, for several reasons. For example, let us consider what the article itself suggests:
What do these recommendations imply?
1 A higher number of examples is helpful when you use the wildcard entity in an intent. This is because the wildcard entity, by its very nature, aggressively captures any and all input.
2 Annotating the entire training phrase as the entity is not a good idea, since Dialogflow doesn't have anything else to go on. This means that if the only thing the user types is the entity value, matching can be very challenging.
3 Prompting the user for confirmation is obviously a strategy for playing it safe, but you do end up potentially annoying the user.
4 Restricting the intent with an explicit input context helps guard the intent, but what if the expected behavior is that the user types the entity value in their first message to the bot?
You need to create a custom integration
This answer only works if you use a custom integration. Perhaps that's not what you wanted or expected, but it is a prerequisite. Besides, by using a custom integration, you also get a host of other benefits.
If you can create a custom integration, this is how you would do it.
1 Create a composite entity corresponding to the regex entity you want
There are different ways to do this, but you can basically take the example values and build a composite entity around them. Here is the key: make sure you use a hyphen as the separator between the different pieces of your regex.
An example: I recently found out about the Swedish national ID, which has a specific regex format.
For example, here are a couple of numbers given in the example:
Here is the key - break the entity down into as many logical pieces as possible.
For example, the format above breaks down into YY - MM - DD - dddd,
where the last 4 digits could be any combination. (Of course, for a given person they are constrained by the checksum value, but in theory those 4 digits could take any combination.)
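If you want your middleware to reject numbers that cannot be valid, the checksum can be tested before or after the Dialogflow call. The sketch below assumes the check digit follows the Luhn algorithm applied to all ten digits (YYMMDDdddd); the function name `personnummer_checksum_ok` is made up for illustration.

```python
import re


def personnummer_checksum_ok(digits: str) -> bool:
    """Return True if a 10-digit YYMMDDdddd string passes the Luhn check."""
    if not re.fullmatch(r"[0-9]{10}", digits):
        return False
    total = 0
    for index, char in enumerate(digits):
        # Digits in even positions (0, 2, 4, ...) are doubled, the rest are kept.
        value = int(char) * (2 if index % 2 == 0 else 1)
        # A doubled digit above 9 contributes the sum of its two digits.
        total += value - 9 if value > 9 else value
    return total % 10 == 0


print(personnummer_checksum_ok("8112289874"))  # True
print(personnummer_checksum_ok("8112289875"))  # False
```

This is independent of the entity definition: the composite entity only recognizes the shape of the number, while a check like this one decides whether the value is plausible.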
Here is how I define the entities:
First, the year:
Next, the month:
Then, the day of month:
Finally, the four random digits at the end:
Now we can define the personnummer composite entity:
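If you would rather generate the entries for the component entities than type them by hand, a small script can do it. This is a minimal sketch under stated assumptions: each piece is enumerated as zero-padded strings, and the entity names (`pn_year`, `pn_month`, `pn_day`, `pn_serial`) are hypothetical, not the names used in my agent. The composite value uses Dialogflow's `@entity:alias` notation with the hyphen separator described above.

```python
from typing import List


def padded_values(start: int, end: int, width: int) -> List[str]:
    """Zero-padded strings for every integer from start to end inclusive."""
    return [str(n).zfill(width) for n in range(start, end + 1)]


# Candidate entries for each component entity (hypothetical entity names).
entity_entries = {
    "pn_year": padded_values(0, 99, 2),
    "pn_month": padded_values(1, 12, 2),
    "pn_day": padded_values(1, 31, 2),
    "pn_serial": padded_values(0, 9999, 4),
}

# Value of the composite entity: the four pieces joined by " - ".
composite_value = " - ".join(f"@{name}:{name}" for name in entity_entries)

print(composite_value)
print({name: len(values) for name, values in entity_entries.items()})
```

You would still need to load these entries into the agent yourself, for example through the console.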
2 Create an intent with potential user phrases and test if the composite entity works
This is an important step. You want to make sure your composite entity, as defined, is annotated correctly by Dialogflow.
For example, here is my intent
3 Preprocess the input
Now you might be saying: "But Aravind, the whole problem is that no user is going to type the value in the format I have defined in Dialogflow".
This is where the fun begins. 🙂
You should preprocess the input to change it into the format you have defined in your Dialogflow composite entity.
For example, say the user types in
My personnummer is 8112289874
You should change it to
My personnummer is 81 - 12 - 28 - 9874
before sending the phrase to Dialogflow when you call the detectIntent API method.
Most programming languages let you do this with some kind of regex-replace function.
For example, here is a regex-based replacement function for the personnummer in PHP:
And this is what the original and modified text strings will look like with the preg_replace method above (screenshot from my console output):
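The same replacement can be sketched in Python with `re.sub`. This is an illustration rather than a port of the PHP code: it assumes the number arrives as ten consecutive digits, optionally written as YYMMDD-dddd, and that it is not part of a longer run of digits. The names `PERSONNUMMER` and `preprocess` are made up for this example.

```python
import re

# Assumption: ten digits, optionally with a hyphen before the last four,
# and not embedded in a longer digit sequence.
PERSONNUMMER = re.compile(
    r"(?<![0-9])([0-9]{2})([0-9]{2})([0-9]{2})-?([0-9]{4})(?![0-9])"
)


def preprocess(text: str) -> str:
    """Rewrite any personnummer in the text as 'YY - MM - DD - dddd'."""
    return PERSONNUMMER.sub(r"\1 - \2 - \3 - \4", text)


print(preprocess("My personnummer is 8112289874"))
# My personnummer is 81 - 12 - 28 - 9874
```

Text that contains no matching number passes through unchanged, so the function can be applied to every user message before it is sent on.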
4 Use the preprocessed input in the detectIntent API call
Now you will send the modified text to Dialogflow. That is, you will call Dialogflow's detectIntent API method and the input query will be the modified text.
When you do that, you will not only be able to get the right intent matched, but you will also enable very good entity extraction.
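Put together, the middleware step might look like the sketch below. It assumes a Dialogflow ES agent and the `google-cloud-dialogflow` Python client, and it follows the pattern of the client library's quickstart sample; the project ID, session ID, and the helper names `preprocess` and `handle_user_message` are placeholders for illustration. The client library is imported inside the function so the preprocessing part can be used and tested without it.

```python
import re

# Same assumption as before: ten digits, optional hyphen before the last four.
PERSONNUMMER = re.compile(
    r"(?<![0-9])([0-9]{2})([0-9]{2})([0-9]{2})-?([0-9]{4})(?![0-9])"
)


def preprocess(text):
    """Rewrite any personnummer in the text as 'YY - MM - DD - dddd'."""
    return PERSONNUMMER.sub(r"\1 - \2 - \3 - \4", text)


def detect_intent_text(project_id, session_id, text, language_code="en"):
    """Send text to Dialogflow's detectIntent and return the query result."""
    # Requires: pip install google-cloud-dialogflow, plus credentials.
    from google.cloud import dialogflow

    session_client = dialogflow.SessionsClient()
    session = session_client.session_path(project_id, session_id)
    text_input = dialogflow.TextInput(text=text, language_code=language_code)
    query_input = dialogflow.QueryInput(text=text_input)
    response = session_client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    return response.query_result


def handle_user_message(user_text, detect=detect_intent_text,
                        project_id="my-project-id", session_id="session-1"):
    """Preprocess the raw user text, then pass the modified text on."""
    return detect(project_id, session_id, preprocess(user_text))
```

Passing the detect function as a parameter is a design choice for testability: you can substitute a stub in place of the real API call and check that the modified text, not the raw text, is what gets sent.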
I modified my quickstart code and here is a console screenshot of the output I got from using this technique:
Note: the fact that you are seeing the fulfillment text output the number shows that Dialogflow was able to extract the correct entity.
At the same time, the exact format of the output (and why it doesn't precisely match the input) depends on how Dialogflow chooses to print an entity value when using the format $entity. For reference, here is what I have in the Text response section for this intent.
In a future article, I will explain how this technique is better than the recommended solution mentioned at the top of this article.
I am planning to create a course on creating middleware which can improve your Dialogflow agent's abilities. Here is a quick list of things you can do when you build your own custom middleware.
In the meantime, if you need help with this or similar questions, you can get in touch with me here.