The Must Know Dialogflow concepts
There are 4 concepts you MUST know when learning Dialogflow, plus an optional 5th one which is very important but in some cases may not be mandatory.
The most important concept in Dialogflow is the concept of an intent.
What is an intent?
You create a Dialogflow agent by creating a list of “things that the user would like to do”.
Here is what the list of things a user would like to do looks like for the Prebuilt SmallTalk Agent
Now let’s take a closer look at a single intent – the user wants the agent to answer a question. Here is how you declare the intent:
While the other sections are important, for now just focus on the section called “Training Phrases” and then the section called “Text Response”.
Here is the basic idea:
If the user says a phrase which is in the list of training phrases the agent will respond with one of the phrases in the text response section.
What is so special about this?
The special thing is – even if the user says a phrase that is only a variant of one of the training phrases, Dialogflow can usually provide a response from the list of text responses.
Here is an example:
The phrase “I want an answer to my question” does not exactly match any of the training phrases. But Dialogflow understands that it is close enough. And accordingly, it will provide a response from the list of text responses.
So an intent is a way of specifying what the user wants to do (“intention”) and then providing an appropriate response.
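To make the idea concrete, here is a toy sketch in Python. It is not how Dialogflow matches intents internally (Dialogflow uses machine learning); it only mimics the behaviour described above using simple word overlap. The intent name, the training phrases, the responses and the 0.3 threshold are all made up for illustration.

```python
import random
import re

# Hypothetical intent, loosely modelled on the "Training Phrases" and
# "Text Response" sections of an intent. All values are made up.
INTENT = {
    "name": "user.wants_answer",
    "training_phrases": [
        "can you answer my question",
        "I have a question",
        "answer my question please",
    ],
    "text_responses": [
        "Sure, go ahead and ask.",
        "Of course. What would you like to know?",
    ],
}


def tokens(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def similarity(a, b):
    """Jaccard word overlap: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def respond(user_text, intent=INTENT, threshold=0.3):
    """Return one text response if the input is close enough to any
    training phrase, otherwise None (no match)."""
    best = max(similarity(user_text, p) for p in intent["training_phrases"])
    if best >= threshold:
        return random.choice(intent["text_responses"])
    return None


# An exact training phrase and a close variant both get a response.
print(respond("I have a question"))
print(respond("I want an answer to my question"))
# An unrelated phrase does not match this intent.
print(respond("what is the weather today"))
```

The point of the sketch is only the shape of the mapping: many possible user phrases on one side, a small set of responses on the other.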
We use Dialogflow entities to capture values.
Here is the official definition:
That might sound a bit too vague, but there is a reason for that. Entities can actually be used to capture a lot of things, as you will see in this article.
The story of the 4416-intent agent
As you might know, Dialogflow has a maximum of 2000 intents per agent.
So here is an email exchange I had with a reader recently (paraphrased).
Reader: How many intents can I have in an agent?
Reader: What if I need more?
Me: Why do you need more?
Reader: So I had this agent already built out, which will provide a response based on the user’s city for a handful of cities. I want it to give a suitable answer for every city on the planet.
As it happens, there are over 4000 cities on the planet, and the reader wanted to create an intent for each one.
While I didn’t find out what actually happened, here is what I believe happened:
This reader had likely created a chatbot using a non-AI platform such as Chatfuel. Since Chatfuel doesn’t have the concept of entities, you need to define a very large number of intents to get such a chatbot to work.
A better way
Here is a better way to do it. Create a Dialogflow entity called CityList.
Now in your intent, when you use a city name from your list in the training phrase, it gets automatically annotated (that is, Dialogflow highlights the entity’s name and tells you that it could infer that it was a city belonging to the CityList you declared).
So what does this mean?
It means that you can capture all the city names you are interested in inside a single intent, and simply upload a list of cities into the entity called CityList.
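Here is a rough Python sketch of what a developer entity like CityList does for you. It is a simplification, not Dialogflow's actual extraction logic, and the city entries and synonyms are hypothetical. The idea is that one lookup table replaces thousands of per-city intents.

```python
import re

# Hypothetical contents of the CityList entity: each reference value maps
# to the synonyms a user might type. The entries are made up.
CITY_LIST = {
    "New York": ["new york", "nyc"],
    "Mumbai": ["mumbai", "bombay"],
    "Paris": ["paris"],
}


def extract_city(user_text, entity=CITY_LIST):
    """Return the reference value of the first city found, or None."""
    text = user_text.lower()
    for value, synonyms in entity.items():
        for synonym in synonyms:
            # \b keeps "paris" from matching inside a longer word.
            if re.search(r"\b" + re.escape(synonym) + r"\b", text):
                return value
    return None


# One intent, one entity, any city in the list.
print(extract_city("What is the weather in Bombay?"))  # Mumbai
print(extract_city("I live in NYC"))                   # New York
print(extract_city("I live in Atlantis"))              # None
```

Adding another city means adding one more entry to the entity, while the intent itself stays unchanged.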
An even better way
You might be thinking: the list of cities in the world is very well known, so why doesn’t Dialogflow do all this heavy lifting for us automatically? Why do I have to create a list of cities? Why can’t it just infer these values itself?
As it turns out, that’s exactly what Dialogflow does!
In that case, the entities are called system entities because the system (i.e. Dialogflow) already knows about them.
For example, let us see what happens when I type an entity not declared in CityList.
So when you type a value which isn’t in the CityList entity you can see that Dialogflow is still able to annotate it. But this time, the entity is @sys.geo-city and not @CityList as in the previous image.
The @sys prefix is an indication that it is a system entity.
As it turns out, Dialogflow can automatically infer most of the world’s cities*.
So the reader could probably have used a chatbot with just a single intent.
Types of entities
There are three types of entities in Dialogflow.
These are predefined entities – i.e. these are values Dialogflow can extract out of the box, with no assistance from the bot creator. They have the prefix @sys. Here is a list of system entities you can identify in Dialogflow:
- dates and times
- numbers (including flight numbers)
- amounts with units – currency, length, area etc.
- unit names
- geography – address, zip code, capital, country, city, state, airport etc
- contact – email and phone number
- music artist and music genre
The entities you declare – such as CityList – are called developer entities.
There is also the concept of “short-lived” entities called user entities. A good example of this is a user’s custom playlist of songs. (Since the playlist is specific to each end user, there isn’t a way to declare it as a developer entity.) This is a somewhat advanced concept so I won’t be going into it in too much detail here.
Hopefully, you can now see why the documentation defines entities as “powerful tools for extracting parameter values from natural language inputs”.
* This shouldn’t be generalized to all places, such as smaller towns. Dialogflow usually cannot infer those automatically from user input, especially for towns and smaller locations outside the US.
Here is the official definition:
While this is a good definition, it is possible to define contexts much more simply.
The Dory Bot
To understand contexts, you first need to understand what a Dory bot is.
An article on VentureBeat describes it as follows:
Contexts add memory to your chatbot conversation session
In the article snippet above, there is a reason Alexa can’t remember where “there” is. (Note: I don’t know if Alexa still has this issue, but you get the idea)
Chatbots are not human-like in their ability to process what went before in the conversation. As a result, Alexa has no context for the word “there”.
But there is a way to simulate this memory of previous messages – and that is to use contexts in Dialogflow.
While I will not be going into any detail about how this can be achieved, the important takeaway is that it can be done using Dialogflow contexts.
Note: At the same time it is also quite a complex feature to understand and implement. This means there are some limits on how well you can do this “remembering”.
In Dialogflow, contexts are used to add memory to a conversation session and avoid creating Dory bots.
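The following toy simulation in Python shows the general idea of a context as short-lived memory. It is only a sketch: the Session class, the context name "place-mentioned" and the example phrases are hypothetical, and the way the lifespan is counted down here is a simplification of how Dialogflow expires contexts.

```python
class Session:
    """Toy conversation session that keeps contexts between messages."""

    def __init__(self):
        # context name -> {"lifespan": remaining turns, "parameters": {...}}
        self.contexts = {}

    def set_context(self, name, lifespan, parameters):
        self.contexts[name] = {"lifespan": lifespan, "parameters": parameters}

    def end_turn(self):
        """Age every context by one turn and drop the expired ones."""
        for name in list(self.contexts):
            self.contexts[name]["lifespan"] -= 1
            if self.contexts[name]["lifespan"] <= 0:
                del self.contexts[name]


def handle(session, user_text):
    """Reply to one message, using contexts to resolve the word 'there'."""
    text = user_text.lower().rstrip("?")
    if text.startswith("how far is "):
        place = text[len("how far is "):]
        # Remember the place so a later "there" can be resolved.
        session.set_context("place-mentioned", lifespan=2,
                            parameters={"place": place})
        reply = f"Looking up the distance to {place}."
    elif "there" in text:
        context = session.contexts.get("place-mentioned")
        if context:
            place = context["parameters"]["place"]
            reply = f"Looking up the weather in {place}."
        else:
            reply = "Sorry, where is 'there'?"
    else:
        reply = "Sorry, I did not understand that."
    session.end_turn()
    return reply


session = Session()
print(handle(session, "How far is Paris?"))
print(handle(session, "What is the weather there?"))  # resolved via context

# Without the earlier message there is no context: a Dory bot.
print(handle(Session(), "What is the weather there?"))
```

The second message only makes sense because the first one left a context behind. Once the context expires, the bot is back to not knowing where “there” is.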
Now we will take a look at the 4th must know concept – webhooks.
Here is the official definition:
Once again, there is a simpler way to understand what a webhook is.
Asking your chatbot to add two numbers
Let us define an intent for adding two numbers.
Now, in the response, we can get the two numbers that the user input, and define a response like this:
As you can see, we need to “fill in the blank” at the end of the response.
This is called fulfillment, because we are fulfilling the user’s request. To do this fulfillment, we use something called a webhook.
What is a webhook?
The webhook is some code (usually running on a server outside of Dialogflow) which performs your chatbot’s business logic.
In this case, since you want to add two numbers, the webhook will have some code which adds the two numbers. While this might seem like a lot of overhead to have a webhook just to do some simple math computation, that is how Dialogflow works.
The Dialogflow philosophy (as of this writing) is to offload all the business logic to your webhook.
Calling the webhook
First, make sure that you have switched on the “Enable webhook call for this intent” option in the Fulfillment section at the bottom of the intent. (Note: you must do this for every intent which uses a webhook).
Webhook Data Flow
This is how data flows when you call a webhook.
1. The user types a message to your agent.
2. Your agent maps it to an intent.
3. The result of this mapping produces some JSON. To see this JSON, you can type the message into the Dialogflow test console without toggling the “Enable webhook call for this intent” option and click on the Show JSON button at the bottom.
Clicking on the Show JSON button will let you see the JSON which would be sent to the webhook if the “Enable webhook call for this intent” option were toggled on.
4. The webhook receives this JSON as a POST request. The webhook code should parse this JSON data and extract relevant information.
In the JSON above, it will look at the “parameters” field and extract the two numbers to be added.
5. The webhook should return the result of its computation back to the intent as the response.
Note: This response should also be in a specific JSON format, and it should be sent back within 5 seconds. Don’t put long-running computations into your webhook code.
6. The intent will extract the relevant fields from the response JSON and display the result to the user.
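Steps 4 and 5 can be sketched in Python roughly as follows. This is a minimal sketch under some assumptions: the field names queryResult, parameters and fulfillmentText assume the Dialogflow V2 webhook format (agents on a different API version may use different names), the parameter names number and number1 are the ones used for the adding intent in this article, and handle_webhook is a hypothetical helper that you would call from whatever web framework receives the POST request.

```python
def handle_webhook(request_json):
    """Build the webhook response for the add-two-numbers intent.

    request_json is the already-parsed body of the POST request.
    """
    # Assumed V2 layout: parameters live under queryResult.parameters.
    parameters = request_json.get("queryResult", {}).get("parameters", {})
    try:
        first = float(parameters["number"])
        second = float(parameters["number1"])
    except (KeyError, TypeError, ValueError):
        # A parameter is missing or is not a number.
        return {"fulfillmentText": "Sorry, I need two numbers to add."}
    total = first + second
    # Assumed V2 layout: the reply text goes into fulfillmentText.
    return {"fulfillmentText": f"{first:g} + {second:g} = {total:g}"}


# A trimmed-down, made-up request containing only the fields used above.
sample_request = {
    "queryResult": {
        "queryText": "add 2 and 3",
        "parameters": {"number": 2, "number1": 3},
    }
}
print(handle_webhook(sample_request))  # {'fulfillmentText': '2 + 3 = 5'}
```

The function does no network calls or slow work, which is what you want given the 5-second limit on the response.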
Dialogflow webhooks are used to perform any and all business logic in your chatbot.
Actions are not as important as the previous four concepts, but it is quite likely you will run into them at some point as you are building a Dialogflow agent.
In this part of the “Must Know Concepts of Dialogflow” series, we will take a look at Dialogflow actions.
Here is the official definition:
This time, I do like the definition and it is quite clear and concise.
Here is my definition, from the perspective of webhooks:
Actions tell your webhook what business logic to execute
Let us consider the agent which added two numbers (from the webhooks example).
Suppose we want this agent to be a calculator which can perform the following four operations:
Now, to do these different computations, we will use different subroutines in our code. We will have four subroutines, one corresponding to each type of computation.
Add an action to the intent
In the intent we had defined previously for adding two numbers, we will add a suitable value in the Action field.
Inspect the JSON sent to your webhook
In the JSON which is sent to the webhook, the action field is set based on which intent was triggered.
For example, these are some sample JSON files sent over to your webhook for the different computations.
Parse the JSON
In your webhook you will look at the “action” field. Then you will invoke the corresponding subroutine. For example, if the action is “numbers.divide” you will invoke the code which computes the division.
You will need to pass two numbers to your subroutine for each of these computations, and you can get these by looking at the “parameters” field and parsing the values for “number” and “number1” respectively.
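Here is one way the dispatch could look in Python. Treat it as a sketch under assumptions: only the action name “numbers.divide” comes from this article, so the other three action names are hypothetical and simply follow the same pattern, the four operations are assumed to be add, subtract, multiply and divide, and the queryResult and fulfillmentText field names assume the Dialogflow V2 webhook format.

```python
import operator

# One subroutine per action. Only "numbers.divide" appears in the article;
# the other three names are assumed to follow the same pattern.
SUBROUTINES = {
    "numbers.add": operator.add,
    "numbers.subtract": operator.sub,
    "numbers.multiply": operator.mul,
    "numbers.divide": operator.truediv,
}


def dispatch(request_json):
    """Pick the subroutine named by the action field and run it."""
    query_result = request_json.get("queryResult", {})
    action = query_result.get("action")
    parameters = query_result.get("parameters", {})

    subroutine = SUBROUTINES.get(action)
    if subroutine is None:
        return {"fulfillmentText": "Sorry, I cannot do that calculation."}
    try:
        result = subroutine(float(parameters["number"]),
                            float(parameters["number1"]))
    except (KeyError, TypeError, ValueError):
        return {"fulfillmentText": "Sorry, I need two numbers."}
    except ZeroDivisionError:
        return {"fulfillmentText": "Sorry, I cannot divide by zero."}
    return {"fulfillmentText": f"The answer is {result:g}"}


# A made-up request for the division intent.
request = {
    "queryResult": {
        "action": "numbers.divide",
        "parameters": {"number": 10, "number1": 4},
    }
}
print(dispatch(request))  # {'fulfillmentText': 'The answer is 2.5'}
```

Using a lookup table keyed by the action keeps the webhook to a single entry point: supporting a new computation means adding one intent with a new action value and one more entry in the table.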
The action field in an intent is used to choose the action to perform. With respect to webhooks, the action field is used to tell the webhook exactly what business logic should be performed.