After our talk about challenges in building NLP-based chatbots at the Bilbostack conference (recap and slides here), a few people asked us for more technical details about how the bots4health bot works behind the scenes. We had been wanting to write about this (mostly as part of our expectations management, or chatbot detox, strategy) and this was the final trigger.

What does Eva do?

Eva helps users keep track of their progress in their new year goals. Each of the three available options has different sequences and dynamics, with “Eat Better” being the one we have dedicated the most love to.

Users who indicate they want to improve how they eat receive weekly reminders (“How are you doing in reference to your goal?”), but they also receive meal recommendations every few days, and a monthly message with seasonal fruits and vegetables and a recipe.

Besides this, Eva can hold some sort of conversation with people who are curious about automated conversations, and with trolls. Eva is far from being as good at small talk as chatbots like Mitsuku, but she’s doing her best. Eva has been programmed to respond to things like “give me recipe ideas”, “I love tacos”, or “what is the meaning of life?”, and her small talk skills improve every week.

Food for thought: Each one of these answers is written by a human based on the expressions sent by users. More on the training process below!

Eva is a great example of changing our mind, pivoting, and iterating continuously. In the past two years, Eva has been a pill reminder bot, a women’s health bot, and a health bots newsletter. When we started we knew what we thought users wanted. Then we faced real users and our minds were changed over and over again.

I believe all these changes have made Eva a better bot. Each experiment had different successes, but each was interesting in some way. Eva is always changing as we try new interactions and we learn lessons we can safely apply in client projects.

What is Eva made of?

Eva lives in Facebook Messenger, is built with Chatfuel, and uses the Dialogflow NLP API to respond to natural language expressions sent by users. In this article, we will use the term “Expressions” instead of “Natural Language Expressions” because we assume users don’t come to Eva with binary code, and we treat all programmatic expressions as strings Eva will ignore.

Conversation Flow

The main conversation is designed in Chatfuel. It consists of an onboarding flow, where users are introduced to the chatbot and asked about their health goals, and two sets of notification sequences. The first set of notifications is sent every 3 days to users who are interested in eating healthier and comes with recipes; the second is sent every 7 days to all users who have set a goal, asks them if they feel they have made progress towards it, and sends them a motivational quote.

Processing Keywords

A lot of the user inputs are not processed by the chatbot because at certain steps, like the initial onboarding, we want to focus on the happy path and avoid distractions. For example, when we ask users for feedback about their meals, we just save their answer and proceed to say thank you, give them their quote, and ask them if they have any questions. This was a design decision. In other cases, like whenever Eva tells users she can answer their questions, we want to make sure users can start a conversation.

We have configured some keywords in the so-called “Chatfuel AI” so that they automatically route users to a block of the conversation, and any expressions that can’t be matched this way are sent to the Default Answer Block. In this block, we have configured an integration with a Dialogflow agent: the expressions are sent to Dialogflow, matched to an Intent, and the Intent’s response is returned to Chatfuel so that it can be displayed to users.
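This two-stage routing can be sketched in a few lines of Python. The keyword table, block names, and fallback handler below are all hypothetical stand-ins for what Chatfuel and Dialogflow actually do, just to illustrate the flow.

```python
# Hypothetical sketch of the routing described above: exact keyword
# rules first (the "Chatfuel AI" table), then a fallback standing in
# for the Dialogflow agent behind the Default Answer Block.

KEYWORD_BLOCKS = {            # hypothetical keyword -> conversation block
    "recipes": "recipe_block",
    "goal": "goal_checkin_block",
}

def nlp_fallback(expression):
    """Stand-in for the Dialogflow integration in the Default Answer Block."""
    return "default_answer_block"

def route(expression):
    text = expression.lower()
    for keyword, block in KEYWORD_BLOCKS.items():
        if keyword in text:
            return block
    return nlp_fallback(expression)

print(route("Show me recipes"))   # -> recipe_block
print(route("tell me a joke"))    # -> default_answer_block
```

The point of the design is that cheap exact matches handle the common cases, and the NLP engine only sees what falls through.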

Expressions and Intents

Intents are the different concepts Eva can understand. We currently have around 120 Intents in Dialogflow, and the list keeps growing as users have conversations with Eva and we discover more topics that are interesting to users.

Each Intent is related to many user Expressions, over 1,500 in total! Each Intent also has a series of Answers that the chatbot will say in response. We use a naming convention for Intents to make the task of keeping them up to date easier, but we are always iterating on the way we manage this.

The current convention groups Intents into families, using a dot to separate the different sub-families. Some of the most relevant families are health (for health-related Intents), nav (for navigation Intents), and meta (for Intents about Eva, chatbots, and the meaning of life). We can match these and the rest of the Intent families to the three different levels of the Conversational Design Pyramid Model.
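Because the family is just the first dot-separated segment of the name, grouping Intents by family is trivial to automate. The intent names below are illustrative, not our actual list:

```python
# Sketch: grouping intents by family using the dot-separated naming
# convention (e.g. "nav.meta.areyouachatbot").

from collections import defaultdict

INTENTS = [
    "health.symptoms.headache",
    "nav.meta.areyouachatbot",
    "nav.help",
    "meta.meaningoflife",
]

def group_by_family(intent_names):
    families = defaultdict(list)
    for name in intent_names:
        family = name.split(".", 1)[0]   # text before the first dot
        families[family].append(name)
    return dict(families)

print(group_by_family(INTENTS))
```

We use exactly this kind of grouping when reviewing which families are growing fastest.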

Keeping this pyramid in mind helps us focus on the main purpose of the chatbot: the core Intents that give value to users and are part of our product design. For navigation-related expressions, we try to assign new Expressions to existing Intents. For small talk, we decide case by case whether it’s something we want to train Eva to respond to: some things users say may be interesting to chat about, but others are definitely out of our plans, and we prefer to answer those with a default response that encourages users to go back to the happy path.

Expressions and responses for a very popular Intent

Many of you are more comfortable using a text editor, so here’s good news: you can totally manage this there too. This is what the intent looks like in code:

{
  "id": "70213998-4eb4-4453-9515-10c6f502234b",
  "name": "nav.meta.areyouachatbot",
  "auto": true,
  "contexts": [],
  "responses": [
    {
      "resetContexts": false,
      "affectedContexts": [],
      "parameters": [],
      "messages": [
        {
          "type": 0,
          "lang": "en",
          "speech": [
            "I\u0027m afraid I can\u0027t answer that question, Dave…",
            "Not at all. I\u0027m here looking for Sarah Connor."
          ]
        }
      ],
      "defaultResponsePlatforms": {},
      "speech": []
    }
  ],
  "priority": 500000,
  "webhookUsed": false,
  "webhookForSlotFilling": false,
  "lastUpdate": 1537487696,
  "fallbackIntent": false,
  "events": []
}
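If you work with these exports in a text editor anyway, pulling the canned responses out of the file and picking one at random (which is roughly what the agent does at runtime) is a one-liner. The file content below is an abridged, hypothetical version of the export above:

```python
# Sketch: reading an exported intent like the one above and picking
# one of its canned responses at random.

import json
import random

def load_responses(intent_json):
    intent = json.loads(intent_json)
    return intent["responses"][0]["messages"][0]["speech"]

# Abridged stand-in for the exported intent file
intent_json = """
{
  "name": "nav.meta.areyouachatbot",
  "responses": [
    {"messages": [{"type": 0, "lang": "en",
                   "speech": ["I'm afraid I can't answer that question, Dave...",
                              "Not at all. I'm here looking for Sarah Connor."]}]}
  ]
}
"""

print(random.choice(load_responses(intent_json)))
```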

Entities

Eva is trained to recognize certain words as items of an Entity. These Entities can be passed upon fulfillment to execute actions in a 3rd party service or to answer the user using that word. The different lists of entities grow over time as we identify new needs.

Named Entity Recognition (NER) is a fundamental area of work in Natural Language Processing. Through this task, a computer can understand the topic of a text or a sentence, identify the actions the user is asking it to do, and more. It does this in two distinct steps: first, it identifies the names in the sentence (which can be single words like “leg” or multiple words like “digestive system”) and then it classifies them based on an ontology pre-loaded in the system (e.g. our existing Entities). NER is still far from perfect: it relies on manually tagged data, and there are painful challenges, like disambiguation (e.g. different Entities represented by the same words), that still need to be resolved.
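Those two steps can be illustrated with a toy matcher. This is nothing like a real NER system (no statistics, no context), just the identify-then-classify idea against a tiny, hypothetical ontology:

```python
# Toy illustration of the two NER steps: find entity mentions
# (longest names first, so multi-word names like "digestive system"
# win over substrings), then classify them against an ontology.

ONTOLOGY = {                     # hypothetical slice of a bodyparts Entity
    "leg": "bodyparts",
    "digestive system": "bodyparts",
}

def recognize(sentence):
    text = sentence.lower()
    found = []
    for name in sorted(ONTOLOGY, key=len, reverse=True):
        if name in text:
            found.append((name, ONTOLOGY[name]))
            text = text.replace(name, " ")   # avoid re-matching substrings
    return found

print(recognize("My digestive system hurts and so does my leg"))
# -> [('digestive system', 'bodyparts'), ('leg', 'bodyparts')]
```

Real systems handle inflections, ambiguity, and unseen words, which is exactly where the painful challenges mentioned above live.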

We manage our Entities in Dialogflow, where we have configured most Entities for Automated Expansion so that we can take advantage of the Dialogflow superpowers. Thanks to this configuration, Dialogflow will apply its magic (mostly statistics) to identify words that are similar to the ones we added in the Entities definitions, and add them to the collection so that it keeps working with as little manual maintenance on our side as possible.

A huge list of words for an Entity called bodyparts.

This is what each Entity looks like if you manage them from a text editor:

{
  "value": "Column",
  "synonyms": [
    "Column",
    "vertebrae",
    "Spine",
    "Dorsal Spine"
  ]
},
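At runtime, entity matching boils down to mapping any synonym back to its canonical value. A minimal sketch of that lookup, using the entry above as sample data:

```python
# Sketch: flattening entity entries like the one above into a
# synonym -> canonical value lookup.

ENTRIES = [
    {"value": "Column",
     "synonyms": ["Column", "vertebrae", "Spine", "Dorsal Spine"]},
]

def build_lookup(entries):
    lookup = {}
    for entry in entries:
        for synonym in entry["synonyms"]:
            lookup[synonym.lower()] = entry["value"]   # case-insensitive match
    return lookup

print(build_lookup(ENTRIES)["vertebrae"])  # -> Column
```

Automated Expansion, described below, effectively grows the synonym lists for us.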

Sentiment

One of the main drivers for Eva has been learning how users interact with chatbots and how we can use conversation design to drive empathy and ultimately, engagement. Paying attention to how users feel during conversations has been at the top of our priorities since day one.

Over time we have identified different positive and negative expressions from users that are related to sentiments. We have created Intents for each of these Expressions so we can respond with something that makes sense: we apologize to users who are disappointed and respond with love to users who send Eva love. Eva can respond to laughs: she may ask you to share the chatbot link with your friends now that you are happy, or just respond with another laugh or an emoji. We have programmed her with different answers to each of these sentiment Intents so that she always feels somewhat surprising to users who are having a more personal conversation with her.

In the future, we would love to have an excuse to add Sentiment Analysis APIs to Eva. If the conversation volume justifies the internal development cost, this is something that could be very interesting to do. For now, Eva’s pre-programmed answers are working just fine.

Sentiment related Intents, a step before Sentiment Analysis

Training Eva

When we identify expressions that don’t match an existing Intent we need to decide if we want to redirect users to the Default Answer or to another Intent, or if we want to create a new Intent to satisfy future similar expressions.

Training an agent to recognize a new Intent from an unmatched user Expression

Intents created from the Training interface only have a name and the origin Expression, so we add “aa” at the beginning of the Intent name so that it is positioned at the top of the Intent list. This way, after we review all unmatched Expressions in Training, we can go to the Intents list and start the work at the top of the list.
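The “aa” trick works because an alphabetical list sorts those names to the top, so finding the Intents that still need review is a simple prefix check. The intent names here are illustrative:

```python
# Sketch of the "aa" convention: intents created from the Training
# interface are prefixed so they sort to the top of the list, making
# the review backlog easy to find.

INTENT_NAMES = [
    "aa.whatdoyoueat",
    "health.symptoms.headache",
    "aa.doyoudream",
    "nav.help",
]

def pending_review(names, prefix="aa"):
    return sorted(n for n in names if n.startswith(prefix))

print(pending_review(INTENT_NAMES))
# -> ['aa.doyoudream', 'aa.whatdoyoueat']
```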

Training a chatbot for NLP means dealing with these 10 common challenges. Some of them can be easily addressed once we are aware of their existence (abbreviations, for example, can be easily addressed at the beginning of a project with a little domain and user persona research) while others like context and feature discovery will continue to be a challenge for a while longer.

Analytics

We run analytics in Chatfuel and in Dialogflow. Each tool provides different information, and we do some data processing in Google Sheets for the indicators neither platform provides us with. These indicators are usually the closest to the users, so we pay a great deal of attention to this manual analytics. “Manual” is relative: we take advantage of templates and the great Explore feature in Google Sheets, so a lot of the work is actually automated or somehow computer-assisted.

Dialogflow analytics are restricted to the expressions that can’t be resolved by the Chatfuel AI, so we use them specifically to evaluate the NLP performance. A bit of information we love is the flow view, which shows how users move from one Intent to another and which we use as a measure of improvement of our natural language understanding. Don’t take this too seriously, though: users will always think of new, fun things to ask your conversational interface.
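One simple NLP-performance indicator of this kind is the share of expressions that fall through to the fallback intent. The log format below is hypothetical, just to show the arithmetic:

```python
# Sketch: computing a fallback rate from a (hypothetical) log of
# which intent each user expression was matched to.

MATCH_LOG = [
    "health.symptoms.headache",
    "Default Fallback Intent",
    "nav.meta.areyouachatbot",
    "Default Fallback Intent",
    "meta.meaningoflife",
]

def fallback_rate(log, fallback="Default Fallback Intent"):
    return sum(1 for intent in log if intent == fallback) / len(log)

print(f"{fallback_rate(MATCH_LOG):.0%}")  # -> 40%
```

A falling fallback rate over time is one rough sign that training is paying off.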

Chatfuel gives us more general analytics and is actually very useful when we need to understand how things are going. Analytics we extract from this tool include the total number of users, new and blocked users per day, active users divided by type of interaction, and the number of visits to each block of the conversation.

We pay for the PRO version of Chatfuel for two reasons. First, it lets us remove the branded message at the beginning of our conversation. Second, and perhaps more important, PRO gives us access to the People view.

The People view lists all users of your bot and their system and custom attributes. You can filter users and save segmented views for easy reference, and you can download all the data as a .csv file to run a manual analysis.

Once you reach this point, the sky is the limit! With all your attributes in .csv format and the right tools (I simply use the Explore feature of Google Sheets), you can run your own analysis and print progress charts.
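If you prefer code over the Explore feature, the same kind of segment analysis is a few lines of standard-library Python once the export is on disk. The column names and data below are hypothetical stand-ins for a People view .csv:

```python
# Sketch: counting active users per goal from a (hypothetical)
# People view .csv export.

import csv
import io
from collections import Counter

# Stand-in for the downloaded .csv file
PEOPLE_CSV = """\
user id,goal,status
1,Eat Better,active
2,Sleep More,active
3,Eat Better,blocked
4,Eat Better,active
"""

def goal_counts(csv_text):
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["goal"] for row in reader if row["status"] == "active")

print(goal_counts(PEOPLE_CSV))  # -> Counter({'Eat Better': 2, 'Sleep More': 1})
```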

Conclusions

Back in 2016 the most popular claim used to sell chatbots was “They’re so cheap to build!”. The second most popular was “It’s AI!”.

Today, luckily, many companies have learned that in most projects AI is only applied to the Understanding part (as opposed to the Generating part) of Natural Language Processing, and that building a chatbot is just the beginning: training can become significantly expensive with time.

Our intention with this article was to help make this point clear and to manage your expectations. We hope this is the insight you need to take the step and bring conversations to your organization.


If you want to learn more about how conversational interfaces are built…

Join us at Intentconf — a conference about conversation.

Follow us on Twitter

Visit our website to learn more about how we can help your business.
