Virtual assistants live on our smartphones or our countertops, and are built to help answer simple questions and perform routine tasks like ordering groceries, setting calendar reminders, or messaging friends.
We envision a future where conversational AI technology supports a diversity of next-generation digital experiences. Here at Chata, we’re developing conversational AI technologies specifically for database access.
In this article, we’ll provide an overview of the current state of conversational technology and discuss how we can iterate and improve on today’s innovation to create the powerful conversational AI systems of tomorrow.
Inside the Chatbots of Today
Most of us have experienced chatbots built into the apps, software, and web pages we use on a regular basis. Chatbots are typically available through a conversational user interface that was developed to mimic a messenger or SMS conversation with another human. There are three common types of chatbots: rule-based chatbots, AI or machine learning-based chatbots, and hybrids of the two.
Rule-based chatbots typically leverage if/then rules that establish what kinds of problems a chatbot is programmed to be familiar with and match user inputs to pre-determined outputs. This requires building chatbot “scripts” that traditionally behave like a “choose your own adventure” story where users navigate through options in a series of decision trees.
While there’s no need for developers to train an AI system on a large volume of data to set up a rule-based system, these chatbots require extremely specific inputs from users and often don’t provide a natural-feeling conversational experience, even though they may provide utility to an end user.
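To make the decision-tree idea concrete, here is a minimal sketch of a rule-based script. The node names, prompts, and the `step` helper are all hypothetical and chosen for illustration only; they are not taken from any particular chatbot product.

```python
# Hypothetical script: each node has a prompt and a mapping from the exact
# input the bot expects to the next node. Leaf nodes have no options.
SCRIPT = {
    "start": {
        "prompt": "Do you need help with 'billing' or 'shipping'?",
        "options": {"billing": "billing", "shipping": "shipping"},
    },
    "billing": {
        "prompt": "Is this about a 'refund' or an 'invoice'?",
        "options": {"refund": "refund", "invoice": "invoice"},
    },
    "shipping": {"prompt": "Your order status is under My Orders.", "options": {}},
    "refund": {"prompt": "Refunds are processed within 5 business days.", "options": {}},
    "invoice": {"prompt": "Invoices are emailed on the first of each month.", "options": {}},
}

FALLBACK = "Sorry, please choose one of the listed options."


def step(node, user_input):
    """Return (next_node, reply) for one turn of the scripted conversation."""
    options = SCRIPT[node]["options"]
    choice = user_input.strip().lower()
    if choice in options:  # the input matches a rule, so follow that branch
        next_node = options[choice]
        return next_node, SCRIPT[next_node]["prompt"]
    return node, FALLBACK  # no rule matches: stay on this node and ask again


node, reply = step("start", "Billing")
print(reply)
node, reply = step(node, "I want my money back")
print(reply)
```

The second turn shows the limitation described above: a perfectly reasonable request falls through to the fallback because it is not one of the exact inputs the script expects.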
On the other hand, machine learning-based chatbots usually leverage intent classification methods, which enable these systems to handle a much broader range of user queries: questions or statements posed in the user’s own words, or natural language.
Natural language refers to human language, as opposed to “languages” that computer systems are programmed to understand. AI chatbots can provide a flow of dialogue that feels more human-like by processing users’ natural language to gain a level of understanding about what a user is asking or trying to achieve.
Along with intent classification, slot filling is another method that is leveraged by many AI-driven virtual assistants and chatbots of today.
When an intent is known (or classified) by a system but certain pieces of information are missing, a slot-filling system will direct the user to input these missing entities (words the machine is trained to recognize) to fill those pre-defined slots.
For example, if you ask: “What’s the weather in New York?”, the system uses machine learning to match [weather] to a pre-defined intent category like “find weather” and fills the entity [New York] into the slot designated for “city” or “location”. Instead of creating a specific intent category that only allows a user to find the weather in New York City, the intent “find weather” can apply to other locations as well, thanks to the slot into which any city name entity can be filled.
Another example is making a reservation at a restaurant. “Make a reservation” would be classified as an intent — it’s the goal the user has or the action they wish to take — but entities like restaurant name [Joe’s Italian] and date and time [Friday, 7:00PM] must also be extracted from the user’s query and matched to their respective slots before the reservation can be processed by the AI system.
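The two examples above can be sketched in a few lines of Python. This is a simplified illustration that assumes the intent has already been classified; the intent names, slot names, and the hand-written regular expressions (standing in for a trained entity recognizer) are all assumptions made for the example.

```python
import re

# Hypothetical slot definitions: each intent lists the slots it requires.
REQUIRED_SLOTS = {
    "find_weather": ["location"],
    "make_reservation": ["restaurant", "datetime"],
}

# Toy patterns standing in for a trained entity recognizer.
NAME = r"([A-Z][\w']*(?: [A-Z][\w']*)*)"  # one or more capitalized words
SLOT_PATTERNS = {
    "location": re.compile(r"\bin " + NAME),
    "restaurant": re.compile(r"\bat " + NAME),
    "datetime": re.compile(
        r"\b((?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day,? \d{1,2}:\d{2} ?[AP]M)"
    ),
}


def fill_slots(intent, text):
    """Return (filled, missing) slots for an already-classified intent."""
    filled, missing = {}, []
    for slot in REQUIRED_SLOTS[intent]:
        match = SLOT_PATTERNS[slot].search(text)
        if match:
            filled[slot] = match.group(1)
        else:
            missing.append(slot)  # the bot would prompt the user for these
    return filled, missing


print(fill_slots("find_weather", "What's the weather in New York?"))
print(fill_slots("make_reservation", "Make a reservation at Joe's Italian"))
```

In the second call the "datetime" slot comes back as missing, which is the point at which a slot-filling system would ask the user a follow-up question such as "For what day and time?"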
In the next section, we’ll provide a general explanation of how most intent classification systems work and why the limitations of this method prevent it from being a truly viable solution for facilitating conversational access to data.
On Intent Classification in Conversational AI Technology
Behind the scenes of most modern chatbots is an AI intent classifier. Intent classifiers perform the function of recognizing intent in a user’s natural language (NL) question or statement – again, the thing the user wants to do or accomplish – and categorizing that intent in order to return a relevant response.
AI intent classifiers can analyze statements like “How much does an annual Premium subscription cost?” These systems leverage both natural language processing (NLP) and natural language understanding (NLU) to deduce that words like “cost” or “subscription” are likely to indicate that the intent of this message is purchase-oriented.
The AI chatbot needs pre-defined intent categories in order to classify intent. These categories must be tailored to a specific subject matter or the unique purpose that the chatbot is built for.
If the chatbot is being employed for customer service at a SaaS company, intent categories might include [needs help], [demo request], [downgrade], [upgrade], or [card expired]. For a hotel-booking chatbot, intents would be different and might include things like [make booking], [cancel booking], [change rooms], and [change travel dates].
Once appropriate intents have been defined, the AI system is trained to correctly match or associate them with a variety of different words that a customer might use. This is where machine learning comes in.
A large volume of example data, known as training data, is needed to teach the intent classifier how to match human words to pre-defined intent categories.
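As a rough illustration of what this training step involves, the sketch below fits a tiny naive Bayes classifier on a handful of labeled examples using only the Python standard library. The training sentences, intent labels, and function names are hypothetical, and naive Bayes is just one simple technique chosen for the example; production systems typically rely on far larger datasets and more capable models.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training data: (example utterance, intent label).
TRAINING_DATA = [
    ("how much does the premium plan cost", "pricing"),
    ("what is the price of an annual subscription", "pricing"),
    ("i want to cancel my booking", "cancel_booking"),
    ("please cancel my reservation", "cancel_booking"),
    ("my card expired and payment failed", "card_expired"),
    ("update my expired credit card", "card_expired"),
]


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count how often each word appears under each intent."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter()
    for text, intent in examples:
        intent_counts[intent] += 1
        word_counts[intent].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, intent_counts, vocab


def classify(text, model):
    """Return the intent with the highest naive Bayes log-probability."""
    word_counts, intent_counts, vocab = model
    total_examples = sum(intent_counts.values())
    scores = {}
    for intent, n_examples in intent_counts.items():
        score = math.log(n_examples / total_examples)  # prior
        total_words = sum(word_counts[intent].values())
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are ignored
                count = word_counts[intent][token]
                # Laplace (add-one) smoothing avoids log(0)
                score += math.log((count + 1) / (total_words + len(vocab)))
        scores[intent] = score
    return max(scores, key=scores.get)


model = train(TRAINING_DATA)
print(classify("How much does an annual Premium subscription cost?", model))
```

Note that this classifier always returns one of its known intents, even for a sentence made up entirely of words it has never seen. With only six examples it also knows very few words, which hints at why real systems need so much training data.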
Intent classifiers can be combined with other machine learning methods that help facilitate an efficient and rewarding user experience, such as models that enable a system to understand some amount of context in natural language, or make predictions about what users may need.
But the ultimate goal of conversational AI technology is to close the gap between computers and humans, not to create a mediocre substitute for human-to-human interactions.
Considering that, intent classifiers are simply not a scalable solution for facilitating flexible, dynamic user experiences that feel as intuitive as a conversation with another person and yield the same results: namely, getting the exact information you’ve asked for instantly and easily.
In the next section, we’ll talk more about why innovating beyond intent classification is a critical next step towards improving digital conversational experiences.
Going Beyond Intent Classification
Let’s look at an example of how humans might typically seek out information by considering the sentence “Who owes me?” This is the type of question that humans are equipped to answer but that computers find difficult to understand. “Who owes me?” is a question asked in context and, though the person asking has a specific answer in mind, there’s a lot of ambiguity around what that answer might be.
In other words, the person asking the question has an idea of the information they’re looking for, but they haven’t clearly stated what exactly is owed. Maybe they’re asking you to tell them who hasn’t paid them back for dinner last night, or maybe they’re asking which of their clients currently have outstanding invoices.
In this example, the entities needed to classify the intent of the statement are missing.
Humans can deduce intent by factoring in real-life context: the same question means something different when it comes from a friend than when it comes from an accounting client.
While an intent classifier might have some level of context if it’s built for a specific purpose, say, consumer banking, it can only match a phrase to an intent that it already knows exists, thereby limiting its flexibility. Unless every single possible intent is built and trained into a given intent classification system, there will always be gaps in the machine’s ability to understand what a human is really saying.
Another drawback of intent classifiers is limited NLU power.
The AI system only has to understand human language insofar as it can apply an intent category to words that match, or can be associated with, its pre-defined list of intents. That means users need to adapt the way they ask questions — their natural language — so that the AI is more likely to understand.
This can sometimes feel like a tedious, even frustrating, game as users attempt to guess the words that the computer might know, while the system continuously returns the old refrain: “I’m sorry, I don’t understand what you’re asking for.”
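The guessing game described above can be reproduced with a toy matcher. The sketch below assumes a hypothetical consumer banking bot with three intents and hand-picked keyword sets; a real classifier would learn these associations from training data, but the failure mode is the same when a question uses words outside what the system knows.

```python
import re

# Hypothetical keyword sets for a consumer banking bot.
INTENT_KEYWORDS = {
    "check_balance": {"balance", "account", "funds"},
    "transfer_money": {"transfer", "send", "move"},
    "report_lost_card": {"lost", "stolen", "card"},
}

FALLBACK = "I'm sorry, I don't understand what you're asking for."


def match_intent(text, min_overlap=1):
    """Return the best-matching known intent, or the fallback message."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    best_intent, best_overlap = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        overlap = len(tokens & keywords)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    if best_overlap < min_overlap:
        return FALLBACK  # nothing the user said maps to a known intent
    return best_intent


print(match_intent("What's my account balance?"))  # check_balance
print(match_intent("Who owes me?"))                # falls back
```

“Who owes me?” is a meaningful question to a person, but because no intent for it was ever defined, the only thing this system can do is return the fallback message.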
At Chata, we see a gap in AI technology aimed at making enterprise-grade database access faster and more intuitive. While intent classifiers have been built to provide better experiences in customer service and marketing channels, they don’t offer the flexibility and level of intelligent understanding that users need to access and successfully leverage their data through conversation.
This is because conversational AI technology built specifically for database access requires a level of complexity that can’t be achieved through the limited application of intent classifiers: it’s simply too labor intensive to create and train the massive volume of intents that database users are interested in exploring.
Behind this limitation is the sheer volume of training data that would be needed to encapsulate the scope of an entire database, understand the business logic tied to that database, and handle every type of question a user might ask about their data.
To fill this gap, it’s necessary to go beyond intent classification and instead develop AI specifically for conversational data experiences. In our next post, we’ll discuss what it takes to build robust conversational AI technology for database access.