This is the fourth and final post in our ‘science behind the Raffle-lution’ series. It’s recommended you first read part 1, part 2, and part 3.
Deep learning-based systems are replacing the traditional chatbot approach to conversational AI. With traditional chatbots, you have to engineer the whole conversation by hand, a labor-intensive task that breaks easily when the problem we want to solve changes.
Deep learning, on the other hand, learns from data. So now the challenge is how to design a system that can learn from a conversational context and how to collect training data.
In our previous three posts, we discussed the technology we use in our tools in detail. Now we’re going to look into our Artificial Intelligence crystal ball to come up with some predictions on what we can expect in the next few years.
What we need from Conversational AI
Let’s first examine what a conversational AI should be able to master before it’s ready:
- Contextual awareness. The AI should be able to take what has been said before and other contexts into account when formulating an answer.
- Accuracy. For raffle.ai’s products, it’s absolutely essential that the answers we provide are accurate.
- Close to unbiased. Some sort of bias is hard to avoid, but it is much easier to avoid unwanted biases if our training data is fully maintained by ourselves. This goes against the trend of leveraging large public datasets.
- Master new domains with little need for adaptation. We want to spend as few resources as possible on collecting training data when onboarding a new customer.
- Handle large knowledge bases. The more information our tools can handle, the more useful they are.
- Reasoning. The ability to perform logical reasoning is not a must for making conversational AI useful, but it definitely makes the experience more enjoyable because it gives the user a sense of being understood.
Is GPT-3 the solution?
GPT-3 from OpenAI is an example of groundbreaking NLP research from 2020. Backed by a one-billion-dollar investment from Microsoft in OpenAI, the GPT-3 language model has 175 billion parameters and can only run in the cloud. It is trained on text from a substantial fraction of the public internet. The results have been impressive but come with serious limitations and biases.
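A rough back-of-the-envelope calculation shows why a model of this size is tied to the cloud. The sketch below only counts the memory needed to hold the 175 billion weights. The 32 GB accelerator size is an assumption chosen for illustration, not a description of the hardware OpenAI uses.

```python
import math

PARAMETERS = 175_000_000_000  # parameter count quoted in the text

# Bytes per parameter for two common numeric formats.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

# Assumption for illustration only: one accelerator with 32 GB of memory.
ACCELERATOR_MEMORY_GB = 32


def weights_size_gb(parameters, bytes_per_param):
    """Memory needed for the weights alone, in gigabytes (10**9 bytes)."""
    return parameters * bytes_per_param / 1e9


for fmt, nbytes in BYTES_PER_PARAM.items():
    size = weights_size_gb(PARAMETERS, nbytes)
    devices = math.ceil(size / ACCELERATOR_MEMORY_GB)
    print(f"{fmt}: {size:.0f} GB of weights, at least {devices} accelerators")
```

Even at half precision the weights alone take roughly 350 GB, far more than a single laptop or workstation can hold, and that is before counting the memory needed to actually run the model.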
It is a generative language model that can read a text and, from there, start generating more text, one word at a time. The provided text, called a “prompt” in GPT-3 lingo, sets the context. So if we want to make a question-answering system on the subject of raffle.ai, then a prompt could be:
raffle.ai is an AI company. Our mission is to help businesses and customers release time using AI-powered search.
Q: What is the AI technology used by raffle.ai?
A: raffle.ai uses deep learning NLP.
Q: What can raffle.ai Customer Service products be used for?
A: raffle.ai Customer Service assists customer service agents in answering customer tickets.
Q: Is raffle.ai a chatbot?
A: No, raffle.ai is an AI-powered search engine that has a contextual understanding of a question leading to instant results and solutions.
Given the prompt up to and including the third question, GPT-3 writes the final answer itself (“No...”)!
Since GPT-3 is trained on a lot of the public internet, it will have stored information about raffle.ai in its 175 billion parameters. It has also stored a lot of information about AI in general. It combines both to give a sensible answer to the question.
The context set by the prompt makes the model “tune in” to the information stored in the model relevant to the raffle.ai context. The list of questions and answers we use in the prompt is not training data but instead a way for us to set a question-answering context.
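To make the idea concrete, here is a minimal sketch of how such a prompt could be assembled as plain text. The function name `build_prompt` and its arguments are our own illustration and not part of any GPT-3 API. The sketch only builds the string; sending it to a model is left out.

```python
def build_prompt(description, examples, question):
    """Assemble a question-answering prompt as plain text."""
    lines = [description, ""]
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    # Leave the last answer open so the model continues from "A:".
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


description = (
    "raffle.ai is an AI company. Our mission is to help businesses and "
    "customers release time using AI-powered search."
)
examples = [
    ("What is the AI technology used by raffle.ai?",
     "raffle.ai uses deep learning NLP."),
    ("What can raffle.ai Customer Service products be used for?",
     "raffle.ai Customer Service assists customer service agents in "
     "answering customer tickets."),
]

prompt = build_prompt(description, examples, "Is raffle.ai a chatbot?")
print(prompt)
```

Nothing in the model changes when we swap the examples: a different description and a different set of question-answer pairs simply put the same model into a different context.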
Recently, OpenAI released another spectacular example of how one can scale up models and use data from the internet to make Artificial Intelligence that can generate non-trivial new content. DALL-E, named in tribute to both Salvador Dalí and Pixar’s WALL-E, generates images from text prompts. It uses a model similar to GPT-3, trained on hundreds of millions of images from the internet, each paired with a caption.
The truth is out there… but so are the lies!
So GPT-3 understands context, masters new domains without explicit training, and can handle large knowledge bases because it has picked up a lot of information from being trained on much of the internet. But has it met all our conditions for conversational AI?
The answer is still, unfortunately, a resounding no.
We have no control over whether answers are factually accurate. GPT-3 makes up things based on what it has read. GPT-3 has the same biases as the data it has been trained on.
The GPT-3 developers have tried to filter unwanted content, but the internet contains a lot of information, and much of it is biased or plainly untrue. Another concern is that as we scale language models, they can memorize more and more. They risk becoming stochastic parrots that generate a distorted version of the content they were trained on. Hopefully, this criticism can lead to a community-driven effort to curate the data used to train publicly shared language models, and to develop methods that systematically test for the biases most people agree should not be there.
It is quite easy to pressure-test GPT-3 and expose its limitations when it comes to reasoning. However, if recent history in the development of AI has taught us anything, it is that this, too, can be solved.
Two new extensions of our current favorite workhorse, the transformer, point the way. The switch transformer aims to make the model larger, and the feedback transformer gives it the ability to process the data internally for more steps.
The switch transformer paper from Google Brain describes a one-trillion-parameter model that routes different data to different model sub-components. Because only a small part of the model is active for any given input, the model can be much larger without requiring more computation per input.
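The routing idea can be sketched in a few lines. This is a toy illustration under our own assumptions, not the implementation from the paper: the class name `ToySwitchLayer` is hypothetical, the weights are random, and each "expert" is reduced to a simple scaling vector where a real model would use a full feed-forward network.

```python
import math
import random


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class ToySwitchLayer:
    """Top-1 routing: each token is processed by exactly one expert."""

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)
        # One router weight vector per expert (random toy weights).
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Toy experts: a per-dimension scaling vector each.
        self.experts = [[rng.gauss(0, 1) for _ in range(dim)]
                        for _ in range(num_experts)]

    def forward(self, token):
        # The router scores every expert, but only the best one is run.
        probs = softmax([dot(w, token) for w in self.router])
        chosen = max(range(len(probs)), key=probs.__getitem__)
        # The chosen expert's output is scaled by its router probability.
        output = [probs[chosen] * w * x
                  for w, x in zip(self.experts[chosen], token)]
        return chosen, output


layer = ToySwitchLayer(dim=4, num_experts=8)
rng = random.Random(1)
tokens = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
for token in tokens:
    expert_id, out = layer.forward(token)
    print(f"token routed to expert {expert_id}")
```

Adding more experts increases the number of parameters in the layer, but each token still passes through only one of them, so the work per token stays roughly constant apart from the small cost of the router.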
The feedback transformer from Facebook AI rewires the transformer so that, at each step in the computation, the model has access to high-level, more abstract representations from the previous step. This makes the model somewhat slower, but the slowdown can be offset because the new formulation gets better results out of smaller models. Combined, such approaches can take natural language models to the next level.
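As a heavily simplified sketch of the feedback idea: the sequence is processed one step at a time, all layers of a step are merged into a single memory vector, and every layer of the next step, including the lowest one, attends to that memory. The function names and the absence of learned projections and feed-forward blocks are our simplifications, so this illustrates the wiring rather than the published model.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Dot-product attention of one query over a list of memory vectors."""
    if not memory:
        return [0.0] * len(query)
    weights = softmax([sum(q * m for q, m in zip(query, vec))
                       for vec in memory])
    return [sum(w * vec[i] for w, vec in zip(weights, memory))
            for i in range(len(query))]


def feedback_forward(inputs, num_layers, layer_logits):
    """Process a sequence step by step with a shared feedback memory."""
    # One mixing weight for the input plus one for each layer.
    layer_weights = softmax(layer_logits)
    memory = []   # one merged vector per past time step
    outputs = []
    for x in inputs:
        states = [x]
        for _ in range(num_layers):
            # Every layer, even the lowest, reads the merged past states.
            context = attend(states[-1], memory)
            states.append([s + c for s, c in zip(states[-1], context)])
        # Merge all layers of this step into a single memory vector.
        merged = [sum(w * s[i] for w, s in zip(layer_weights, states))
                  for i in range(len(x))]
        memory.append(merged)
        outputs.append(states[-1])
    return outputs, memory


inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, memory = feedback_forward(inputs, num_layers=2,
                                   layer_logits=[0.0, 0.0, 0.0])
print(outputs[-1])
```

The step-by-step loop is also where the slowdown comes from: a step cannot start before the previous step has finished all of its layers.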
The future is bright for Raffle
Still, GPT-3 can have many uses, as exemplified here, and OpenAI has committed to making it available soon. At Raffle, we could integrate the GPT-3 API to help generate training data. We still need human gatekeepers to validate the data, but we can outsource some creativity in capturing human variability to the AI.
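A workflow like that could look roughly like the sketch below. Everything in it is hypothetical: `fake_generate` stands in for a call to a language model and returns canned text, `fake_reviewer` stands in for a human decision, and the `Candidate` record is our own illustration rather than a description of Raffle's systems.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer_id: str
    question: str
    approved: bool = False


def propose_candidates(answer_id, seed_question, generate):
    """Ask a text generator for variants and drop empty or duplicate ones."""
    seen = {seed_question.strip().lower()}
    candidates = []
    for text in generate(seed_question):
        key = text.strip().lower()
        if key and key not in seen:
            seen.add(key)
            candidates.append(Candidate(answer_id, text.strip()))
    return candidates


def review(candidates, approve):
    """Keep only the candidates that the reviewer approves."""
    accepted = []
    for candidate in candidates:
        if approve(candidate):
            candidate.approved = True
            accepted.append(candidate)
    return accepted


# Hypothetical stand-in for a language-model call; returns canned text.
def fake_generate(question):
    return ["Is raffle.ai a chat bot?", "Is raffle.ai a chatbot?",
            "Does raffle.ai build chatbots?", ""]


# Hypothetical stand-in for a human gatekeeper reading each candidate.
def fake_reviewer(candidate):
    return "chat bot" in candidate.question


candidates = propose_candidates("faq-chatbot", "Is raffle.ai a chatbot?",
                                fake_generate)
training_data = review(candidates, fake_reviewer)
print([c.question for c in training_data])
```

The point of the split is that the generator only ever proposes: nothing reaches the training data without passing through the review step.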
GPT-3 has demonstrated that we can make context-aware AIs by scaling up language models and that we might not need to train models in the future but instead simply prompt them to put them into the correct context.
As discussed above, GPT-3’s shortcomings are currently inspiring a lot of research. Perhaps the dense index search we discussed in an earlier blog post is the most promising way to connect conversational AI to the knowledge we want the AI to use. So we are confident at Raffle that we’ll use breakthroughs in AI to make our products more intelligent, easier to implement in new settings, and able to work with ever larger knowledge bases. For raffle.ai, the future is just beginning.
Want to hear how our AI products can help optimize your business?