Summary in English:
Danish startup is building a search engine with BERT
Supervised learning does not scale well enough when a search engine has to review a large company's knowledge base. That is why Danish raffle.ai has adopted the language model BERT.
While Google users readily find relevant results on the web, enterprise search has developed significantly more slowly. In companies, it can still be challenging to search through a wealth of internal files on unmanageable intranets.
That was the challenge that inspired Suzanne Lauritzen to start raffle.ai.
"It started as a big frustration for me that I could never find anything on the company's internal drive"...
Does not scale
Raffle aims to deliver the next generation of enterprise search technology, says Suzanne Lauritzen. And just as in Google's search engine, BERT is one of the keys to that goal, says KU and DTU professor Ole Winther, CTO and co-founder of Raffle.
"Traditional search engines work roughly by matching words in your query with words in your knowledge base. It is a proven approach that takes you a long way. But it is difficult to make it better," he explains...
The breakthrough for Raffle came from basing the search engine on a pre-trained BERT model, which provides an understanding of the context in which words appear.
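The difference between word matching and context-aware matching can be sketched with a toy comparison. In the sketch below, `keyword_score` counts shared words, while `cosine` compares vectors that stand in for the output of a BERT-style encoder. The documents, the query and the three-dimensional vectors are all invented for illustration. A real system would compute the vectors with a trained model, and nothing here is taken from Raffle's actual implementation.

```python
import math
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_score(query: str, document: str) -> int:
    """Count how many distinct query words also occur in the document."""
    return len(tokenize(query) & tokenize(document))


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical knowledge-base snippets (invented for this example).
doc_a = "Contact the help desk to reset your password."
doc_b = "Login statistics for the intranet are published monthly."
documents = [doc_a, doc_b]

# Hand-made vectors standing in for encoder output (assumption: a trained
# model would place the password-reset text close to the query below).
doc_vectors = {doc_a: [0.9, 0.1, 0.2], doc_b: [0.1, 0.9, 0.3]}

query = "I have forgotten my login"
query_vector = [0.8, 0.2, 0.1]  # stand-in for the query embedding

# Word matching prefers the document that happens to share a word.
keyword_best = max(documents, key=lambda d: keyword_score(query, d))

# Vector matching prefers the document whose vector lies closest.
vector_best = max(documents, key=lambda d: cosine(query_vector, doc_vectors[d]))
```

In this toy setup the word-based score picks the statistics page because it shares the word "login", while the vector-based score picks the password-reset text even though it shares no words with the query.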
"The model reads the entire text, which means a search string can find the correct text even if none of the same words appear. If the user writes 'support', the model understands that they are not just looking for the word 'support', as a traditional search would assume," says Ole Winther.
In addition, the model can be trained self-supervised on non-annotated text, while the hard-to-obtain annotated data with questions and answers is needed only to fine-tune the model...
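Self-supervised training on non-annotated text works because the labels come from the text itself. BERT's pre-training uses masked language modelling: some tokens are hidden and the model learns to predict them. The sketch below builds one such training example. It is a simplification under stated assumptions: whitespace tokenization instead of BERT's subword tokenizer, every selected token replaced by `[MASK]`, and a hypothetical helper name `make_masked_example`. The example sentence is invented.

```python
import random

MASK = "[MASK]"


def make_masked_example(tokens, mask_rate=0.15, seed=0):
    """Hide a fraction of the tokens and keep the originals as labels.

    Returns (masked_tokens, labels), where labels maps each masked
    position to the token the model should learn to predict.
    """
    if not tokens:
        return [], {}
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = rng.sample(range(len(tokens)), n_to_mask)
    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = tokens[pos]  # the "annotation" is the hidden word
        masked[pos] = MASK
    return masked, labels


# Plain, non-annotated text is enough to produce a training example.
sentence = "the support team resets passwords on request".split()
masked, labels = make_masked_example(sentence)
```

No human wrote a question or an answer here: the training target is simply the word that was hidden, which is why this stage can use large amounts of unlabelled text.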
"At Raffle, we spend a lot of our research time on zero-shot. Our search engine must work well out of the box without using training data, especially for a new customer," says Ole Winther.
"We do this with transfer learning, where we fine-tune on question-answer data we have already collected, and with a new method that we are in the process of patenting."
According to Raffle, in a zero-shot search situation the system places the correct answer among the first three results in 60 percent of cases. That compares with a level of about 40 percent for solutions like Algolia and Elastic.
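A figure like "correct answer among the first three results" is commonly computed as a top-3 hit rate. The sketch below shows one way to calculate it. The function name `top_k_accuracy`, the answer IDs and the rankings are invented for illustration and say nothing about how Raffle, Algolia or Elastic were actually evaluated.

```python
def top_k_accuracy(ranked_results, correct_answers, k=3):
    """Fraction of queries whose correct answer is among the first k results."""
    hits = sum(
        1
        for ranked, correct in zip(ranked_results, correct_answers)
        if correct in ranked[:k]
    )
    return hits / len(correct_answers)


# Toy data: one ranked result list per query (answer IDs are made up).
ranked_results = [
    ["a1", "a7", "a3", "a9"],  # correct answer a3 at rank 3: hit
    ["a4", "a2", "a8", "a5"],  # correct answer a5 at rank 4: miss
    ["a6", "a1", "a2", "a3"],  # correct answer a6 at rank 1: hit
    ["a9", "a8", "a7", "a6"],  # correct answer a2 not returned: miss
    ["a2", "a5", "a1", "a4"],  # correct answer a5 at rank 2: hit
]
correct_answers = ["a3", "a5", "a6", "a2", "a5"]

accuracy = top_k_accuracy(ranked_results, correct_answers, k=3)
```

With three hits out of five toy queries, `accuracy` comes out at 0.6, which is what a 60 percent top-3 rate means in practice.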
Read the Danish article here.