/* Copyright notice: May be reproduced freely; please credit the original source of the article and the author. */
Author: Zhang Junlin
This article discusses how to build a semantic search engine with a deep learning system. Semantic search means matching user queries against pages at the semantic level. For example, when a user enters "iphone", an article that says "Apple is trying to make a new phone" should still be found, even though it never mentions the word "iphone" explicitly. Traditional search engines cannot do this because their ranking is essentially based on literal matching: a page with no literal match is not retrieved, even if it is semantically very relevant to the query.
Everyone uses search engines, so they need no introduction. What a search engine does technically, however, deserves a brief explanation. Suppose there are 1 billion pages forming a document set D, and a user who wants to find some information issues a query to the search engine. In essence, the search engine computes the degree of relevance between each page in D and the user query, and ranks the more relevant pages nearer the top to form the search results.
So in essence, search takes two pieces of text, a query and a document D_i, and computes the degree of relevance between them. The traditional approach judges relevance by how much the features of the two overlap (in this article we ignore link relationships and other factors and consider only text matching), using features such as TF-IDF or whether the query words appear in the title.
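To make the limits of literal matching concrete, here is a minimal sketch of TF-IDF scoring with cosine similarity, written in plain Python. The helper names (`tokenize`, `tfidf_vectors`, `cosine`), the smoothed IDF formula, and the second example document are assumptions made for illustration, not something taken from a particular search engine. For simplicity the query is included when computing document frequencies.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on whitespace; a real system would use a proper tokenizer.
    return text.lower().split()

def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict: term -> weight) per text."""
    tokenized = [tokenize(t) for t in texts]
    n = len(tokenized)
    # Document frequency: in how many texts each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) so every term keeps a positive weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

docs = [
    "Apple is trying to make a new phone",        # semantically relevant, no shared word
    "The iphone battery lasts longer this year",  # contains the query word literally
]
query = "iphone"

vectors = tfidf_vectors(docs + [query])
query_vec = vectors[-1]
scores = [cosine(query_vec, doc_vec) for doc_vec in vectors[:-1]]

for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {doc}")
```

The first document shares no term with the query, so its score is exactly zero no matter how relevant it is semantically. Only the second document, which contains the word "iphone" literally, receives a positive score.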
In other words, search can be recast as a sentence matching problem, and the search problem can be understood as follows.
Given a user query and a document, a mapping function either judges the two as relevant or irrelevant, or assigns one of five grades from 1 to 5, where 1 means irrelevant and 5 means very relevant.
That is, given a doc and a query, we use a neural network to construct the mapping function and map the pair into the 1-to-5 score space.
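The shape of such a mapping function can be sketched in a few lines of plain Python. Everything below is a hypothetical stand-in rather than the structure this article goes on to propose: the embedding size `DIM`, the random word vectors, the averaging encoder `embed_text`, and the single untrained output layer `W` are all assumptions. Because the weights are random, the grade it returns is meaningless; the point is only to show a function that takes (query, doc) and returns one of five classes.

```python
import math
import random

DIM = 8          # hypothetical embedding size
NUM_GRADES = 5   # relevance grades 1..5, as in the text

rng = random.Random(0)  # fixed seed so the sketch is reproducible
_embeddings = {}

# Untrained output layer: one weight row per grade over the concatenated features.
W = [[rng.uniform(-1, 1) for _ in range(2 * DIM)] for _ in range(NUM_GRADES)]

def embed_word(word):
    # Hypothetical lookup table: each new word gets a fixed random vector.
    if word not in _embeddings:
        _embeddings[word] = [rng.uniform(-1, 1) for _ in range(DIM)]
    return _embeddings[word]

def embed_text(text):
    # Average the word vectors; a stand-in for a learned encoder such as an RNN.
    vectors = [embed_word(w) for w in text.lower().split()]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def relevance(query, doc):
    """Map (query, doc) to a grade in 1..5 plus the five class probabilities."""
    features = embed_text(query) + embed_text(doc)  # list concatenation
    logits = [sum(w * x for w, x in zip(row, features)) for row in W]
    probs = softmax(logits)
    grade = probs.index(max(probs)) + 1
    return grade, probs

grade, probs = relevance("iphone", "Apple is trying to make a new phone")
print(grade, [round(p, 3) for p in probs])
```

In a real system the encoder and the output layer would be trained on query-document pairs labeled with relevance grades, so that the predicted class reflects semantic relevance rather than random weights.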
In an earlier article we summarized several common RNN-based network structures for sentence matching problems. This article presents a neural network structure that treats search as a special case of the typical sentence matching problem. We call this RNN-based semantic search network the Neuralsearch structure. As far as my reading goes, I have seen CNNs applied to the search problem but have not seen an RNN structure used for it. Of course, work on neural networks for search is relatively scarce overall, which may have something to do with queries being very short and documents being very long. I do not know whether the method described here is effective; this article only shares the idea.
Before giving the RNN structure, we first present a very simple unsupervised semantic search model.
| A simple unsupervised semantic search model
In fact, there is a very simple way to do semantic search with neural networks. Its working mechanism is described below.
Figure 1 Abstract Semantic search structure
Figure