Shortcomings of search engines as I see them, and strategies for improving them

Source: Internet
Author: User

We use search engines (SE) every day, but they still have plenty of problems, so let's discuss a few of them.

 

1. Deduplication is poor.

For example, when searching for the topic [compile the android kernel], there are only two or three genuinely different results (looking only at the first 10 pages). Much of the retrieved content is repeated: the same article, merely reposted by different people on different websites. The search engine does nothing to condense this information, which causes us a lot of trouble.

 

The solution I expected: Comprehensive reading

 

During retrieval, the search engine naturally collects information based on the search terms; it could then aggregate that information into a new article. In the early stage, this article may not read like a complete article written by a person, but more like a tree-structured entry, as in an encyclopedia. In the later stage, as machine intelligence improves, it may come to read like an article written by a person. But how can such an article be produced? I think a clustering algorithm applied per keyword should be very effective: cluster the results, then combine the clusters into a new article according to cluster size and the rank weight of the articles in each cluster.
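The clustering step can be sketched roughly as follows. This is only a minimal illustration, not how any real search engine works: it assumes the results are plain strings in rank order, measures similarity with Jaccard overlap of word shingles, and uses a simple greedy grouping. The names `shingles`, `jaccard`, and `cluster_results` and the threshold of 0.5 are all made up for this example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a text, lowercased."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_results(results, threshold=0.5, k=3):
    """Greedily group near-duplicate results.

    `results` is a list of texts in rank order. Each result joins the
    first cluster whose representative (its first member) is similar
    enough; otherwise it starts a new cluster. Returns lists of rank
    indices, largest cluster first, ties broken by best rank.
    """
    clusters = []  # each entry: {"rep": shingle set, "members": [rank, ...]}
    for rank, text in enumerate(results):
        sh = shingles(text, k)
        for cluster in clusters:
            if jaccard(sh, cluster["rep"]) >= threshold:
                cluster["members"].append(rank)
                break
        else:
            clusters.append({"rep": sh, "members": [rank]})
    clusters.sort(key=lambda c: (-len(c["members"]), c["members"][0]))
    return [c["members"] for c in clusters]


results = [
    "how to compile the android kernel from source step by step",
    "How to compile the Android kernel from source, step by step!",
    "android kernel configuration options explained in detail",
]
print(cluster_results(results))
```

Here the first two results are the same text reposted with different punctuation and capitalization, so they fall into one cluster, while the third stays on its own. Turning each cluster into a section of a combined article is a much harder problem and is not attempted here.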

 

2. Word segmentation separates words that belong together

When a complete sentence is entered into a search engine, what comes back is really the result for the segmented words; results that contain the sentence itself are very rare. Yet when the search is made to match the entire sentence, many pages containing it can be retrieved. I think the cause is that the search engine does not look carefully at how adjacent words relate to each other.

 

The solution I expected: Dynamic programming

 

In fact, the problem we want to solve is very similar to matrix-chain multiplication: both work on an ordered sequence. Take the sentence "I am a number from Heilongjiang" as an example. Its word segmentation (stopwords included) is (I, am, from Heilongjiang, number). First we retrieve results for each word separately, then for every run of 2 adjacent words, then 3, and so on up to K. The larger K is, the higher the weight given to a match, so that results which keep the words together rank higher. Better still, we could analyze the query as natural language to estimate which words are more likely to belong together, and adjust the weights accordingly.
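The weighting idea can be sketched as below. Note that this is a brute-force enumeration of adjacent-word runs, not an actual dynamic programming solution, and it only illustrates how longer matching runs can earn higher weight. It assumes the query and the document are already segmented into token lists; the function names and the quadratic weight `k * k` are arbitrary choices made for this example.

```python
def contains_run(doc_tokens, run):
    """True if `run` occurs as a contiguous subsequence of `doc_tokens`."""
    n = len(run)
    return any(doc_tokens[i:i + n] == run
               for i in range(len(doc_tokens) - n + 1))


def phrase_score(query_terms, doc_tokens, weight=lambda k: k * k):
    """Score a document by the runs of adjacent query terms it contains.

    For every run length k (1 up to the full query) and every run of k
    adjacent query terms, add weight(k) if the run appears contiguously
    in the document. Longer runs get a higher weight.
    """
    score = 0
    for k in range(1, len(query_terms) + 1):
        for start in range(len(query_terms) - k + 1):
            run = query_terms[start:start + k]
            if contains_run(doc_tokens, run):
                score += weight(k)
    return score


query = ["compile", "android", "kernel"]
doc_a = ["how", "to", "compile", "android", "kernel"]   # keeps the phrase intact
doc_b = ["kernel", "compile", "for", "android"]          # same words, scattered

print(phrase_score(query, doc_a))  # 20
print(phrase_score(query, doc_b))  # 3
```

Both documents contain all three query words, but only the first keeps them together, so it scores much higher. The natural language refinement would amount to replacing the fixed `weight` function with one that depends on how likely the specific words are to form a unit.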

 

That's all for now; I'll add more later.
