A search engine is a system that uses specific computer programs to collect information from the Internet according to a defined strategy, organizes and processes that information, provides a retrieval service, and displays the relevant results to the user. When a user enters a keyword in the search box, what content should we return?
I. Search engine principles and user habits
1.1 A search engine is a database that everyone can query
Figure 1: A simple human-computer interaction flow for a search engine
In this flow:
1 The database being searched is the web page data crawled by the search engine.
The raw data fetched by the spider is processed before it is stored. Each search engine has its own processing rules; a well-known example is Google's PageRank (of course, only the name is well known, and the underlying details are closely guarded secrets).
2 A search engine is a highly simplified product.
All the user needs to do is enter the keyword to search for, confirm, and look at the results. Note that the user does not need to enter any search conditions. The search engine therefore has to not only find relevant results quickly in massive data, but also infer the user's expectations and pick out the right content. "Cumbersome" does not begin to describe the internal mechanism.
The difficulty is like quickly and accurately finding the answer to an unfamiliar question in a huge collection of books.
Figure 2: A recent photo of the National Library, taken with a filter.
1.2 The search engine data processing flow
A search engine is an extremely complex system, and its internal processing rules and technical principles cannot be explained in a few words. Instead, let's understand the flow from a product point of view, using the writing of a paper as an analogy. Before writing, the material is processed as follows:
1 Collect a large amount of raw material from the web, libraries, books and magazines, lectures, and so on
2 Remove duplicate content
3 Remove content that is not relevant to the topic
4 Analyze, arrange, and process the material by hand according to the topic, logical order, priority, and so on. This step is the most tedious and time-consuming, and the weapon it relies on is the most brilliant tool in history: the human brain!
5 Write up and output the result
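The five steps above can be sketched in a few lines of Python. This is only an illustration of the analogy, not how any real search engine works: the function name process_materials, the word-count scoring, and the sample snippets are all assumptions made for this example.

```python
def process_materials(raw_docs, topic_terms):
    """Mimic the paper-writing steps on a list of already collected text snippets (step 1)."""
    # Step 2: remove duplicates, ignoring case and extra whitespace.
    seen = set()
    unique = []
    for doc in raw_docs:
        key = " ".join(doc.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(doc)

    # Assumed relevance measure: how often the topic terms occur in the snippet.
    def score(doc):
        words = doc.lower().split()
        return sum(words.count(term) for term in topic_terms)

    # Step 3: remove content that never mentions the topic.
    relevant = [doc for doc in unique if score(doc) > 0]

    # Steps 4 and 5: order by the assumed score and return the result.
    return sorted(relevant, key=score, reverse=True)


raw = [
    "weak ties in community design",
    "Weak ties in  community design",      # duplicate apart from case and spacing
    "cooking pasta at home",               # unrelated to the topic
    "weak ties and strong ties compared",
]
print(process_materials(raw, ["weak", "ties"]))
```

The crude word count stands in for step 4, which is exactly the part that a human brain does far better than a fixed rule.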
I can't help but repeat: all products are really imitations of real human social activities. It is important for product managers to understand this.
The data processing flow of a search engine is basically similar (if you want the details, you can search for them yourself). The only difference, and the one search engines want to eliminate, is this:
In one case an emotional, logical brain does the analysis; in the other, a machine analyzes according to fixed rules.
So, for search results to be more accurate, the engine would have to analyze the input and produce results the way a human brain does.
I don't think that is realistic, but we can find ways to make it more accurate.
II. Ways to obtain information
Let's start with everyday behavior and then work out how the product should behave.
2.1 We usually obtain information from our surroundings in the following ways:
1, The channel and method are known
If you want to know today's dollar-to-renminbi exchange rate, or the prices and timetable of flights from Beijing to Qingdao, the channel is already known, so you only need to look the information up. The difference lies in the cost of each channel. The exchange rate can be checked online, by phone, or at a bank branch, and the first is obviously the most convenient. (That goes without saying.)
This kind of information follows clear rules and is conceptually well defined.
2, The core keywords are understood, but the information has to be sorted out
Take the paper-writing example again. Suppose the topic is community design for weak ties. We need to ask what a weak tie is, how it differs from a strong tie, and what existing design cases there are.
Obtaining this kind of information depends on human analysis.
2.2 Ways to ask questions
Here are two examples.
1, Before a child has developed complete language logic, the child asks questions with the simplest keywords, and the adult has to work out what the child needs from the babbling. Most adults guess accurately because they know the child's habits, behavior, manner, and characteristics.
2, Once we have complete language logic, we generally ask directly: What is the exchange rate today? How much is a flight from Beijing to Qingdao? The human brain handles these questions without difficulty. Of course, people are complex emotional animals, and many things cannot be fully understood from their literal meaning. A perhaps inappropriate example: on a date, a girl asks what you think of current prices. On the surface she is asking about prices; the underlying question is about your purchasing power.
2.3 How a search engine should handle this
If a search engine had a brain like ours, it would handle a query as follows:
1, Determine whether the query is a set of keywords or a question
2, Produce one of three kinds of result:
If the answer is known, output it directly;
If the way to get the answer is known, output that way;
Otherwise, provide results sorted to best match the user's expectations and let the user choose
3, These cases can occur in combination. The more accurately the search engine understands the keyword, the better the result.
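As a rough sketch of the decision flow just described, the Python below classifies a query and picks one of the three result kinds. Everything in it is hypothetical: the question-word list, the tiny KNOWN_ANSWERS and KNOWN_METHODS tables, and the ranked_search callback are placeholders chosen for illustration, and the sketch returns a single kind of result rather than the combinations mentioned in point 3.

```python
QUESTION_WORDS = {"what", "how", "when", "where", "who", "why", "which"}

# Hypothetical lookup tables; a real engine would back these with large data stores.
KNOWN_ANSWERS = {"how many minutes are in an hour": "60"}
KNOWN_METHODS = {
    "exchange rate": "check the rate published online by a bank",
    "flight": "check an airline or ticketing site for fares and timetables",
}


def normalize(query):
    """Lowercase, drop a trailing question mark, and collapse whitespace."""
    return " ".join(query.lower().strip().rstrip("?").split())


def is_question(query):
    """Step 1: treat the query as a question if it ends in '?' or opens with a question word."""
    words = normalize(query).split()
    return query.strip().endswith("?") or (bool(words) and words[0] in QUESTION_WORDS)


def handle(query, ranked_search):
    """Step 2: return (kind, payload), where kind is 'answer', 'method', or 'results'."""
    text = normalize(query)
    if is_question(query):
        if text in KNOWN_ANSWERS:                 # the answer is known
            return ("answer", KNOWN_ANSWERS[text])
        for topic, method in KNOWN_METHODS.items():
            if topic in text:                     # the way to get the answer is known
                return ("method", method)
    # Keywords, or a question nobody can answer directly: fall back to a ranked list.
    return ("results", ranked_search(text))


fake_search = lambda text: ["result for: " + text]  # stand-in for a real ranked search
print(handle("What is the exchange rate today?", fake_search))
print(handle("weak ties community", fake_search))
```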
III. Improved methods and strategies
To summarize by user action:
3.1 When the user enters keywords:
1 If the user's characteristics are known, sort the search results according to those characteristics
2 If the user's characteristics are unknown, treat it as an ordinary query. Provide structured search results with an indication of relevance: the higher the relevance, the higher the result ranks.
3.2 When the user asks a question:
1 If the semantics of the question can be analyzed and are simple, output the answer or the way to get it
2 If the exact semantics cannot be analyzed, offer several results and adjust them according to the user's feedback. That feedback also becomes part of the user's characteristics.
3.3 When search results overlap, the user's behavioral characteristics are still needed to sort them.
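The sorting rule in 3.1 might look something like the sketch below. It assumes each result already carries a relevance score and a topic label, and that a known user is described by a dictionary of interest weights between 0 and 1. The rerank function, the boost formula, and the sample data are all invented for illustration.

```python
def rerank(results, user_interests=None):
    """Order result titles for one query.

    results: list of (title, relevance, topic) tuples.
    user_interests: optional dict mapping topic -> weight in [0, 1].
    """
    if not user_interests:
        # Unknown user: ordinary query, sort by relevance alone.
        ordered = sorted(results, key=lambda r: r[1], reverse=True)
        return [title for title, _relevance, _topic in ordered]

    def personalized(result):
        _title, relevance, topic = result
        # Assumed formula: boost relevance by the user's interest in the topic.
        return relevance * (1.0 + user_interests.get(topic, 0.0))

    ordered = sorted(results, key=personalized, reverse=True)
    return [title for title, _relevance, _topic in ordered]


results = [
    ("Apple pie recipe", 0.9, "cooking"),
    ("Apple iPhone review", 0.8, "tech"),
    ("Apple orchard tours", 0.5, "travel"),
]
print(rerank(results))                 # unknown user
print(rerank(results, {"tech": 0.5}))  # user known to follow tech topics
```

With no profile the recipe ranks first; for the tech-leaning user the phone review moves to the top, which is the kind of overlap case that 3.3 refers to.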
Here are a few terms that interested readers can search for: Baidu Box Computing; Google Knowledge Graph; Facebook Graph Search; Siri semantic search; probabilistic Markov models.
To put it bluntly:
The better the search engine understands the user's search intent and the more complete its database, the more accurate the results it outputs.
For example, a good friend generally answers the same question better than a stranger, because the friend knows more about your motive for asking, the background, and even the answer you are hoping for.
The problem is that a computer is not a living creature; it only executes rules. What it can do is collect some of your behaviors and characteristics to infer your preferences:
1, Personal information: name, gender, place of origin, occupation, industry, hobbies, usage preferences.
2, Personal behavior: search history, browsing history, social activity, etc.
3, Processing methods: clustering, classification, data mining
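To make the idea of inferring preferences from behavior concrete, here is a deliberately simple Python sketch that turns a search history into topic weights. It uses plain keyword counting, which is far cruder than the clustering, classification, and data mining listed above, and the TOPIC_KEYWORDS table and sample history are made up for the example.

```python
from collections import Counter

# Hypothetical keyword-to-topic table; a real system would learn such groupings.
TOPIC_KEYWORDS = {
    "tech": {"phone", "laptop", "python"},
    "travel": {"flight", "hotel", "ticket"},
}


def infer_interests(search_history):
    """Return a dict topic -> share of past queries that touched that topic."""
    counts = Counter()
    for query in search_history:
        words = set(query.lower().split())
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:        # the query mentions at least one topic keyword
                counts[topic] += 1
    total = sum(counts.values())
    if total == 0:
        return {}                       # nothing known about this user yet
    return {topic: n / total for topic, n in counts.items()}


history = [
    "cheap flight Beijing Qingdao",
    "hotel near airport",
    "python tutorial",
    "flight timetable",
]
print(infer_interests(history))
```

The resulting weights are one possible form of the "user characteristics" that the earlier sections rely on when sorting results.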
In fact, this is a recommendation engine. To learn more, see the IBM Developer article "Explore the secrets of the recommendation engine".