Zheng @ playpoly SD 20081108
The following is only one person's opinion, offered for reference.
Introduction
Google, Baidu, Yahoo, and newly launched search engines are all testing more types of onebox. For example, if you search for "population of China" on Google, it shows "China-Population: 1,321,851,888 (July 2007 est.)". Here we call this onebox mode aggregation.
Aggregation in search engines has moved from simple aggregation of search results, to simple aggregation of information, to today's semantic aggregation, and it lets people see the dawn of integrated search.
Search and aggregation are two sides of the same thing
Search provides reference information. Aggregation provides exploration paths for users who have no particular goal, and organization and knowledge for users who do. Both need to guess the user's goal as accurately as possible.
It is often said that a search engine does not know what the searcher is trying to do, so it can only return the same results to everyone. Miguel Carrasco recently suggested that Microsoft Live Search should make good use of the identity and activity information people leave in communities such as Facebook to grasp, in advance, the purpose behind a user's keywords; that is, to integrate SNS and search. The same holds for Google's integrated search, which aims to provide the desired answer after understanding the user's needs as fully as possible.
Once the engine knows more about the user's goal, aggregation in search will play a greater role in conveying knowledge.
Search and aggregation merge into one
Aggregation in search can take the form of standalone products that are independent of one another, or of the distributed aggregation capability shown in the Google onebox. For example, search for the keyword "Lee Kai-Fu": on the first page of Google's results, a onebox of video search results appears first, followed by news search results.
The Internet search and aggregation technologies familiar to most netizens still rely on relatively simple pattern matching: they match the search keywords against the words on a web page and rank the results by factors such as how often the keywords appear, where they appear, or how many links point to a given result page.
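As a rough illustration of that kind of pattern matching, here is a minimal Python sketch. It is not any real engine's ranking formula: the Page type, the sample pages, and the weights are all assumptions invented for the example. It only shows how keyword frequency, keyword position, and inbound link count could be combined into one score.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str            # hypothetical address, for illustration only
    text: str
    inbound_links: int  # number of other pages linking to this one


def score(page: Page, keyword: str) -> float:
    """Score a page for one keyword; 0.0 means the keyword is absent."""
    words = page.text.lower().split()
    keyword = keyword.lower()
    positions = [i for i, w in enumerate(words) if w == keyword]
    if not positions:
        return 0.0
    frequency = len(positions) / len(words)      # how often the keyword appears
    earliness = 1.0 - positions[0] / len(words)  # earlier first hit scores higher
    # The weights below are arbitrary assumptions, not tuned values.
    return 10.0 * frequency + 1.0 * earliness + 0.1 * page.inbound_links


def rank(pages: list[Page], keyword: str) -> list[Page]:
    """Keep only pages containing the keyword, best score first."""
    matched = [p for p in pages if score(p, keyword) > 0]
    return sorted(matched, key=lambda p: score(p, keyword), reverse=True)


pages = [
    Page("a.example", "duck recipes and more duck recipes", 2),
    Page("b.example", "travel guide to beijing with one duck restaurant", 5),
    Page("c.example", "stock quotes", 50),
]
ranked_urls = [p.url for p in rank(pages, "duck")]
print(ranked_urls)
```

Note that the page with the most inbound links is dropped entirely because it never contains the keyword: matching of this kind looks at words, not at what the user actually wants.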
Therefore, what Baidu, Yahoo, or the Google onebox displays is still only search results, or a simple aggregation of data from different fields, such as showing real-time market quotes when you search for a stock name or code.
Vertical aggregation and Semantics
Next, to enrich what the onebox integrates, the search engine must go deep into each vertical field.
Different vertical fields have different characteristics. For example, a user may search for a restaurant name (say, "Quanjude roast duck store Beijing" on Google). To intelligently aggregate the information the user may need, rather than simply listing a map and contact numbers, the search engine must either work in the local-life search vertical itself or cooperate with others who do.
Like Google with its product search, Microsoft Live Search has already embedded its Product Live Search into its search results, although it lacks integration with an online payment service such as Google Checkout. If you search for N95 or G10, you will see that in addition to product images, price ranges, and price comparisons, the onebox also displays product comments and scores. For the N95 mobile phone, user comments are grouped under indicators such as general comments, features, ease of use, battery life, and sound quality. For the G10 camera, the indicators listed are image quality, lightness, and size. This level of detail shows that the vertical goes deep enough.
The semantic feature of Microsoft Product Live Search is that it automatically summarizes user comments by the indicators users care about and calculates the share of positive and negative sentiment. This is how it can report, for example, that only 19% of the comments on the Nokia N95's battery life are positive, and only 64% of the comments on its price affordability are positive.
This is the power of semantic aggregation. How is it achieved?
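The arithmetic behind such per-indicator percentages can be sketched in a few lines of Python. This is only an assumed illustration, not Microsoft's method: the comments are invented, and the hard part (deciding which indicator a comment is about and whether it is positive) is assumed to have been done already, so each comment arrives as an (indicator, is_positive) pair.

```python
from collections import defaultdict


def positive_share(labelled_comments):
    """Return the percentage of positive comments for each indicator.

    labelled_comments: iterable of (indicator, is_positive) pairs. Producing
    these labels from raw text is the real semantic work and is assumed here.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for indicator, is_positive in labelled_comments:
        totals[indicator] += 1
        if is_positive:
            positives[indicator] += 1
    return {k: round(100 * positives[k] / totals[k]) for k in totals}


# Invented sample data, for illustration only.
comments = [
    ("battery life", True),
    ("battery life", False),
    ("battery life", False),
    ("battery life", False),
    ("price", True),
    ("price", True),
    ("price", False),
]
shares = positive_share(comments)
print(shares)
```

With this sample, one of four battery-life comments is positive (25%) and two of three price comments are positive (67% after rounding).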
In a semantic aggregation engine, each query is executed within the context of some ontology, and hints drawn from the ontology can improve the accuracy of the search.
What is an ontology? Put simply, an ontology provides the basic terms and relationships that make up the vocabulary of a field, and defines the rules for extending that vocabulary using those terms and relationships. Its goal is to capture the knowledge of the field, provide a shared understanding of it, fix the commonly accepted vocabulary, and give clear definitions of the terms and of the relationships between them.
In semantic search, matching happens at the level of concepts: the concepts in documents are automatically extracted and indexed. With the system's help, users select appropriate words to express their information needs, and the engine then performs conceptual matching between the two, that is, it matches words whose meanings are the same, similar, or in an inclusion relationship.
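A toy Python sketch can make conceptual matching concrete. The ontology below is hypothetical and hand-written for the example: SYNONYMS maps different words to one concept ("same semantics"), and PARENT records is-a links ("inclusion"). Similarity matching is left out, since it would need some distance measure that the text does not specify.

```python
# Hypothetical toy ontology; the terms and links are assumptions.
SYNONYMS = {
    "cell phone": "mobile phone",
    "cellphone": "mobile phone",
    "handset": "mobile phone",
}
PARENT = {  # child concept -> broader concept (is-a)
    "smartphone": "mobile phone",
    "mobile phone": "electronics",
    "digital camera": "electronics",
}


def concept_of(term: str) -> str:
    """Map a surface word to its canonical concept."""
    term = term.lower()
    return SYNONYMS.get(term, term)


def ancestors(concept: str) -> list[str]:
    """All broader concepts of `concept`, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def matches(query_term: str, doc_term: str) -> bool:
    """True if the document term means the same as the query term,
    or names something narrower that the query concept includes."""
    q, d = concept_of(query_term), concept_of(doc_term)
    return q == d or q in ancestors(d)


print(matches("cell phone", "handset"))        # same concept
print(matches("mobile phone", "smartphone"))   # inclusion
print(matches("smartphone", "digital camera")) # unrelated
```

Inclusion is treated as one-directional here: a query for "mobile phone" matches a document about a "smartphone", but not the other way round. That is a design choice of the sketch, not something the text prescribes.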
The basic design idea of an ontology-based intelligent aggregation engine is as follows:
(1) With the help of field experts, establish the ontology of the relevant field;
(2) Collect data from the information sources and, with reference to the established ontology, store the collected data in the metadatabase (relational database, knowledge base, etc.) in the specified format;
(3) For each query request received from the user search interface, the query converter converts the request into the specified format according to the ontology and, with the help of the ontology, matches a qualifying data set from the metadatabase;
(4) The retrieval result is customized and returned to the user.
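The four steps can be sketched end to end in Python. Everything here is an assumption made for illustration: the one-concept ontology, the dictionary standing in for the metadatabase, the function names, and the sample records with made-up names and codes. A real system would use an expert-built ontology and a proper database.

```python
# Step 1: a hand-written ontology standing in for the experts' work.
ONTOLOGY = {
    "stock": {"synonyms": {"share", "equity"}, "fields": ["name", "code", "price"]},
}


def normalize(term):
    """Return the ontology concept a word refers to, or None."""
    term = term.lower()
    for concept, spec in ONTOLOGY.items():
        if term == concept or term in spec["synonyms"]:
            return concept
    return None


# Step 2: store collected data in the format the ontology specifies.
def store(metadatabase, concept, raw):
    fields = ONTOLOGY[concept]["fields"]
    record = {f: raw.get(f) for f in fields}  # drop fields the ontology does not define
    metadatabase.setdefault(concept, []).append(record)


# Step 3: convert the query via the ontology, then match records.
def search(metadatabase, query):
    tokens = query.lower().split()
    concept = next((c for c in map(normalize, tokens) if c), None)
    if concept is None:
        return []
    rest = [t for t in tokens if normalize(t) is None]  # remaining filter words
    return [
        r for r in metadatabase.get(concept, [])
        if all(any(t == str(v).lower() for v in r.values()) for t in rest)
    ]


# Step 4: customize the result for display.
def render(records):
    return [f"{r['name']} ({r['code']}): {r['price']}" for r in records]


db = {}
store(db, "stock", {"name": "Acme", "code": "ACME1", "price": 12.5, "junk": "x"})
store(db, "stock", {"name": "Globex", "code": "GLBX2", "price": 8.0})
result = render(search(db, "share acme"))
print(result)
```

The query "share acme" never contains the word "stock", yet it finds the record, because the ontology maps "share" to the stock concept before any matching happens.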
This model can be replicated across different vertical fields, and once the work in each field is done, the results can easily be embedded into the search results. Of course, it is still necessary to learn as much as possible about the user's goal. Playju.com has an intelligent semantic aggregation application framework and has made some attempts in the stock field, such as summarizing stock reviews and expert commentary. In effect this computes bullish and bearish tendencies and buy and sell suggestions. Seen in this light, Microsoft's acquisition of the semantic specialist Powerset has supplied plenty of ammunition for its own search.
With onebox, vertical aggregation, and semantics combined, we can see the dawn of integrated search. The next step is to see how to integrate the identity information held in communities such as SNS, in order to better understand users' constantly changing search needs.
Webmaster Z weekly draft URL: http://www.chinaz.com/z/