Microsoft's object-level vertical search technology: get what you search for (Source: Internet Weekly)

Source: Internet Weekly

General-purpose search engines are becoming increasingly unsatisfactory in some respects. How can search results be made clearer? Researchers Yan Zaiqing and Wen Jirong of Microsoft Research Asia introduced Internet Weekly to an object-level vertical search technology they have developed.

Reporter Li Yang

Mr. Li wants to buy a smartphone and would like to see the description, price, and reviews of several models. When he looks for this information with a general-purpose search engine, however, he gets a jumble of results. The headache is that he has to follow the links one by one, register a string of forum accounts, and piece together everything he reads before he has a complete picture of each phone.

Generic search engines are indeed falling short in cases like this. The object-level vertical search technology developed by Yan Zaiqing and Wen Jirong at Microsoft Research Asia is meant to make search results clearer.

Clear results

The name sounds a little academic, but the idea is not hard to understand. When you use such a search engine, the results it lists are a collection of final objects rather than a messy list of pages. Everything revolves around the object you are searching for.

For example, when you search for "dashboard", the system does not list page titles and snippets that contain the keyword. Instead it displays mobile phones one by one. In addition to visual information such as model and image, each product entry lists a description, price, user rating, and other details, much like a product page on a shopping website, except that the content is far richer than any single website offers because it is drawn from the entire Internet. The object content is not organized by hand; it is a "virtual" page assembled by a computer through automatic crawling and automatic classification.

This technology has already seen initial application. As an experiment, enter the keyword "Data Mining" in Microsoft Research Asia's academic search (libra.msra.cn/) and you get a ranked list of related papers. Results are organized by paper, and each entry shows its citation count and authors. Clicking a paper brings up an introduction to it, links to the original for browsing and downloading, and related references.

To the left of the paper results is a ranking of related authors, conferences, and journals. If you search by author, the system automatically lists the most authoritative scientists in the data mining field; you can also search for related conferences, journals, and academic communities. At present, Microsoft's academic search is limited to the computer science field.

Compared with text-based results, object-based results are plainly clearer, and they make for more vertical, more specialized search. The technology is currently being applied to the beta version of the Windows Live product search engine (http://products.live.com). There, with products as the objects, users can sort results by relevance or price, or search within a website based on popular topics.

According to researcher Yan Zaiqing, after the first month of trial operation the system had automatically discovered 100,000 e-commerce websites containing tens of millions of webpages, and had extracted hundreds of millions of product objects from those pages. No single merchant platform could reach this scale on its own, and in the future it may become the world's most comprehensive product catalog. An object-oriented search engine of this kind is undoubtedly a basic platform that spans many shopping websites.

Core Technologies

So how is this technology implemented? Careful readers may already have picked up clues from the description above: it is a new architecture that departs from the thinking behind traditional search engines.

First, it relies on web crawler technology to fetch all relevant webpages in a specific domain (such as camera products). Once the pages are fetched, the system classifies the type of object information each page contains. That is, it determines whether a webpage is a paper, a blog page, or a product information page.
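The page-type decision described above can be sketched as follows. This is only an illustration: the article does not say which classifier the researchers use, and the cue words and the function name `classify_page` are assumptions made up for this example, standing in for a trained model.

```python
import re
from collections import Counter

# Hypothetical cue words per page type; a real system would learn these
# from labelled training pages rather than hard-code them.
CUES = {
    "product": {"price", "buy", "cart", "shipping", "warranty"},
    "paper": {"abstract", "references", "citation", "proceedings", "doi"},
    "blog": {"posted", "comments", "archive", "permalink", "trackback"},
}


def classify_page(text):
    """Return the page type whose cue words occur most often, or "other"."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {label: sum(words[w] for w in cues) for label, cues in CUES.items()}
    best = max(scores, key=scores.get)
    # A page that matches no cue word at all is left unclassified.
    return best if scores[best] > 0 else "other"
```

A crawler would call `classify_page` on the visible text of each fetched page and route the page to the matching extractor.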

Once this is done, the system can integrate the content into the object information warehouse by category. This requires a great deal of training and model construction beforehand. For example, when training on product pages, you need to tell the system which parts are the product name, the product picture, and the price, so that it can later find the key content automatically.
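To make the training step concrete, here is a deliberately small sketch of learning from labelled examples which kind of line carries which field. The three features, the field names, and the sample data are all assumptions for illustration; the researchers' actual models are not described in the article and are certainly more sophisticated.

```python
import re
from collections import defaultdict


def features(line):
    """Describe a line of page text with a few crude, hand-picked features."""
    feats = set()
    if re.search(r"[$¥€]\s*\d|\d+\.\d{2}", line):
        feats.add("has_money")
    if re.search(r"\.(jpg|jpeg|png|gif)\b", line.lower()):
        feats.add("has_image_ref")
    if not feats:
        feats.add("plain_text")
    return feats


def train(examples):
    """Count how often each feature co-occurs with each labelled field."""
    counts = defaultdict(lambda: defaultdict(int))
    for line, field in examples:
        for feat in features(line):
            counts[feat][field] += 1
    return counts


def extract(model, lines):
    """Label each line with its most likely field; keep the first hit per field."""
    record = {}
    for line in lines:
        votes = defaultdict(int)
        for feat in features(line):
            for field, n in model.get(feat, {}).items():
                votes[field] += n
        if votes:
            record.setdefault(max(votes, key=votes.get), line)
    return record


# Hand-labelled training lines (invented sample data).
TRAINING = [
    ("Canon PowerShot A520", "name"), ("$199.00", "price"),
    ("images/a520.jpg", "image"), ("Nikon Coolpix 4600", "name"),
    ("¥1,299", "price"), ("pics/coolpix.png", "image"),
]
model = train(TRAINING)
record = extract(model, ["Olympus C-765", "photos/c765.gif", "Price: $249.99"])
```

After training on six labelled lines, `extract` fills a name, image, and price slot for a page it has not seen before, which is the behavior the paragraph above describes.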

Some researchers in this field have tried to work from the HTML code, with little success: the markup differs greatly from site to site even when the rendered pages look almost the same. Having observed this, researchers at Microsoft Research Asia integrated visual analysis technology, designing algorithms that let a computer look at a page the way a person does, find its "interest center", and make intelligent judgments.
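The idea of judging a page by how it looks rather than how it is coded can be sketched as below. This is not the researchers' algorithm, which the article does not detail. It assumes a renderer has already produced rectangular blocks with positions and font sizes, and the scoring formula is an invented heuristic.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A rendered page region; the values would come from a layout engine."""
    text: str
    x: float          # left edge in pixels
    y: float          # top edge in pixels
    width: float
    height: float
    font_size: float


def interest_score(block, page_width, page_height):
    """Score a block higher when it is large, uses big type, and is central."""
    cx = block.x + block.width / 2
    cy = block.y + block.height / 2
    # Normalised distance from the page centre: 0 at the centre, 1 at an edge.
    dx = abs(cx - page_width / 2) / (page_width / 2)
    dy = abs(cy - page_height / 2) / (page_height / 2)
    centrality = max(0.0, 1.0 - (dx + dy) / 2)
    return block.width * block.height * block.font_size * centrality


def interest_center(blocks, page_width, page_height):
    """Pick the block a human reader would most likely look at first."""
    return max(blocks, key=lambda b: interest_score(b, page_width, page_height))


# Invented layout of a typical product page, 1000 x 800 pixels.
page = [
    Block("site navigation", 0, 0, 1000, 50, 12),
    Block("product details", 200, 150, 600, 400, 16),
    Block("advertisements", 820, 150, 180, 400, 11),
]
main = interest_center(page, 1000, 800)
```

Because the score uses only geometry and type size, two sites that render the same layout from very different HTML get the same answer.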

After crawling, classification, and extraction, the vertical search engine can use the structured object information to answer users' queries and to perform various kinds of intelligent analysis and mining.
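The last stage, answering queries from structured objects instead of pages, might look like the sketch below. The record layout, the name-normalization rule, and the example URLs are assumptions for illustration; matching records that describe the same real-world object is a much harder problem in practice.

```python
from collections import defaultdict


def normalize(name):
    """Collapse case and whitespace so the same product name matches itself."""
    return " ".join(name.lower().split())


def build_warehouse(records):
    """Merge per-page records into one entry per object."""
    warehouse = defaultdict(lambda: {"prices": [], "ratings": [], "sources": []})
    for rec in records:
        obj = warehouse[normalize(rec["name"])]
        if "price" in rec:
            obj["prices"].append(rec["price"])
        if "rating" in rec:
            obj["ratings"].append(rec["rating"])
        obj["sources"].append(rec["url"])
    return warehouse


def search(warehouse, query):
    """Return matching objects, cheapest first, each summarising all its pages."""
    terms = normalize(query).split()
    hits = []
    for name, obj in warehouse.items():
        if all(term in name for term in terms):
            ratings = obj["ratings"]
            hits.append({
                "name": name,
                "lowest_price": min(obj["prices"]) if obj["prices"] else None,
                "avg_rating": sum(ratings) / len(ratings) if ratings else None,
                "source_count": len(obj["sources"]),
            })
    # Objects without any known price sort last.
    return sorted(hits, key=lambda h: (h["lowest_price"] is None, h["lowest_price"] or 0))


# Invented records, as if extracted from three different sites.
RECORDS = [
    {"name": "Acme Phone X1", "price": 299.0, "url": "https://shop-a.example/x1"},
    {"name": "acme  phone x1", "price": 279.0, "rating": 4.0, "url": "https://shop-b.example/p/17"},
    {"name": "Acme Phone X1", "rating": 5.0, "url": "https://forum.example/t/9"},
    {"name": "Acme Phone Z9", "price": 499.0, "url": "https://shop-a.example/z9"},
]
results = search(build_warehouse(RECORDS), "acme phone")
```

Each hit is one product with its price and rating already combined across sites, so the user does not have to visit the shops and forums one by one.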

Overturning the existing architecture

Such a technology is revolutionary. It enables in-depth search in all kinds of vertical domains. But because it overturns the existing architecture, it will undoubtedly face more tests than traditional search engines.

For example, the structured information must be of high quality, comprehensive, and accurate. In addition, because the search engine has to store object information gathered from across the Internet and across regions in offline databases, storage and computation must scale as well. The technology needs a "super database" that can store hundreds of thousands of data records while its algorithms still search them quickly enough.

While bringing convenience to users, the new search engine also brings potential changes in business models that follow from the change in technical rules. For example, the traditional PageRank method no longer applies in an object-based search engine, so researchers at Microsoft Research Asia proposed PopRank instead.
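The article does not explain how PopRank works. As a rough illustration only, the sketch below assumes the general idea of object-level ranking: each object starts from the popularity of the webpages that mention it, and popularity then flows along typed links between objects (a paper cites a paper, a paper is written by an author), with a different weight per link type. The function name, the weights, the normalization, and the sample graph are all assumptions, not the published algorithm.

```python
def pop_rank_sketch(web_popularity, links, factors, epsilon=0.15, iterations=50):
    """Simplified popularity propagation over typed object links.

    web_popularity: object -> popularity of the pages mentioning it
    links:          list of (source, target, link_type)
    factors:        link_type -> propagation weight (assumed values)
    epsilon:        share of the score taken directly from web popularity
    """
    objects = list(web_popularity)
    total = sum(web_popularity.values())
    base = {o: web_popularity[o] / total for o in objects}

    # Total outgoing weight per object, used to split what it passes on.
    out_weight = {o: 0.0 for o in objects}
    for src, _dst, kind in links:
        out_weight[src] += factors[kind]

    rank = dict(base)
    for _ in range(iterations):
        new_rank = {o: epsilon * base[o] for o in objects}
        for src, dst, kind in links:
            share = factors[kind] / out_weight[src]
            new_rank[dst] += (1 - epsilon) * rank[src] * share
        # Objects with no outgoing links return their score to the base.
        dangling = sum(rank[o] for o in objects if out_weight[o] == 0)
        for o in objects:
            new_rank[o] += (1 - epsilon) * dangling * base[o]
        rank = new_rank
    return rank


# Invented academic graph: two papers cite paper_a; two papers share an author.
popularity = {"paper_a": 1.0, "paper_b": 1.0, "paper_c": 1.0, "author_x": 1.0}
graph = [
    ("paper_b", "paper_a", "cites"),
    ("paper_c", "paper_a", "cites"),
    ("paper_a", "author_x", "written_by"),
    ("paper_b", "author_x", "written_by"),
]
scores = pop_rank_sketch(popularity, graph, {"cites": 0.7, "written_by": 0.3})
```

The contrast with PageRank is that the nodes are objects rather than pages, and a citation can be made to count for more or less than an authorship link by changing the weights in `factors`.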

This search technology has wide application. Beyond product search and academic search, it can be applied to vertical search fields such as yellow pages, blogs, people, job positions, restaurants, and air tickets. How it combines with e-commerce, and what new forms of advertising it gives rise to, will be a new topic.
