Trend-following vertical search: a pseudo-concept with low technical content? (Comment)

Source: Internet
Author: User

// P comment:

To put it bluntly, take Baidu, Google, and Yahoo together and assume the number of webpages they collect is shrunk by a factor of 100 million. In other words, assume each of them searches only 10 webpages. The conclusion follows: these so-called search engines merely provide an inverted index over webpages. In layman's terms, it is like the alphabetical index at the front of a dictionary. There is no mystery in that at all. What is mysterious is the data volume, the storage volume, the cost, the technical difficulty, and the absence of authentication. There is also the question of whether the business model succeeds. It is not impossible for Cool News to outgrow Baidu.
This comes down to how information is partitioned. Horizontal and vertical are simply two ways of organizing things. There is no need to think that only what Google does counts as search and that searching 1,000 websites does not.
The author of the original article lacks the necessary understanding of the industry, technology, and architecture.
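
The inverted index mentioned in the comment above can be illustrated on a toy scale. The following is a minimal sketch, not how any real search engine is built: the three sample pages, the whitespace tokenizer, and the function names build_inverted_index and search are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each lowercase word to the sorted list of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Naive tokenizer (assumption): split on whitespace only.
        for word in text.lower().split():
            index[word].add(page_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return ids of pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)


# Hypothetical three-page "web", standing in for the 10 pages in the comment.
pages = {
    1: "train ticket prices from Beijing",
    2: "cheap air ticket deals",
    3: "Beijing weather forecast",
}
index = build_inverted_index(pages)
```

With this toy index, search(index, "ticket") looks up one word and returns the pages listed under it, much like turning to a dictionary's index. Everything that makes a production engine hard (crawling, ranking, distribution across machines) is deliberately left out.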

=================== Original document ==================================

Google China, imitating Cool News, has launched a life search service, drawing media and user attention to so-called "vertical search". Some observers believe that, now that Baidu, Google, and Yahoo have divided up nearly all of the Internet search market, latecomers to the search market may still be able to claim a share through vertical search.

However, because the definition is fuzzy, "vertical search" means different things to different people. Many search practitioners can even tell venture capitalists with a straight face: "We do vertical search, which is different from Baidu, and the prospects are very bright."

In my opinion, apart from the vertical division by information media type that the limits of current search technology force on us, other vertical searches either run contrary to the purpose of search or are simply not search at all.

What is search? Search means finding the content you need in a massive amount of Internet information. It has two features: the information is massive, and the information is unstructured. That is to say, the information is stored in discrete forms such as webpages, audio, and images, not in the XML files that many IT companies dream of.

Because image and video recognition technologies are far from mature, and audio transcription is not yet widespread (although the technology already exists), today's search engines have to be divided into web search, video search, music search, image search, and other types. With openv and similar technologies, the audio in a video can be converted into searchable text, so video search becomes just one kind of web search. For example, if you want to know the precautions for giving a baby a certain medication, and an expert has answered the question in detail in a CCTV-2 program, isn't that video exactly what you want to see?

Classification by media type is unavoidable for now, while further classifying search by webpage content is superfluous. The point of search is to find information across billions of webpages. If 5 billion webpages are split into five or ten categories such as forums, encyclopedias, and blogs, and five or ten search products are launched so that users have to click five or ten times, that runs completely counter to the purpose of search and wastes users' time.

From this point of view, so-called vertical search by content is a highly misleading concept that reduces search efficiency and does endless harm.

Today, another kind of so-called "search" is also labeled "vertical search", the most typical examples being train ticket and air ticket lookup. In my opinion, these services are simply not search. The test is whether the queried information is massive and whether it is structured.

The number of trains and flights (including discount prices) is limited, and schedules change only once or twice a year. The amount of information is far too small to call for search. Among Chinese websites, countless sites offer train ticket and air ticket lookup. What sets them apart is not the level of retrieval technology but whether the database is updated promptly.

From a technical point of view, once an authoritative database has been obtained, it takes only a few working days to develop a web query function based on SQL statements, which is a very different thing from the search built by Brin, Page, and Li Yanhong. Of course, an SQL developer who thinks this would earn them a job at Baidu is also under an illusion.

The essential reason the development is so simple is that ticket information is structured, so retrieval can rely on existing technologies without the need to develop extremely complex HTML text search technology.
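
The kind of SQL-based query described above can be sketched in a few lines. This is only an illustration under stated assumptions: the trains table, its columns, the sample rows, and the helper names make_timetable and find_trains are hypothetical, and an in-memory SQLite database stands in for the "authoritative database".

```python
import sqlite3


def make_timetable():
    """Create an in-memory database holding a hypothetical train timetable."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trains ("
        "train_no TEXT, origin TEXT, destination TEXT, departs TEXT, price REAL)"
    )
    # Sample rows are invented for illustration only.
    conn.executemany(
        "INSERT INTO trains VALUES (?, ?, ?, ?, ?)",
        [
            ("T1", "Beijing", "Shanghai", "08:00", 320.0),
            ("T2", "Beijing", "Shanghai", "14:30", 300.0),
            ("T3", "Beijing", "Guangzhou", "09:15", 450.0),
        ],
    )
    return conn


def find_trains(conn, origin, destination):
    """Return (train_no, departs, price) rows for a route, earliest first."""
    rows = conn.execute(
        "SELECT train_no, departs, price FROM trains "
        "WHERE origin = ? AND destination = ? ORDER BY departs",
        (origin, destination),
    )
    return rows.fetchall()


conn = make_timetable()
shanghai_trains = find_trains(conn, "Beijing", "Shanghai")
```

Because every row has the same fields, a single parameterized SELECT answers the query. No crawling, text parsing, or ranking is involved, which is the contrast with web search that the article draws.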

There are many other specialized queries like train ticket lookup, such as medical record lookup, the fugitive searches used by the police, and product searches on online retail websites. None of these is search in the sense of a modern search engine; calling them "vertical search" is probably just a way of following the trend.

To sum up, with the current generation of search technology, search is for now divided by information media type. That division is not what people call "vertical". Many services that are called "vertical search" are actually "searches" with no technical content.

It should be emphasized that the goal of search is to use complex algorithms and distributed computing technology to find what users want most in massive unstructured information. If a latecomer to search lacks confidence in algorithms and computing and merely treats the classification of massive amounts of information as its business, the progress of Google in the US and Baidu in China will sooner or later make this low-tech work useless.
