If any invention can be said to have saved the Internet, it must be the search engine. Without it, the more information the Internet accumulated, the sooner it would collapse, because the information people need would become harder and harder to find and the experience of using it worse and worse. What did search look like in its earliest form? How many times has the search experience changed? What will search engines look like in the future? Let us review the history of search engine development and trace its thread.
In fact, the need to search, that is, to find what you want among a great many things (mainly information), has always been with us. Before information technology matured and information was digitized, the only workable forms of search were paper catalogs, indexes, and phone books. After wide area networks appeared, the demand for search was there, but the technology did not develop quickly enough to match it, so the earliest form of Internet search was the website directory book. Much like a phone book or the Yellow Pages, it was a printed book listing many well-known websites, and its size depended on how specialized it was. I once bought one aimed at ordinary Internet users; it was about as thick as a Xinhua Dictionary and listed sites by content category.
Once the paper version existed, a network version soon followed. In 1994 Jerry Yang created Yahoo and began collecting websites by hand, classifying and ranking them according to set rules. Users only needed to remember Yahoo's address and could reach all kinds of sites through its categories, so the paper directory book immediately became redundant. Some people in the Internet industry call this kind of directory search, in which Yahoo presented manually collected and classified websites, the first generation of search engines. Other experts hold that Yahoo's approach cannot strictly be called a search engine and should instead be counted as the earliest website navigation. The author is inclined to count it, website navigation included, as one form of search.
But Yahoo had, after all, only moved the paper directory onto web pages. Scanning the listings by eye, combined with different people's different understanding of how sites should be classified, made such searches inefficient. So automatic lookup by keyword was added to search engines as well. This was not hard to do, because keyword-based full-text search dates back to the 1950s, when computers had only just been invented. (Chinese full-text search technology began as part of Project 748, was largely complete by the late 1980s, and came into wide use in the 1990s.)
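As a rough illustration of keyword lookup, the sketch below builds an inverted index (a map from each word to the pages that contain it) and answers a query by intersecting those sets. It is a minimal sketch, not a description of how Yahoo or any other engine worked; the page names, the sample text, and the function names build_index and search are all made up for the example.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def search(index, query):
    """Return the pages that contain every word of the query, sorted by name."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

# Hypothetical sample pages, for illustration only.
pages = {
    "sports": "football scores and football news",
    "finance": "stock news and market data",
    "weather": "local weather forecast",
}
index = build_index(pages)
print(search(index, "football news"))
```

The index is built once, so each query only touches the sets for its own words rather than rereading every page.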
The remaining problem with first-generation search engines was that websites were still collected by hand, which was inefficient, error-prone, and incomplete. The Internet badly needed a technology to replace manual collection, and when it comes to replacing human labor people naturally think of robots. The second generation of search engines therefore relies on robots released onto the Internet, now known as search crawlers or search engine spiders. In fact, this technology appeared before Jerry Yang's Yahoo, and even before the birth of the World Wide Web.
In 1990 Alan Emtage, a student at McGill University in Montreal, invented Archie. The World Wide Web had not yet appeared, but file transfer over the network was already frequent, and because huge numbers of files were scattered across separate FTP hosts, finding one was very inconvenient. Emtage therefore set out to build a system that could look files up by name, and the result was Archie. Archie worked much like today's search engines: it relied on scripts to automatically gather information about files on the network and then indexed that information so that users could query it with an expression. Inspired by Archie's popularity, System Computing Services at the University of Nevada developed another, very similar search tool in 1993, and search tools have since become able to retrieve web pages in addition to indexing files.
Today's mainstream search engines, such as Google, Bing, and Baidu, use search crawlers to fetch and download web pages in place of manual collection. At regular intervals (every 28 days in Google's case, for example) these crawlers sweep the whole Internet and download the resulting pages to the engine's own servers, where they wait for people to submit keyword queries.
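The crawl-and-download loop can be sketched as a breadth-first walk over links. To keep the example self-contained it runs against a small in-memory "web" instead of the real network; the WEB dictionary, the example URLs, and the crawl function are assumptions made for illustration, and a real crawler would also need politeness rules, error handling, and re-crawl scheduling.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "a.example": ("welcome page", ["b.example", "c.example"]),
    "b.example": ("news page", ["a.example"]),
    "c.example": ("shop page", ["d.example"]),
    "d.example": ("contact page", []),
    "e.example": ("unlinked page", []),
}

def crawl(seed, fetch):
    """Breadth-first crawl from seed; returns {url: text} for every page reached."""
    store = {}
    queue = deque([seed])
    seen = {seed}
    while queue:
        url = queue.popleft()
        text, links = fetch(url)
        store[url] = text  # "download" the page into the local store
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return store

store = crawl("a.example", lambda url: WEB[url])
print(sorted(store))
```

Note that e.example is never downloaded: a crawler only discovers pages that some already-known page links to.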
Robots crawl web pages far more efficiently than people can, and combined with keyword retrieval, this newer kind of search engine should by rights have arrived before directory search and website navigation. The problem is that there is too much information on the Internet. People can hardly re-classify the pages the crawlers bring back, so with keyword search alone users still have to pick out what they want by eye from a jumble of results, and the experience is no better than using a directory directly.
The company that solved this problem is the most powerful in search today and one of the greatest companies in the world: Google. In the late 1990s, after Yahoo's success had shown how large the demand for search was, Larry Page and Sergey Brin, then studying at Stanford University, developed the PageRank algorithm, which measures how important a given web page is relative to the other pages in the search engine's index. The algorithm can be understood roughly as voting. Its core is to count the links between each page and other pages: the more pages link to a search result, and the higher the weight of those linking pages, the more important that result is. Google used this method to solve the problem of ordering search results, replacing directory classification, and the combination of search crawler and PageRank displaced the search solution Yahoo had pioneered. Some people in the industry call the generation of search engines represented by Google the second generation, while others regard it as the first search engine in the true sense. The author leans toward the former view.
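The voting idea can be sketched as a simple iteration in which every page repeatedly splits its current score among the pages it links to. This is a simplified textbook form, not Google's production algorithm; the damping value of 0.85, the iteration count, the handling of pages without outgoing links, and the four-page link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively score pages: each page splits its score among its link targets."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: three pages link to "home", "home" links to "about".
links = {
    "home": ["about"],
    "about": ["home"],
    "blog": ["home"],
    "shop": ["home"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph "home" collects the most votes, and "about" outranks "blog" and "shop" even though it has only one inbound link, because that single link comes from a heavily weighted page.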
The history of search engines in China essentially begins with the second generation. Around 1999, established search vendors such as Baidu and Zhongsou used the combination of search crawler and ranking algorithm from the very start. (There was also 3721, which provided website navigation services, but it appeared at almost the same time as Baidu and Zhongsou.) Unlike Google and Yahoo, Baidu and Zhongsou at that time mainly provided back-end search technology to portals and had no consumer-facing sites of their own. Only after Google and Yahoo entered China at the beginning of this century did Baidu and Zhongsou, followed later by Soso, Sogou, and then 360, launch their own search engine sites.
The history might seem to end here, but the most recent point mentioned so far is already ten years in the past, and search engines have not stood still over the last decade.
The crawler approach described above only solves general web search. Even now, every search spider in the world needs a long time (more than 20 days) to complete a crawl of the whole web. For pages that are updated relatively slowly, that pace is reasonable, but for news on the fast-moving Internet it is far too cumbersome. Some industry insiders believed the problem would resolve itself as search technology and network speeds improved, but in fact web search has never been able to take on the job of news search, and people now find the news they want through dedicated news search technology.
The first company in China to offer news search technology to web portals was Zhongsou, in 2003. It restricted its crawler, which had previously roamed the whole web, to a few hundred selected news sources, drastically shrinking the seemingly limitless Internet and cutting the time for a full pass from days to minutes or even tens of seconds. When the news sources themselves change, a source simply needs to be added to or removed from the selected list. The technique resembles the once-popular RSS reader, but RSS reading has been in decline because it requires sources to publish in the RSS format, and Google's RSS product, Google Reader, was officially shut down in the summer of 2013. The ranking rules for news search also differ slightly, giving more weight to factors such as time, relevance, and the publishing outlet.
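A news ranking of this kind might be sketched as follows: articles from outside a selected source list are ignored, and the rest are scored by combining relevance, freshness, and a per-source weight. Everything specific here is an assumption made for illustration, including the source names, the weights, the six-hour half-life, and the multiplication used to combine the three factors; the article does not say how any real news search computes its ranking.

```python
from datetime import datetime, timedelta

# Hypothetical whitelist of news sources with an editorial weight for each.
SOURCE_WEIGHT = {"news.example.com": 1.0, "daily.example.org": 0.6}

def news_score(article, query, now, half_life_hours=6.0):
    """Combine relevance, freshness and source weight into one score."""
    if article["source"] not in SOURCE_WEIGHT:
        return 0.0  # outside the selected sources, so never ranked
    words = article["title"].lower().split()
    query_words = query.lower().split()
    relevance = sum(words.count(w) for w in query_words) / max(len(words), 1)
    age_hours = (now - article["published"]).total_seconds() / 3600.0
    freshness = 0.5 ** (age_hours / half_life_hours)  # halves every half-life
    return relevance * freshness * SOURCE_WEIGHT[article["source"]]

def rank_news(articles, query, now):
    """Return articles with a positive score, best score first."""
    scored = [(news_score(a, query, now), a) for a in articles]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for score, a in scored if score > 0]

now = datetime(2013, 12, 1, 12, 0)
articles = [
    {"title": "Search engine launches", "source": "news.example.com",
     "published": now - timedelta(hours=24)},
    {"title": "Search engine launches", "source": "news.example.com",
     "published": now - timedelta(hours=1)},
    {"title": "Search engine launches", "source": "blog.example.net",
     "published": now - timedelta(minutes=5)},
]
results = rank_news(articles, "search engine", now)
for a in results:
    print(a["source"], a["published"])
```

Changing the set of news sources only means editing SOURCE_WEIGHT, which mirrors the point above that a source can simply be added to or removed from the selected list.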
Like news search, there are dedicated search techniques for other special categories of information, including image search, video search, and price-comparison search. In addition, because the volume of information on the Internet is so large, general-purpose search struggles to be professional, accurate, and timely across everything, so vertical search for specific industries or fields has also emerged. The principle is mostly the same as for news search: narrow the range the crawler covers, then adjust the ranking rules accordingly.
Zhongsou's contribution to search technology, in China and beyond, is that it was the first to attempt a more advanced form of search: the personal portal. In 2004 it released a personal information portal browser whose English abbreviation was PIG, so it was also known as the Network Pig.
The personal portal counts as a more advanced form of search because earlier search engines were passive: they waited for people to type in keywords and submit a query. The personal portal turns that passive waiting into actively providing a service. If search always waits for the user to enter a keyword, it can never shed the role of a mere tool, and it differs from a directory or a phone book only in form and efficiency. Moreover, a service that reaches out to users attracts more attention and use, and therefore more advertising revenue. So active versus passive is not merely a question of service form.
A portal site, as the name suggests, is a "supermarket" that gives users the largest possible amount of information and meets most of their online needs. Put "personal" in front, and precision is added to the main demand for comprehensiveness. On the whole Internet, it seems that only keyword-based search can deliver information services that are both comprehensive and precise. Zhongsou's approach was to let users subscribe to their own search keywords and freely combine them into a home page, so that the latest results for every subscribed search were shown to the user as soon as the browser was opened.
Google later launched its own personal home page product, iGoogle, with richer features (weather, stocks, and so on). But personal portal products were never as successful as traditional search engines, at least on the desktop Internet. Neither the Network Pig nor iGoogle delivered the results the search vendors had hoped for, and iGoogle, like Google Reader, was shut down toward the end of 2013. Other attempts to push search services to users include Yahoo's, which also let users subscribe to search keywords and then sent updates to their mailboxes every day.
Any account of search innovation in China must also mention Baidu's paid-ranking (bidding) mechanism. Companies eager to promote themselves pay the search engine vendor according to the number of clicks their listings receive; their promotional information appears in the search results, and the price paid per click determines the ranking (the higher the bid, the higher the position). Although the industry has criticized it, the mechanism solved the problem of how search vendors could make a living, freeing them from the role of back-end service provider to other sites. The large early profits also drew more players into the search engine market, which promoted the technology and the market's prosperity.
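Reduced to its simplest form, the bidding mechanism is a sort by per-click bid plus a bill proportional to clicks received. The sketch below shows only that bare idea; the advertiser names, bid amounts, and click counts are invented, and real paid-ranking systems involve further factors that the article does not describe.

```python
def rank_ads(bids):
    """Order advertisers by their per-click bid, highest first."""
    return sorted(bids, key=lambda name: bids[name], reverse=True)

def charge(bids, clicks):
    """Bill each advertiser its own bid for every click it received."""
    return {name: bids[name] * clicks.get(name, 0) for name in bids}

# Hypothetical per-click bids and click counts, for illustration only.
bids = {"shop_a": 1.50, "shop_b": 2.25, "shop_c": 0.80}
clicks = {"shop_a": 10, "shop_b": 4}

order = rank_ads(bids)
bill = charge(bids, clicks)
print(order)
print(bill)
```

An advertiser that is listed but never clicked pays nothing, which is what separates this pay-per-click model from paying for display alone.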
All of the attempts above, however, are built on the second-generation search engine, whether in category, presentation, or profit model. Although this generation uses crawlers to meet the demand for large and comprehensive result sets, keywords and PageRank-style ranking alone cannot achieve full precision. In both English and Chinese, the same keyword can have many meanings. And no ranking method, however good, can put everything each person really needs on the first few pages. The result someone wants may turn up on page 100, page 1,000, or even page 10,000, because there is simply too much information on the Internet, and some of it is duplicated.
Work on the next generation of search engines has already begun. In 2011 the Chinese vendor Zhongsou launched its third-generation search engine platform and was the first to raise the banner of third-generation search. Zhongsou's reasoning for the label is this: in contrast to the first generation, whose results were collected entirely by hand, and the second, whose results come entirely from crawlers, its engine combines people and computers. Crawlers continue to collect web pages, which takes care of the quantity of results, while people organize and rank those results, which takes care of their accuracy. The author said earlier that this is an impossible task; Zhongsou's answer is to let every Internet user take part. It opened up the entire search system, so that anyone who disagrees with the results or has a different idea can propose changes, whereas Baidu users can only accept the results they are given. The presentation of results changed too: each keyword gets a multi-pane page resembling a portal topic page (unlike the directory-style structure of other search engines), and different meanings of the same keyword are presented on entirely separate topic pages.
Since then a large number of Chinese "third-generation search" products have followed. Whatever their quality, however, their methods of collecting and presenting results do not, unlike Zhongsou's, differ significantly from existing second-generation engines, so their claim to be "third generation" is groundless.
In 2012 Google announced its Knowledge Graph, which is similar to Zhongsou's presentation but also highly extensible: content related to the keyword is shown in a sidebar. At the beginning of 2013 Baidu made similar adjustments, but both were implemented by purely technical means, with no human curation added. Google's more significant next-generation effort is moving search into dedicated hardware, namely Google Glass. Whether it will succeed is not yet certain, but the direction it points in is clear: future search will be closer to people's lives, and may not be limited to typed text queries and text results, nor to a two-dimensional world.
For the general public, though, the more realistic effort at present is innovation in mobile search. Here again it is Zhongsou: in moving third-generation search to mobile, it has revived the personal portal. At the end of 2013 Zhongsou released Souyue, a mobile personal portal. Besides search and news, it adds website navigation, an app store, third-party reviews, local life services, and other functions that search might plausibly deliver on mobile. Like the earlier personal portal, Souyue accepts user subscriptions and actively presents updates to search results, and it goes one step further by using the mobile Internet to push them to the user.
Author: Li Yu. Public account: Yinghuanlee