Build a RESTful basic storage service with the open source Apache Solr search engine
Objective
Search engines have become an integral part of our lives, and for many people they are already an everyday necessity. Whether it is an Internet search engine such as Google or Yahoo, or a private search engine that serves a single enterprise, they have become an essential means of obtaining information. Their search performance has always been excellent and keeps getting more impressive, and their ability to integrate all kinds of resources is equally surprising, whether the resource is a private database, an obscure file server, or even your own desktop system. Given how impressive search engines are, and that search is the form of service closest to the way people think, what else can a search engine do for us beyond retrieving information quickly from anywhere?
Posing the question
Imagine the Internet, which changes every day. Each day more websites and Web pages appear to share the wisdom and wealth of the human race. At every moment, search engine crawlers (spiders) of all sizes visit Internet sites. After reading and downloading the Web pages, they analyze them with increasingly sophisticated indexing techniques and cache the results of the analysis so that they can serve a wide variety of search requests with high performance.
Each Web page on the Internet has, in addition to its own content, a variety of links to external resources. These links typically point to supplementary material, detailed descriptions, or references for the content of the source page, and the linked pages in turn contain links to still other resources. The content of the linked pages and the content of the source page together describe a topic. In principle, as long as pages are linked to one another in this way, you can start from any page and reach every page connected to it simply by following all of the links.
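The traversal described above can be illustrated with a small breadth-first walk over a link graph. This is only a minimal sketch: the `LINKS` dictionary and the `reachable_pages` function are hypothetical names invented for illustration, and a real crawler would download and parse pages rather than read an in-memory dictionary.

```python
from collections import deque

def reachable_pages(links, start):
    """Return the pages reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        # Follow every outgoing link that has not been visited yet.
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "home": ["blog", "about"],
    "blog": ["post-1", "home"],
    "about": [],
    "post-1": ["blog"],
    "orphan": ["home"],
}

pages = reachable_pages(LINKS, "home")
print(pages)
```

Note that the page named `orphan` links to `home` but nothing links to it, so a walk that starts at `home` never reaches it. Following links only finds pages that are actually connected to the starting point.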
The Internet stores a wide variety of resources, and search engines give us an interface for accessing them. Today, however, that interface offers only retrieval, so a simple question arises naturally: can a search engine also provide us with storage services? Suppose we turn the Internet upside down. Imagine that the search engine not only retrieves information from individual pages but also stores information on "Internet pages," providing an information storage service for each site and acting as a container for that information. Would this not be an ideal distributed, almost unlimited, basic storage service system?
Source of Inspiration
Let's look at the structure of the human brain first. Our present understanding of how the brain is built and how it works is far from complete, but we do know its most basic structure: the human brain has tens of billions of neurons, and each neuron may be connected to thousands or even tens of thousands of other neurons through dendrites and axons. When an external stimulus reaches a neuron, the stimulated neuron may pass the signal on to the neurons connected to it, depending on whether the signal exceeds the threshold of those connections, and those neurons may pass it on in turn. This is clearly a system that resembles a chain reaction. A single simple stimulus may cause countless neurons in the brain to produce a wide variety of signals that interact with and influence one another as they propagate, and this is probably the source of human intelligence. If each Web page on the Internet is viewed as a neuron in the brain, and the outgoing links of a page correspond to the connections between neurons (the dendrites and axons), then the Internet can also be seen as a storage system for human intelligence.
The purpose of this article
This article gives a simple analysis of how a search engine can be used as a storage service and builds a basic storage service system on top of the open source Apache Solr project. It then uses a simple blog site as an example to explain the structure and use of the storage service. We hope that the ideas discussed here will be useful to applications that urgently need large-scale, highly scalable storage services.
A basic storage service based on a search engine
To explain how a search engine based storage service is put together, it helps to first understand how the role of the search engine changes:
Figure 1. The role of search engines in traditional Web services