Apache Solr: An overview of where Solr sits in an information system architecture

Last Update:2016-12-17 Source: Internet

Author: User

Tags apache solr solr


Overview: Apache Solr is an open-source, enterprise-grade search platform written in Java and built on the Apache Lucene project. Its key features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features, and rich document handling. It provides distributed search and index replication and is designed for scalability and fault tolerance. At the time of writing, Solr is the second most popular enterprise search engine, after Elasticsearch.

Solr runs as a standalone full-text search server. Internally it uses the Java-based Lucene library for full-text indexing and querying, and it exposes REST-like HTTP APIs that make it usable from most programming languages. Its flexible external configuration lets you get work done without writing any Java code, while a plug-in architecture supports more advanced customization. Given that it is so powerful, where does it sit in our overall platform?

Positioning: The official reference guide gives an example in which Solr runs alongside other server applications. Our warehouse platform needs to offer several user interfaces: one for receiving goods into the warehouse, one for viewing inventory, and one for issuing goods out of it. A warehouse keeper may also need to correct inaccurate material information. Receiving, issuing, and inventory checks all revolve around materials. Material information is therefore held both in the platform's database and in Solr, although its format and completeness may differ between the two, because each system stores it for a different purpose.

With Solr, improving the search experience in the warehouse platform takes only the following steps:

1. Define the schema.
The schema tells Solr about the contents of the documents that will be indexed. For the warehouse platform, the schema might define fields for material name, code, stock quantity, manufacturer, and so on. Solr's schema is powerful and flexible, and it lets you tailor Solr's behavior to your application.
2. Deploy Solr.
3. Feed Solr the documents that users will search.
4. Implement the search function in the application.

Solr is built on open standards and is therefore highly extensible. Solr queries are RESTful: a query is essentially a simple HTTP request URL, and the response is a structured document in XML, JSON, CSV, or another format. This means a wide range of clients can use Solr, including web applications, rich-client applications, and mobile devices. Any platform that supports HTTP can interact with Solr.

Solr is based on the Apache Lucene project, a high-performance, full-featured search engine. It supports everything from simple keyword queries to complex multi-field queries and faceted results.

Scalability: If Solr's features alone are not impressive enough, its ability to handle very high-volume applications should be. A fairly common scenario is that you have so much data, or so many queries, that a single Solr server cannot handle the entire workload. In that case you can use SolrCloud to scale Solr out, distributing both the data and the processing of requests across multiple servers. Several options can be combined, depending on the kind of scalability you need. For example, sharding is a scaling technique that splits a large collection into multiple logical pieces called shards, so that a collection can hold more documents than would physically fit on a single Solr server.
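
As a rough illustration of the four steps and the HTTP query model described earlier in this section, the sketch below builds the three kinds of request a warehouse application might send: a Schema API command that adds a field, an update call that submits material documents, and a select query. It only constructs URLs and JSON bodies and sends nothing over the network. The base URL, the collection name `materials`, the field names, and the field types `text_general` and `string` are assumptions made for illustration; the field types actually available depend on the Solr version and configset in use.

```python
import json
from urllib.parse import urlencode

# Assumed values for illustration; none of them come from the article.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "materials"


def add_field_request(name, field_type, stored=True):
    """Step 1: return (url, body) for a Schema API 'add-field' command."""
    url = f"{SOLR_BASE}/{COLLECTION}/schema"
    body = json.dumps(
        {"add-field": {"name": name, "type": field_type, "stored": stored}}
    )
    return url, body


def index_request(docs):
    """Step 3: return (url, body) that submits documents and commits them."""
    url = f"{SOLR_BASE}/{COLLECTION}/update?" + urlencode({"commit": "true"})
    return url, json.dumps(docs)


def select_url(query, rows=10, wt="json"):
    """Step 4: return the GET URL for a search; 'wt' picks the response format."""
    params = {"q": query, "rows": rows, "wt": wt}
    return f"{SOLR_BASE}/{COLLECTION}/select?" + urlencode(params)


def doc_ids(response_text):
    """Pull the document ids out of a JSON select response."""
    return [doc["id"] for doc in json.loads(response_text)["response"]["docs"]]


if __name__ == "__main__":
    print(add_field_request("material_name", "text_general"))
    print(index_request([{"id": "M-001", "material_name": "steel bolt"}]))
    print(select_url("material_name:bolt"))
```

Both POST bodies would be sent with a `Content-Type: application/json` header. Because every request is a plain URL plus an optional JSON body, the same calls could be issued from any HTTP client, which is the point the text makes about client platforms.
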
Queries entering the system are distributed to every shard in the collection, and the merged results are returned. Another available technique is to raise the collection's replication factor, which lets you add servers holding additional copies of the collection so that a high concurrent query load is spread across multiple machines. Sharding and replication are not mutually exclusive; used together, they make Solr a more powerful and scalable platform.
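
To make the interplay of sharding and replication more concrete, here is a toy in-memory model. It is not Solr's implementation: the `Shard` and `Collection` classes, the CRC32-based routing, the round-robin replica choice, and the term-count score are all assumptions chosen to keep the sketch short. It shows only the general pattern of routing each document to one shard, copying it to every replica of that shard, fanning a query out to all shards, and merging the partial results.

```python
import heapq
import itertools
import zlib


class Shard:
    """One logical slice of a collection, held by one or more replicas."""

    def __init__(self, replication_factor):
        # Every replica holds a full copy of this shard's documents.
        self.replicas = [dict() for _ in range(replication_factor)]
        self._next = itertools.cycle(range(replication_factor))

    def add(self, doc_id, text):
        for replica in self.replicas:
            replica[doc_id] = text

    def search(self, term):
        # Rotate across replicas so successive queries hit different copies.
        replica = self.replicas[next(self._next)]
        hits = []
        for doc_id, text in replica.items():
            # Toy score: how often the term occurs in the document.
            score = text.lower().split().count(term.lower())
            if score:
                hits.append((score, doc_id))
        return hits


class Collection:
    def __init__(self, num_shards, replication_factor):
        self.shards = [Shard(replication_factor) for _ in range(num_shards)]

    def add(self, doc_id, text):
        # A stable hash of the id decides which shard owns the document.
        index = zlib.crc32(doc_id.encode("utf-8")) % len(self.shards)
        self.shards[index].add(doc_id, text)

    def search(self, term, rows=10):
        # Fan the query out to every shard, then merge the partial results.
        hits = itertools.chain.from_iterable(
            shard.search(term) for shard in self.shards
        )
        return [doc_id for _, doc_id in heapq.nlargest(rows, hits)]


if __name__ == "__main__":
    materials = Collection(num_shards=3, replication_factor=2)
    materials.add("M-001", "steel bolt")
    materials.add("M-002", "bolt bolt washer")
    materials.add("M-003", "copper wire")
    print(materials.search("bolt"))
```

In this model, adding shards raises the number of documents the collection can hold, while raising the replication factor adds copies that can share the query load, which is why the two techniques combine rather than compete.
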


This article is an English version of an article originally published in Chinese on aliyun.com.
