Lucene & SOLR

Source: Internet
Author: User
Tags: solr

From http://alartin.iteye.com/blog/42867 and http://www.iteye.com/blogs/tag/lucene

Lucene is a subproject of the Apache Software Foundation's Jakarta project. It is an open-source full-text search toolkit. That is, it is not a complete full-text search engine, but a full-text search engine architecture that provides a complete query engine and indexing engine, plus some text analysis engines (for two Western languages, English and German). Lucene aims to give software developers a simple, easy-to-use toolkit for adding full-text retrieval to a target system, or for building a complete full-text search engine on top of it.

SOLR is an enterprise search server built on the Lucene Java library. It provides XML/HTTP and JSON APIs, highlighting of query results, faceted search, caching, and replication, along with a web administration interface. SOLR runs in a servlet container. The essential differences between SOLR and Lucene therefore come down to three things: SOLR is a search server, it is enterprise-level, and it is manageable. Lucene is essentially a search library rather than a standalone application, whereas SOLR is a standalone application. Lucene focuses on the low-level building blocks of search, while SOLR focuses on enterprise applications. Lucene does not handle the administration that a search service needs; SOLR does. In summary: SOLR is an extension of Lucene for enterprise search applications.

In this article, let's first look at what SOLR promises us, or rather what SOLR claims to offer.

SOLR is a standalone search server that operates like a web service. You put documents into the search server in XML format over HTTP (this process is called indexing), and you query it with HTTP GET requests and receive the results in XML format. SOLR features include:

  • Advanced full-text search
  • Optimized for high-volume web traffic
  • Standards-based open interfaces (XML and HTTP)
  • Comprehensive HTML administration interface
  • Scalability: efficient replication to other SOLR search servers
  • Flexible and adaptable through XML configuration
  • Extensible plugin architecture
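
The XML-over-HTTP interface described above can be sketched in a few lines of Python. This is a minimal illustration, not SOLR client code: it only builds the XML `<add>` message and the GET URL, and sends nothing over the network. The base URL, the `/select` path, and the field names `id` and `title` are assumptions chosen for the example; a real deployment may use different paths and fields.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed base URL of a local SOLR instance; adjust for a real deployment.
SOLR_BASE = "http://localhost:8983/solr"


def build_add_message(docs):
    """Serialize documents (dicts of field name -> value) as an <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(query, rows=10):
    """Build the HTTP GET URL for a query that asks for XML results."""
    params = urlencode({"q": query, "rows": rows, "wt": "xml"})
    return f"{SOLR_BASE}/select?{params}"


if __name__ == "__main__":
    print(build_add_message([{"id": "doc1", "title": "Lucene in Action"}]))
    print(build_select_url("title:lucene"))
```

To index for real, the message would be sent in an HTTP POST to the server's update handler, followed by a commit so that the new documents become searchable.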
SOLR uses Lucene and extends it:
  • A real data schema, with dynamic fields and unique keys
  • Powerful extensions to the Lucene query language
  • Support for dynamic faceted browsing and filtering of results
  • Advanced, configurable text analysis
  • Highly configurable and extensible caching
  • Performance optimizations
  • External configuration via XML
  • An administration interface
  • Monitorable logging
  • Fast incremental updates and snapshot distribution
Schema
  • Defines the field types and the fields of documents
  • Can drive more intelligent processing
  • Declarative Lucene analyzer specification
  • Dynamic fields allow fields to be added on the fly
  • The copy field feature lets a field be indexed in multiple ways, or lets multiple fields be combined into a single searchable field
  • Explicit types reduce guesswork about field types
  • External file-based configuration of stopword lists, synonym lists, and protected-word lists
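
The dynamic field and copy field ideas from the schema list can be illustrated with a toy model. This is a sketch under stated assumptions, not SOLR's implementation: the schema is expressed as plain Python data rather than XML, the field names and types are hypothetical, and the matching rule (explicit fields first, then glob-style patterns in declaration order) is a simplification.

```python
from fnmatch import fnmatchcase

# Hypothetical schema, expressed as plain Python data for illustration only.
EXPLICIT_FIELDS = {"id": "string", "title": "text", "text": "text"}
DYNAMIC_FIELDS = [("*_i", "int"), ("*_s", "string")]      # glob pattern, type
COPY_FIELDS = [("title", "text"), ("author_s", "text")]   # source, destination


def resolve_type(field_name):
    """Return the declared type: explicit fields first, then dynamic patterns."""
    if field_name in EXPLICIT_FIELDS:
        return EXPLICIT_FIELDS[field_name]
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(field_name, pattern):
            return field_type
    raise KeyError(f"undefined field: {field_name}")


def apply_copy_fields(doc):
    """Return a copy of doc with source values appended to destination fields."""
    result = {name: list(values) for name, values in doc.items()}
    for source, dest in COPY_FIELDS:
        if source in doc:
            result.setdefault(dest, []).extend(doc[source])
    return result


if __name__ == "__main__":
    doc = {"id": ["1"], "title": ["Solr"], "author_s": ["Ann"]}
    print(resolve_type("price_i"))
    print(apply_copy_fields(doc))
```

Here `price_i` is never declared explicitly but still gets a type through the `*_i` pattern, and the `text` field collects the values of `title` and `author_s` so that one field can be searched instead of two.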
Query
  • HTTP interface with configurable response formats (XML/XSLT, JSON, Python, Ruby)
  • Highlighted context snippets in search results
  • Faceted search
  • Sort specifications added to the query language
  • Constant-scoring range and prefix queries: no IDF, coord, or lengthNorm factors, and no limit on the number of terms the query matches
  • Function queries: influence the score with a function of a field's value or ordinal
  • Performance optimizations
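
Several of the query features above are selected through HTTP request parameters. The sketch below assembles such a query string in Python. The parameter names (`wt`, `sort`, `facet`, `facet.field`, `hl`, `hl.fl`) are the ones used by later SOLR releases and may not match the version this article describes; the helper function and the field names in the example are hypothetical.

```python
from urllib.parse import urlencode


def build_query_string(query, sort=None, facet_fields=(), highlight_fields=(), fmt="json"):
    """Assemble request parameters for a search with optional extras."""
    params = [("q", query), ("wt", fmt)]  # wt selects the response format
    if sort:
        params.append(("sort", sort))
    if facet_fields:
        params.append(("facet", "true"))
        # One facet.field parameter per field to count values for.
        params.extend(("facet.field", name) for name in facet_fields)
    if highlight_fields:
        params.append(("hl", "true"))
        params.append(("hl.fl", ",".join(highlight_fields)))
    return urlencode(params)


if __name__ == "__main__":
    print(build_query_string(
        "title:lucene",
        sort="price asc",
        facet_fields=["category", "author"],
        highlight_fields=["title"],
    ))
```

The resulting string would be appended to the search handler's URL after a `?`, and switching `fmt` to `xml`, `python`, or `ruby` asks for one of the other response formats listed above.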
Core
  • Pluggable query handlers and an extensible XML data format
  • Document uniqueness enforced through a unique key field
  • Efficient batch updates and deletes
  • User-configurable commands triggered on index changes
  • Searcher concurrency control
  • Correct handling of numeric types, for both sorting and range queries
  • Control over where documents with a missing sort field are placed
  • Support for dynamic grouping of search results
Cache
  • Configurable query result, filter, and document cache instances
  • Pluggable cache implementations
  • Cache warming in the background: when a new searcher is opened, configurable searches are run against it to warm it up and avoid slow first hits; during warming, the current searcher keeps handling live requests
  • Autowarming in the background: the most recently accessed items in the caches of the current searcher are regenerated in the new searcher, which keeps cache hit rates high across index and searcher changes
  • Fast, small filter implementation
  • User-level caching with autowarming support
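
The autowarming idea in the cache list can be sketched with a small least-recently-used cache. This is a toy model and not SOLR's cache classes: the `LRUCache` class, the `warm_from` method, and the `autowarm_count` parameter are names invented for the illustration, and `regenerate` stands in for re-running a cached query against the new searcher.

```python
from collections import OrderedDict


class LRUCache:
    """A small least-recently-used cache; the last entry is the most recent."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used

    def warm_from(self, old_cache, regenerate, autowarm_count):
        """Recompute the most recently used keys of old_cache into this cache."""
        if autowarm_count <= 0:
            return
        for key in list(old_cache.entries)[-autowarm_count:]:
            self.put(key, regenerate(key))


if __name__ == "__main__":
    old = LRUCache(capacity=3)
    for q in ("q1", "q2", "q3"):
        old.put(q, f"stale:{q}")
    old.get("q1")  # q1 becomes the most recently used entry

    new = LRUCache(capacity=3)
    new.warm_from(old, regenerate=lambda q: f"fresh:{q}", autowarm_count=2)
    print(list(new.entries))
```

The values are recomputed rather than copied because results cached against the old index may no longer be valid once the index has changed; only the keys carry over.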
Replication
  • Efficient distribution of the parts of the index that have changed, transferred via rsync
  • Pull strategy makes it easy to add searchers
  • Configurable distribution interval lets you trade off timeliness against cache utilization
Management Interface
  • Comprehensive statistics on cache utilization, updates, and queries
  • Text analysis debugger that shows the result of every stage in an analyzer
  • Web query interface with debugging output: parsed query output, Lucene explain() details, explanations of why a document scored low or was excluded from the results, and so on
