I. Introduction to Lucene
1. About Lucene
Lucene is a widely used open source full-text search toolkit for Java. It provides a complete query engine and indexing engine, plus a partial set of text analyzers (word breakers) covering two Western languages, English and German. Lucene's goal is to give software developers an easy-to-use toolkit for adding full-text retrieval to a target system, or for building a complete full-text search engine on top of it. It is an Apache project. URL: http://lucene.apache.org/
2. Uses of Lucene
Lucene gives software developers an easy-to-use toolkit for adding full-text indexing to a target system, or for building a complete full-text search engine on top of it.
3. Lucene application scenarios
Providing full-text retrieval over the data stored in your application's database.
Developing standalone search engine services and systems.
4. Characteristics of Lucene
1. Stable, high-performance indexing
Can index more than 150 GB of data per hour.
Low memory requirements: only 1 MB of heap memory is required.
Incremental indexing is as fast as bulk indexing.
The index is approximately 20% to 30% of the size of the indexed text.
2. Efficient, accurate, high-performance search algorithms
Ranked search: the best results are returned first.
Powerful query types: phrase queries, wildcard queries, proximity queries, range queries, and more.
Fielded search (for example title, author, content).
Sorting by any field.
Searching multiple indexes with merged results.
Updates and searches can run at the same time.
Highlighting, joins, and result grouping.
Fast search speed.
Pluggable ranking models, including the vector space model and BM25.
Configurable storage engine.
3. Cross-platform
Written in pure Java.
Available as open source under the Apache License, so it can be used in both commercial and open source projects.
Implementations of Lucene exist in other languages (for example C, C++, and Python), not just Java.
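The query types named in the list above (phrase, proximity, wildcard, range, fielded search) can be built directly from classes in lucene-core. The following is a minimal sketch, assuming lucene-core 7.3.0 is on the classpath; the class name QueryTypesSketch, the field names (content, author, date, title), and the search terms are hypothetical and chosen only for illustration.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.search.WildcardQuery;

// Sketch only: field names and terms are made up for illustration.
class QueryTypesSketch {

    // Phrase query: the terms must appear next to each other, in this order.
    static PhraseQuery phrase() {
        return new PhraseQuery("content", "full", "text");
    }

    // Proximity query: a phrase query with a slop of 2, so the terms
    // may be separated by up to 2 position moves.
    static PhraseQuery proximity() {
        return new PhraseQuery(2, "content", "search", "engine");
    }

    // Wildcard query: '*' matches any sequence of characters.
    static WildcardQuery wildcard() {
        return new WildcardQuery(new Term("author", "dou*"));
    }

    // Range query over string terms, with both bounds included.
    static TermRangeQuery range() {
        return TermRangeQuery.newStringRange("date", "20180101", "20181231", true, true);
    }

    // Fielded search: both clauses must match, each on its own field.
    static BooleanQuery titleAndAuthor() {
        return new BooleanQuery.Builder()
                .add(new TermQuery(new Term("title", "lucene")), BooleanClause.Occur.MUST)
                .add(wildcard(), BooleanClause.Occur.MUST)
                .build();
    }

    public static void main(String[] args) {
        // Print each query in Lucene's own string form.
        System.out.println(phrase());
        System.out.println(proximity());
        System.out.println(wildcard());
        System.out.println(range());
        System.out.println(titleAndAuthor());
    }
}
```

Each of these objects can be passed to IndexSearcher.search once an index exists. Building queries programmatically like this needs only lucene-core; parsing query strings typed by a user needs the separate queryparser module.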
II. Lucene architecture
1. Data collection
2. Create an index
3. Index Storage
4. Search (using index)
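The four stages above can be illustrated with a toy inverted index in plain Java. This is a sketch of the general idea only, not Lucene's implementation: the class TinyInvertedIndex and its methods are hypothetical, the "collected" data is hard-coded, and the index lives in memory rather than in Lucene's on-disk format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy illustration of the four stages; this is NOT how Lucene is implemented.
class TinyInvertedIndex {
    // Stage 3 (index storage): term -> sorted set of document ids, kept in memory.
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Stage 2 (create an index): split the text into lowercase terms
    // and record the document id under each term.
    int add(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Stage 4 (search): look the term up in the index instead of scanning every document.
    List<Integer> search(String term) {
        SortedSet<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        List<Integer> result = new ArrayList<>();
        if (hits != null) {
            result.addAll(hits);
        }
        return result;
    }

    String document(int docId) {
        return documents.get(docId);
    }

    public static void main(String[] args) {
        // Stage 1 (data collection): hard-coded strings stand in for
        // content gathered from a database, files, or a crawler.
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("Lucene is a full-text search library");
        index.add("An index makes search fast");
        index.add("Databases store rows");

        for (int docId : index.search("search")) {
            System.out.println(docId + ": " + index.document(docId));
        }
    }
}
```

The point of the structure is that a search touches only the postings for the query term, so its cost depends on the number of matching documents rather than on the size of the whole collection.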
III. Lucene integration
1. Choosing a Lucene version
Use 7.3.0, the latest version at the time of writing: https://lucene.apache.org/
2. System requirements
JDK 1.8 or later
3. Integration: add the Lucene core JARs to your application
Option 1: download the zip, unzip it, and copy the JARs into your project.
Option 2: declare the dependency with Maven.
4. Lucene module descriptions
core: the Lucene core library (tokenization, indexing, querying)
analyzers-*: analyzers (word breakers)
facet: faceted indexing and search capabilities
grouping: collectors for grouping search results
highlighter: highlights search keywords in results
join: index-time and query-time joins for normalized content
queries: filters and queries that add to core Lucene
queryparser: query parsers and parsing framework
spatial: geospatial search
suggest: auto-suggest and spellchecking
5. First, add the Lucene core module
<!-- Lucene core module -->
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>7.3.0</version>
</dependency>
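Once the core module is on the classpath, a quick way to check the integration is to index a couple of documents in memory and search them. The following is a minimal sketch, assuming lucene-core 7.3.0 only; the class name LuceneSmokeTest, the helper countHits, the field name content, and the sample texts are hypothetical. It uses TermQuery rather than a parsed query string because the query parser lives in a separate module.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

// Sketch only: indexes the given texts in memory and searches one term.
class LuceneSmokeTest {

    // Returns the number of documents (up to 10) whose "content" field contains the term.
    static int countHits(String[] texts, String term) {
        try (Directory dir = new RAMDirectory()) {
            // Create the index: one document per text, analyzed by StandardAnalyzer.
            IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
            try (IndexWriter writer = new IndexWriter(dir, config)) {
                for (String text : texts) {
                    Document doc = new Document();
                    doc.add(new TextField("content", text, Field.Store.YES));
                    writer.addDocument(doc);
                }
            }
            // Search the index. StandardAnalyzer lowercases terms at index time,
            // so the TermQuery term must be given in lowercase.
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                TopDocs top = searcher.search(new TermQuery(new Term("content", term)), 10);
                for (ScoreDoc hit : top.scoreDocs) {
                    System.out.println(searcher.doc(hit.doc).get("content"));
                }
                return top.scoreDocs.length;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] texts = {
            "Lucene is a full-text search library",
            "Databases store rows"
        };
        System.out.println("hits: " + countHits(texts, "lucene"));
    }
}
```

RAMDirectory keeps the example self-contained. For a persistent index, FSDirectory.open with a path on disk is the usual replacement.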
6. Understanding the composition of the core module