Lucene: An Introduction to the Java-Based Full-Text Search Engine

Lucene is a Java-based full-text indexing toolkit.

Introduction to Lucene, the Java-based full-text indexing engine: about the author and the history of Lucene

How full-text search is implemented: a comparison of Lucene full-text indexes and database indexes

A brief introduction to Chinese word segmentation: a comparison of dictionary-based and automatic word segmentation algorithms

Installation and use: an introduction to the system architecture, with a demo

Hacking Lucene: a simplified query analyzer, implementing deletion, custom sorting, and extending the application interface

What else can we learn from Lucene?

A Java-based full-text indexing/retrieval engine: Lucene

Lucene is not a complete full-text search application. It is a full-text indexing engine toolkit written in Java that can be embedded easily in a variety of applications to give them full-text indexing and retrieval.

Lucene's author: Doug Cutting, Lucene's original contributor, is a veteran full-text indexing/retrieval expert. He was a principal developer of the V-Twin search engine (one of the achievements of Apple's Copland operating system project), later worked as a senior system architect at Excite, and is currently engaged in research on some of the Internet's underlying architectures. His goal for Lucene is to add full-text search capabilities to a variety of small and medium-sized applications.

Lucene's history: it was first published on the author's own site, www.lucene.com, and was later hosted on SourceForge. At the end of 2001 it became a subproject of the Apache Software Foundation's Jakarta project: http://jakarta.apache.org/lucene/

Many Java projects already use Lucene as their back-end full-text indexing engine. The more notable ones are:

Jive: a web forum system;

Eyebrowse: a mailing list HTML archiving/browsing/querying system. The author of this article's main reference, "The Lucene search engine: Powerful, flexible, and free", is one of the main developers of Eyebrowse, which has become the main mailing list archiving system for Apache projects;

Cocoon: an XML-based web publishing framework whose full-text retrieval component uses Lucene;

Eclipse: a Java-based open development platform whose help system uses Lucene for full-text indexing.

For Chinese users, the most pressing question is whether Lucene supports full-text search in Chinese. As the later description of Lucene's structure shows, its architecture is well designed: supporting Chinese requires only extending the language lexical analysis interface.
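To make the idea of a pluggable lexical analysis step concrete, here is a minimal sketch in plain Java. It is not Lucene's API: the `SimpleTokenizer` interface and both implementations are hypothetical names invented for illustration. The overlapping two-character (bigram) split is just one dictionary-free way to segment Chinese text, assumed here as an example rather than taken from this section.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface for illustration only; Lucene's real analysis classes differ.
interface SimpleTokenizer {
    List<String> tokenize(String text);
}

/** Splits on whitespace, which is adequate for space-delimited languages such as English. */
final class WhitespaceSplitter implements SimpleTokenizer {
    @Override
    public List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}

/** Emits overlapping two-character tokens, so no word dictionary is needed. */
final class BigramSplitter implements SimpleTokenizer {
    @Override
    public List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        // Work on code points (not chars) and drop whitespace before pairing.
        int[] codePoints = text.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .toArray();
        if (codePoints.length == 1) {
            tokens.add(new String(codePoints, 0, 1)); // a lone character is its own token
        }
        for (int i = 0; i + 1 < codePoints.length; i++) {
            tokens.add(new String(codePoints, i, 2));
        }
        return tokens;
    }
}
```

With this sketch, `new BigramSplitter().tokenize("全文检索")` yields the three tokens 全文, 文检, and 检索. An indexer written against `SimpleTokenizer` would not need to change when the whitespace implementation is swapped for the bigram one, which is the kind of extension point the paragraph above describes.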

How full-text search is implemented

Lucene's API is designed to be fairly general. Its input and output structures closely resemble a database's table ==> record ==> field hierarchy, so many traditional applications built on files, databases, and the like can be mapped onto Lucene's storage structure and interface with little effort. Broadly speaking, Lucene can be thought of as a database system that supports full-text indexing.
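The table ==> record ==> field analogy can be sketched with a toy in-memory index in plain Java. This is an illustration under stated assumptions, not Lucene's API or storage format: `TinyIndex` and its methods are hypothetical names, each document is a map of field names to values (like a row with named columns), and terms are split on non-word characters, which only suits ASCII text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical class for illustration only; this is NOT Lucene's API.
final class TinyIndex {
    // The "table": each document is a "record" mapping field name -> field value.
    private final List<Map<String, String>> documents = new ArrayList<>();
    // Inverted index: "field:term" -> sorted ids of the documents containing that term.
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Adds a document and returns its id (its position in the document list). */
    int add(Map<String, String> fields) {
        int id = documents.size();
        documents.add(new LinkedHashMap<>(fields));
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // ASCII-oriented split: lower-case the value, break on non-word characters.
            for (String term : field.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(field.getKey() + ":" + term, k -> new TreeSet<>())
                            .add(id);
                }
            }
        }
        return id;
    }

    /** Returns the ids of documents whose given field contains the term, in ascending order. */
    List<Integer> search(String field, String term) {
        TreeSet<Integer> hits = postings.get(field + ":" + term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    /** Fetches a stored document by id, much like reading a row by primary key. */
    Map<String, String> get(int id) {
        return documents.get(id);
    }
}
```

For example, after adding one document with the body "Embedding a full-text engine" and a second with the body "B-tree lookups and full scans", `search("body", "full")` returns the ids `[0, 1]`. The difference from an ordinary database column is that the lookup goes through a term-to-document map built at indexing time instead of scanning each record's text.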

Compare Lucene and database:
