A First Look at Lucene

Last Update:2018-12-07 Source: Internet

Author: User


Lucene is a full-text search engine library written in Java.
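To make "full-text search" concrete, here is a minimal sketch of an inverted index, the kind of structure a full-text engine is built around. It is a toy written in plain Java for illustration only, not Lucene's implementation; the class name InvertedIndexSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Illustrative only: a toy inverted index, not how Lucene stores its data.
class InvertedIndexSketch {

    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Lowercase the text, split it on non-alphanumeric characters,
    // and record the document id under every resulting term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the ids of the documents containing the term, in ascending order.
    List<Integer> search(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Thinking in Java");
        index.add(2, "The eternal path of architecture");
        System.out.println(index.search("java")); // prints [1]
        System.out.println(index.search("path")); // prints [2]
    }
}
```

A lookup is a single map access rather than a scan over every document, which is why search engines build an index up front.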

This walkthrough uses Lucene 3.6. Download lucene-3.6.0.zip from the official website and unzip it.

Required JARs:

\lucene-3.6.0\lucene-core-3.6.0.jar ------> Lucene core package

\lucene-3.6.0\contrib\analyzers\common\lucene-analyzers-3.6.0.jar ------> analyzers (tokenizers)

\lucene-3.6.0\contrib\highlighter\lucene-highlighter-3.6.0.jar ------> keyword highlighting

\lucene-3.6.0\contrib\memory\lucene-memory-3.6.0.jar ------> required by the highlighter
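An analyzer turns raw text into the terms that end up in the index. The sketch below shows the general idea in plain Java: split the text, lowercase each token, and drop common stop words. It only approximates what an analyzer such as StandardAnalyzer does; the class AnalyzerSketch and its short stop-word list are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: a rough imitation of an analysis chain, not Lucene code.
class AnalyzerSketch {

    // A small, hypothetical stop-word list chosen for this example.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "in", "of", "the", "to"));

    // Split on anything that is not a letter or digit, lowercase each token,
    // and keep only the tokens that are not stop words.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The eternal path of architecture"));
        // prints [eternal, path, architecture]
    }
}
```

Because the same analysis is applied to the query, a search for "Java" and a search for "java" find the same documents.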

The following simple example simulates indexing and searching books.

1. Create a Java project and add the JARs listed above to its build path.

2. Write a Book entity class.

package com.cndatacom.lucene.entity;

/**
 * Book
 * @author luxh
 */
public class Book {

    private Integer id;

    /** Title */
    private String title;

    /** Content */
    private String content;

    /** Author */
    private String author;

    // Getters and setters omitted
    // ......
}

3. Write a simple JUnit test that builds the index and then searches it.

package com.cndatacom.lucene.test;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Formatter;
import org.apache.lucene.search.highlight.Fragmenter;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.Scorer;
import org.apache.lucene.search.highlight.SimpleFragmenter;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.junit.Before;
import org.junit.Test;

import com.cndatacom.lucene.entity.Book;

public class LuceneTest {

    // Analyzer
    private Analyzer analyzer;

    // Index storage directory
    private Directory directory;

    /**
     * Initialize the analyzer and the directory
     * @throws IOException
     */
    @Before
    public void before() throws IOException {
        // Create a standard analyzer; Version.LUCENE_36 matches it to Lucene 3.6
        analyzer = new StandardAnalyzer(Version.LUCENE_36);
        // Use a directory named indexDir
        File indexDir = new File("./indexDir");
        // Open the index directory
        directory = FSDirectory.open(indexDir);
    }

    /**
     * Create the index files
     * @throws IOException
     */
    @Test
    public void testCreateIndex() throws IOException {
        // Create the IndexWriter configuration, specifying the matching version and the analyzer
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);
        // Create the IndexWriter, which builds and maintains the index
        IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

        // Prepare the book data
        Book book1 = new Book();
        book1.setId(1);
        book1.setTitle("Java programming ideas");
        book1.setAuthor("Bruce Eckel");
        book1.setContent("Thinking in Java should be read cover to cover by every Java programmer, "
                + "then kept close at hand for frequent reference.");

        Book book2 = new Book();
        book2.setId(2);
        book2.setTitle("The eternal path of architecture");
        book2.setAuthor("Alexander");
        book2.setContent("The eternal path of architecture introduces a new theory and idea about "
                + "architectural design, architecture and planning. The core of this theory is that "
                + "social members set the world order in which they live according to their own "
                + "status of existence. This ancient method fundamentally forms the foundation of "
                + "post-industrial architecture; these buildings are created by people.");

        // Create a document.
        // Store specifies whether the field value is stored;
        // Index specifies whether the field is analyzed (tokenized) before indexing.
        Document doc1 = new Document();
        doc1.add(new Field("id", book1.getId().toString(), Store.YES, Index.NOT_ANALYZED));
        doc1.add(new Field("title", book1.getTitle(), Store.YES, Index.ANALYZED));
        doc1.add(new Field("author", book1.getAuthor(), Store.YES, Index.ANALYZED));
        doc1.add(new Field("content", book1.getContent(), Store.YES, Index.ANALYZED));

        // Create the second document in the same way
        Document doc2 = new Document();
        doc2.add(new Field("id", book2.getId().toString(), Store.YES, Index.NOT_ANALYZED));
        doc2.add(new Field("title", book2.getTitle(), Store.YES, Index.ANALYZED));
        doc2.add(new Field("author", book2.getAuthor(), Store.YES, Index.ANALYZED));
        doc2.add(new Field("content", book2.getContent(), Store.YES, Index.ANALYZED));

        // Add the documents to the index
        indexWriter.addDocument(doc1);
        indexWriter.addDocument(doc2);

        // Commit the changes to the index and close the writer
        indexWriter.close();
    }

    /**
     * Search for books
     * @throws ParseException
     * @throws IOException
     * @throws CorruptIndexException
     * @throws InvalidTokenOffsetsException
     */
    @Test
    public void testSearchBook() throws ParseException, CorruptIndexException, IOException, InvalidTokenOffsetsException {
        // Search keyword (blank in the source text; set it to the term you want to search for)
        String queryKeyword = "";

        // Create a query parser that converts the keyword into a Query object.
        // Single-field search, here in the author field:
        // QueryParser queryParser = new QueryParser(Version.LUCENE_36, "author", analyzer);

        // Multi-field search
        String[] fields = {"title", "content"};
        QueryParser queryParser = new MultiFieldQueryParser(Version.LUCENE_36, fields, analyzer);
        Query query = queryParser.parse(queryKeyword);

        // Open the index for searching
        IndexReader indexReader = IndexReader.open(directory);
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);

        // TopDocs holds the search results
        TopDocs topDocs = indexSearcher.search(query, 100); // return at most the top 100 hits
        int totalCount = topDocs.totalHits; // total number of hits
        System.out.println("Total number of search results: " + totalCount);
        ScoreDoc[] scoreDocs = topDocs.scoreDocs; // list of hits

        // Create a highlighter so that the search keyword is highlighted in the results
        Formatter formatter = new SimpleHTMLFormatter("<font color='red'>", "</font>");
        Scorer fragmentScore = new QueryScorer(query);
        Highlighter highlighter = new Highlighter(formatter, fragmentScore);
        Fragmenter fragmenter = new SimpleFragmenter(100);
        highlighter.setTextFragmenter(fragmenter);

        List<Book> books = new ArrayList<Book>();

        // Read each hit and collect it into the list
        for (ScoreDoc scoreDoc : scoreDocs) {
            int docId = scoreDoc.doc;     // document number of the current hit
            float score = scoreDoc.score; // relevance score of the current hit
            System.out.println("score is: " + score);

            Document document = indexSearcher.doc(docId);
            Book book = new Book();
            book.setId(Integer.parseInt(document.get("id")));

            // Highlight the title
            String title = document.get("title");
            String highlighterTitle = highlighter.getBestFragment(analyzer, "title", title);
            // If the keyword does not occur in the title, fall back to the plain title
            if (highlighterTitle == null) {
                highlighterTitle = title;
            }
            book.setTitle(highlighterTitle);

            book.setAuthor(document.get("author"));

            // Highlight the content
            String content = document.get("content");
            String highlighterContent = highlighter.getBestFragment(analyzer, "content", content);
            // If the keyword does not occur in the content, fall back to the plain content
            if (highlighterContent == null) {
                highlighterContent = content;
            }
            book.setContent(highlighterContent);

            books.add(book);
        }

        // Close the searcher first, then the reader it wraps
        indexSearcher.close();
        indexReader.close();

        for (Book book : books) {
            System.out.println("book's id is: " + book.getId());
            System.out.println("book's title is: " + book.getTitle());
            System.out.println("book's author is: " + book.getAuthor());
            System.out.println("book's content is: " + book.getContent());
        }
    }
}
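The search test wraps matched keywords in <font color='red'> tags and falls back to the original text when the highlighter returns null. The sketch below imitates that behavior with plain string matching so the idea can be tried without an index. It is an assumption-laden illustration: the HighlightSketch class is hypothetical, and Lucene's Highlighter works on analyzed tokens and scored fragments rather than on raw substrings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: substring-based highlighting, not Lucene's Highlighter.
class HighlightSketch {

    // Wrap every case-insensitive occurrence of keyword in preTag/postTag.
    // Returns null when the keyword is empty or does not occur, so the caller
    // can fall back to the original text, as the test above does.
    static String highlight(String text, String keyword, String preTag, String postTag) {
        if (keyword.isEmpty()) {
            return null;
        }
        Matcher matcher = Pattern
                .compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE)
                .matcher(text);
        if (!matcher.find()) {
            return null;
        }
        matcher.reset();
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // quoteReplacement keeps '$' and '\' in the match from being treated specially
            matcher.appendReplacement(out,
                    Matcher.quoteReplacement(preTag + matcher.group() + postTag));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String title = "Thinking in Java";
        String highlighted = highlight(title, "java", "<font color='red'>", "</font>");
        // Fall back to the plain title when there is no match
        System.out.println(highlighted == null ? title : highlighted);
        // prints Thinking in <font color='red'>Java</font>
    }
}
```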

This article is an English translation of an article originally published in Chinese on aliyun.com.
