Getting started with Lucene

Source: Internet
Author: User

Lucene is a full-text search engine library written in Java.

This walkthrough uses Lucene 3.6. Download lucene-3.6.0.zip from the official website and unzip it.
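To make "full-text search" concrete, here is a minimal sketch of the underlying idea, an inverted index that maps each term to the documents containing it. This is a toy in plain Java, not Lucene's implementation; the class name TinyInvertedIndex and its methods are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, not Lucene): term -> sorted ids of documents containing it. */
public class TinyInvertedIndex {

    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Lowercases the text and splits it on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Records every term of the text under the given document id. */
    public void add(int docId, String text) {
        for (String term : tokenize(text)) {
            TreeSet<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new TreeSet<>();
                postings.put(term, ids);
            }
            ids.add(docId);
        }
    }

    /** Returns the ids of documents containing the term, in ascending order. */
    public List<Integer> search(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<Integer>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "Thinking in Java");
        index.add(2, "Java, again!");
        System.out.println("Documents containing 'java': " + index.search("java"));
    }
}
```

A lookup is a single map access rather than a scan over every document, which is the property a search library builds on. Lucene adds analysis, relevance scoring, and on-disk storage on top of this idea.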

Required JARs:

lucene-3.6.0\lucene-core-3.6.0.jar ------> Lucene core library

lucene-3.6.0\contrib\analyzers\common\lucene-analyzers-3.6.0.jar ------> analyzers (tokenizers)

lucene-3.6.0\contrib\highlighter\lucene-highlighter-3.6.0.jar ------> keyword highlighting

lucene-3.6.0\contrib\memory\lucene-memory-3.6.0.jar ------> in-memory index, used for keyword highlighting

The following simple example simulates indexing and searching books.

1. Create a Java project and add the JARs listed above to its build path.

2. Write a Book entity class.

package com.cndatacom.lucene.entity;

/**
 * Book
 * @author luxh
 */
public class Book {

    private Integer id;

    /** Title */
    private String title;

    /** Content */
    private String content;

    /** Author */
    private String author;

    // The getter/setter methods are omitted
    // ......
}
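For reference, a sketch of the Book class with the omitted accessors written out. The accessor names follow the calls made in the test in step 3 (setId, getTitle, and so on); the package declaration is left out here only so that the file compiles on its own.

```java
/** Book entity with its getters and setters spelled out. */
public class Book {

    private Integer id;
    private String title;
    private String content;
    private String author;

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

    public String getAuthor() {
        return author;
    }

    public void setAuthor(String author) {
        this.author = author;
    }
}
```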

3. Write a simple test.

package com.cndatacom.lucene.test;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Formatter;
import org.apache.lucene.search.highlight.Fragmenter;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.Scorer;
import org.apache.lucene.search.highlight.SimpleFragmenter;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.junit.Before;
import org.junit.Test;

import com.cndatacom.lucene.entity.Book;

public class LuceneTest {

    // Analyzer
    private Analyzer analyzer;

    // Index storage directory
    private Directory directory;

    /**
     * Initialize the analyzer and the directory.
     * @throws IOException
     */
    @Before
    public void before() throws IOException {
        // Create a standard analyzer; Version.LUCENE_36 matches the Lucene version in use
        analyzer = new StandardAnalyzer(Version.LUCENE_36);
        // Use a directory named indexdir
        File indexDir = new File("./indexdir");
        // Open the index directory
        directory = FSDirectory.open(indexDir);
    }

    /**
     * Create the index.
     * @throws IOException
     */
    @Test
    public void testCreateIndex() throws IOException {
        // Create the IndexWriter configuration, specifying the matching version and the analyzer
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);
        // Create the IndexWriter, which is responsible for creating and maintaining the index
        IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

        // Prepare the book data
        Book book1 = new Book();
        book1.setId(1);
        book1.setTitle("Java programming ideas");
        book1.setAuthor("Bruce Eckel");
        book1.setContent("thinking in Java should be read cover to cover by every Java programmer, then kept close at hand for frequent reference.");

        Book book2 = new Book();
        book2.setId(2);
        book2.setTitle("the eternal path of architecture");
        book2.setAuthor("Alexander");
        book2.setContent("the eternal path of architecture introduces a new theory and idea about architectural design, architecture and planning, the core of this theory is that social members set the world order in which they live according to their own status of existence. This ancient method fundamentally forms the foundation of post-industrial architecture, these buildings are created by people.");

        // Create a document.
        // Store specifies whether the field value is stored in the index;
        // Index specifies whether the field is indexed and whether it is tokenized first.
        Document doc1 = new Document();
        doc1.add(new Field("id", book1.getId().toString(), Store.YES, Index.NOT_ANALYZED));
        doc1.add(new Field("title", book1.getTitle(), Store.YES, Index.ANALYZED));
        doc1.add(new Field("author", book1.getAuthor(), Store.YES, Index.ANALYZED));
        doc1.add(new Field("content", book1.getContent(), Store.YES, Index.ANALYZED));

        // Create the second document with the same field settings
        Document doc2 = new Document();
        doc2.add(new Field("id", book2.getId().toString(), Store.YES, Index.NOT_ANALYZED));
        doc2.add(new Field("title", book2.getTitle(), Store.YES, Index.ANALYZED));
        doc2.add(new Field("author", book2.getAuthor(), Store.YES, Index.ANALYZED));
        doc2.add(new Field("content", book2.getContent(), Store.YES, Index.ANALYZED));

        // Add the documents to the index
        indexWriter.addDocument(doc1);
        indexWriter.addDocument(doc2);

        // Commit the changes to the index and close the writer
        indexWriter.close();
    }

    /**
     * Search for books.
     * @throws ParseException
     * @throws IOException
     * @throws CorruptIndexException
     * @throws InvalidTokenOffsetsException
     */
    @Test
    public void testSearchBook() throws ParseException, CorruptIndexException, IOException, InvalidTokenOffsetsException {
        // Search keyword; set this to a term that occurs in the indexed books
        String queryKeyword = "";

        // Create a query parser, which converts the keyword into a Query object.
        // Single-field search, here in the author field:
        // QueryParser queryParser = new QueryParser(Version.LUCENE_36, "author", analyzer);

        // Multi-field search
        String[] fields = {"title", "content"};
        QueryParser queryParser = new MultiFieldQueryParser(Version.LUCENE_36, fields, analyzer);
        Query query = queryParser.parse(queryKeyword);

        // Open the index for reading and searching
        IndexReader indexReader = IndexReader.open(directory);
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);

        // TopDocs holds the search results; return only the first 100 hits
        TopDocs topDocs = indexSearcher.search(query, 100);
        int totalCount = topDocs.totalHits; // total number of matches
        System.out.println("Total number of search results: " + totalCount);
        ScoreDoc[] scoreDocs = topDocs.scoreDocs; // list of search results

        // Create a highlighter so that the search keyword is highlighted in the results
        Formatter formatter = new SimpleHTMLFormatter("<font color='red'>", "</font>");
        Scorer fragmentScorer = new QueryScorer(query);
        Highlighter highlighter = new Highlighter(formatter, fragmentScorer);
        Fragmenter fragmenter = new SimpleFragmenter(100);
        highlighter.setTextFragmenter(fragmenter);

        List<Book> books = new ArrayList<Book>();

        // Read each search result and collect it into the list
        for (ScoreDoc scoreDoc : scoreDocs) {
            int docId = scoreDoc.doc;     // document number of the current result
            float score = scoreDoc.score; // relevance score of the current result
            System.out.println("score is: " + score);

            Document document = indexSearcher.doc(docId);
            Book book = new Book();
            book.setId(Integer.parseInt(document.get("id")));

            // Highlight the title
            String title = document.get("title");
            String highlightedTitle = highlighter.getBestFragment(analyzer, "title", title);
            // If the keyword does not occur in the title, fall back to the plain title
            if (highlightedTitle == null) {
                highlightedTitle = title;
            }
            book.setTitle(highlightedTitle);

            book.setAuthor(document.get("author"));

            // Highlight the content
            String content = document.get("content");
            String highlightedContent = highlighter.getBestFragment(analyzer, "content", content);
            // If the keyword does not occur in the content, fall back to the plain content
            if (highlightedContent == null) {
                highlightedContent = content;
            }
            book.setContent(highlightedContent);

            books.add(book);
        }

        // Close the searcher and the reader
        indexSearcher.close();
        indexReader.close();

        for (Book book : books) {
            System.out.println("book's id is: " + book.getId());
            System.out.println("book's title is: " + book.getTitle());
            System.out.println("book's author is: " + book.getAuthor());
            System.out.println("book's content is: " + book.getContent());
        }
    }
}
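The highlighting step in the search test wraps matched keywords in HTML tags and falls back to the original text when the highlighter returns null. The following plain-Java sketch illustrates only that pattern. It is not Lucene's Highlighter: the class TinyHighlighter and its methods are hypothetical, and it matches raw substrings case-insensitively instead of analyzed terms.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy keyword highlighter (hypothetical, not Lucene): wraps matches in the given tags. */
public class TinyHighlighter {

    /**
     * Returns the text with every case-insensitive occurrence of the keyword wrapped
     * in pre/post, or null when the keyword does not occur in the text.
     */
    static String highlight(String text, String keyword, String pre, String post) {
        if (text == null || keyword == null || keyword.isEmpty()) {
            return null;
        }
        Matcher matcher = Pattern
                .compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE)
                .matcher(text);
        if (!matcher.find()) {
            return null;
        }
        // $0 re-inserts the matched text itself, so the original casing is preserved.
        String replacement = Matcher.quoteReplacement(pre) + "$0" + Matcher.quoteReplacement(post);
        return matcher.replaceAll(replacement);
    }

    /** Same fallback as in the search test: use the plain text when nothing matched. */
    static String highlightOrOriginal(String text, String keyword) {
        String highlighted = highlight(text, keyword, "<font color='red'>", "</font>");
        return highlighted == null ? text : highlighted;
    }

    public static void main(String[] args) {
        System.out.println(highlightOrOriginal("Thinking in Java", "java"));
        System.out.println(highlightOrOriginal("Thinking in Java", "lucene"));
    }
}
```

Returning null for "no match" keeps the decision about what to display with the caller, which is why the search test checks for null before setting the title and content on each Book.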

 

  

 

  

 

     

  

  
