A Simple Lucene Example

Source: Internet
Author: User

Whenever I write an article, I find the title the hardest part, and sometimes I cannot think of one at all. In any case, this article walks through a few simple Lucene examples.

Lucene is actually quite simple. It mainly involves two things: creating indexes and searching them.
Let's first look at some of the terms used in Lucene. I will only touch on them briefly rather than introduce them in detail, because there is a wonderful thing in the world called search.

IndexWriter: one of the most important classes in Lucene. It adds documents to the index and controls various parameters of the indexing process.

Analyzer: analyzes the text that the search engine encounters and breaks it into tokens. Commonly used implementations include StandardAnalyzer, StopAnalyzer, and WhitespaceAnalyzer.

Directory: the location where the index is stored. Lucene provides two kinds of index storage: on disk and in memory. Indexes are generally stored on disk. Correspondingly, Lucene provides two classes: FSDirectory and RAMDirectory.

Document: the unit of indexing. Any file to be indexed must first be converted into a Document object.

Field: a named piece of data inside a Document, such as the "name" field used in the examples in this article.

IndexSearcher: the most basic search tool in Lucene. All searches go through an IndexSearcher.

Query: a query. Lucene supports fuzzy queries, semantic queries, phrase queries, and combined queries, for example TermQuery, BooleanQuery, RangeQuery, and WildcardQuery.

QueryParser: a tool for parsing user input. It scans the input string and generates a Query object.

Hits: once a search completes, the results must be returned and displayed to the user; only then is the search truly finished. In Lucene, the set of search results is represented by an instance of the Hits class.

With those terms explained, let's look at some simple examples:
1. A simple StandardAnalyzer test

Java code
  package lighter.javaeye.com;

  import java.io.IOException;
  import java.io.StringReader;

  import org.apache.lucene.analysis.Analyzer;
  import org.apache.lucene.analysis.Token;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.standard.StandardAnalyzer;

  public class StandardAnalyzerTest
  {
      // Constructor
      public StandardAnalyzerTest()
      {
      }

      public static void main(String[] args)
      {
          // Create a StandardAnalyzer object
          Analyzer aAnalyzer = new StandardAnalyzer();
          // Test string
          StringReader sr = new StringReader("lighter javaeye COM is the are on");
          // Create a TokenStream object for the (arbitrary) field name "name"
          TokenStream ts = aAnalyzer.tokenStream("name", sr);
          try {
              int i = 0;
              Token t = ts.next();
              while (t != null)
              {
                  // Row counter used in the output
                  i++;
                  // Print the processed token
                  System.out.println("Row " + i + ": " + t.termText());
                  // Fetch the next token
                  t = ts.next();
              }
          } catch (IOException e) {
              e.printStackTrace();
          }
      }
  }

Display result:

Row 1: lighter
Row 2: javaeye
Row 3: com

Tip:
StandardAnalyzer is Lucene's built-in "standard analyzer". It provides the following functions:
1. It splits the original sentence into tokens at whitespace.
2. It converts all uppercase letters to lowercase.
3. It removes common but uninformative words (stop words) such as "is", "the", and "are", as well as all punctuation.
Compare the result above with the string passed to new StringReader("lighter javaeye COM is the are on") to see the effect clearly.
The API is not explained here; for details, see the official Lucene documentation. Note that the code here uses the Lucene 2 API, which differs significantly from version 1.4.3.
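
To make the three steps in the tip concrete, here is a small plain-Java sketch that imitates them without using Lucene at all. The class name ToyAnalyzer and its four-word stop list are made up for illustration. The real StandardAnalyzer uses a grammar-based tokenizer and a longer stop-word list, so treat this only as a rough model of its behavior, not as Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration only; this is NOT Lucene's StandardAnalyzer.
final class ToyAnalyzer {

    // Hypothetical stop-word list, just large enough for the test string.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("is", "the", "are", "on"));

    // Splits on anything that is not a letter or digit, lowercases each
    // piece, and drops the pieces that are stop words.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield an empty leading element
            }
            String token = raw.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = analyze("lighter javaeye COM is the are on");
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println("Row " + (i + 1) + ": " + tokens.get(i));
        }
    }
}
```

Run on the same test string, this sketch keeps only lighter, javaeye, and com, which matches the result shown above.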

2. Another example: create an index and search it

Java code
  package lighter.javaeye.com;

  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.queryParser.QueryParser;
  import org.apache.lucene.search.Hits;
  import org.apache.lucene.search.IndexSearcher;
  import org.apache.lucene.search.Query;
  import org.apache.lucene.store.FSDirectory;

  public class FSDirectoryTest {
      // Path where the index is created
      public static final String PATH = "C://index2";

      public static void main(String[] args) throws Exception {
          // Two documents, each with a stored, tokenized "name" field
          Document doc1 = new Document();
          doc1.add(new Field("name", "lighter javaeye com", Field.Store.YES, Field.Index.TOKENIZED));
          Document doc2 = new Document();
          doc2.add(new Field("name", "lighter blog", Field.Store.YES, Field.Index.TOKENIZED));

          // Create a new index on disk; the final "true" overwrites any existing index
          IndexWriter writer = new IndexWriter(FSDirectory.getDirectory(PATH, true), new StandardAnalyzer(), true);
          // Index at most 3 terms per field
          writer.setMaxFieldLength(3);
          writer.addDocument(doc1);
          writer.setMaxFieldLength(3);
          writer.addDocument(doc2);
          writer.close();

          // Search the "name" field of the index just written
          IndexSearcher searcher = new IndexSearcher(PATH);
          Hits hits = null;
          Query query = null;
          QueryParser qp = new QueryParser("name", new StandardAnalyzer());

          query = qp.parse("lighter");
          hits = searcher.search(query);
          System.out.println("Search for \"lighter\": " + hits.length() + " result(s) in total");

          query = qp.parse("javaeye");
          hits = searcher.search(query);
          System.out.println("Search for \"javaeye\": " + hits.length() + " result(s) in total");

          searcher.close();
      }
  }

Running result:

Search for "lighter": 2 result(s) in total
Search for "javaeye": 1 result(s) in total
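
Why does "lighter" match two documents and "javaeye" only one? The idea behind indexing and searching can be sketched in plain Java as a tiny inverted index: a map from each term to the documents that contain it. The class ToyIndex and its methods are hypothetical names invented for this sketch, and it only splits on whitespace and lowercases. It does not reflect how Lucene actually stores or scores data, only the general principle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Plain-Java illustration of an inverted index; this is NOT Lucene's index format.
final class ToyIndex {

    // term -> sorted ids of the documents that contain it
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    // document id -> original text, kept so that hits can be displayed
    private final List<String> storedValues = new ArrayList<>();

    // Indexing: store the text, split it on whitespace, and record
    // each lowercased term against the new document id.
    int addDocument(String text) {
        int docId = storedValues.size();
        storedValues.add(text);
        for (String term : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Searching: look up a single term and return the stored text
    // of every document that contains it.
    List<String> search(String term) {
        TreeSet<Integer> docIds = postings.get(term.trim().toLowerCase(Locale.ROOT));
        if (docIds == null) {
            return Collections.emptyList();
        }
        List<String> hits = new ArrayList<>();
        for (int docId : docIds) {
            hits.add(storedValues.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("lighter javaeye com");
        index.addDocument("lighter blog");
        System.out.println("lighter: " + index.search("lighter").size() + " result(s)");
        System.out.println("javaeye: " + index.search("javaeye").size() + " result(s)");
    }
}
```

With the same two documents as in the example, the term "lighter" appears in both postings lists and "javaeye" in only one, which is why the counts are 2 and 1.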
