Lucene (01)
My blog address: http://www.cnblogs.com/tenglongwentian/
Lucene: the latest version is 6.2.1, which requires the official JDK 1.8.
I am still on JDK 7 here, so Lucene 5.5.3 (the last line that supports JDK 7) is used instead.
Create a maven project. If you do not know how to create a maven project, refer to the previous blog post.
Set the packaging to <packaging>jar</packaging>, then add the Lucene dependencies:
    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-core -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
            <version>5.5.3</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-queryparser -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queryparser</artifactId>
            <version>5.5.3</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-analyzers-common -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-analyzers-common</artifactId>
            <version>5.5.3</version>
        </dependency>
    </dependencies>
Because I use JDK 7 and do not like having to readjust the project's JDK version by hand every time the Maven project is updated, I also pin the compiler plugin:
    <!-- Source directory, plugin management, and other configuration -->
    <build>
        <finalName>Lucene</finalName>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
                <configuration>
                    <!-- Specify the source and target versions -->
                    <!-- source: the Java language level the compiler accepts for the source code -->
                    <source>1.7</source>
                    <!-- target: the JVM version the generated class files are compatible with -->
                    <target>1.7</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
That is all for the pom configuration.
Create two classes:
Indexer
    import java.io.File;
    import java.io.FileReader;
    import java.nio.file.Paths;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class Indexer {
        private IndexWriter writer; // index writer instance

        /**
         * Constructor: instantiates the IndexWriter.
         * @param indexDir directory where the index is stored
         */
        public Indexer(String indexDir) throws Exception {
            Directory dir = FSDirectory.open(Paths.get(indexDir));
            Analyzer analyzer = new StandardAnalyzer(); // standard analyzer
            IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
            writer = new IndexWriter(dir, iwc);
        }

        /** Close the index writer. */
        public void close() throws Exception {
            writer.close();
        }

        /** Index all files in the specified directory. */
        public int index(String dataDir) throws Exception {
            File[] files = new File(dataDir).listFiles();
            for (File f : files) {
                indexFile(f);
            }
            return writer.numDocs();
        }

        /** Index a single file. */
        private void indexFile(File f) throws Exception {
            System.out.println("index file: " + f.getCanonicalFile());
            Document doc = getDocument(f);
            writer.addDocument(doc);
        }

        /** Build the Document for a file and set each field. */
        private Document getDocument(File f) throws Exception {
            Document doc = new Document();
            doc.add(new TextField("contents", new FileReader(f)));
            doc.add(new TextField("fileName", f.getName(), Field.Store.YES));
            doc.add(new TextField("fullPath", f.getCanonicalPath(), Field.Store.YES));
            return doc;
        }

        public static void main(String[] args) {
            String indexDir = "E:\\lucene";
            String dataDir = "E:\\lucene\\data";
            Indexer indexer = null;
            int numIndexed = 0;
            long start = System.currentTimeMillis();
            try {
                indexer = new Indexer(indexDir);
                numIndexed = indexer.index(dataDir);
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                try {
                    indexer.close();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
            long end = System.currentTimeMillis();
            System.out.println("index: " + numIndexed + " files, " + (end - start) + " milliseconds");
        }
    }
String indexDir = "E:\\lucene"; String dataDir = "E:\\lucene\\data";
Do not be puzzled by these paths: the drive letter is arbitrary. Create a folder in the root of any drive (a path without spaces is safest; paths containing Chinese characters are untested), then copy a few txt files into the data folder; they will be used for testing later.
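If you prefer to prepare the test data in code instead of copying files by hand, a small stdlib-only sketch does the job. The paths and the sample file names here are just illustrations; adjust them to your machine:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TestDataSetup {
    // Create the data directory (if missing) and write one sample txt file into it.
    public static Path writeSample(String dataDir, String fileName, String content) throws IOException {
        Path dir = Paths.get(dataDir);
        Files.createDirectories(dir);
        Path file = dir.resolve(fileName);
        Files.write(file, content.getBytes("UTF-8"));
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample content; any plain text works for the later search test.
        writeSample("E:\\lucene\\data", "sample1.txt", "Zygmunt Saloni wrote about Polish grammar.");
        writeSample("E:\\lucene\\data", "sample2.txt", "Apache Lucene is a Java search library.");
    }
}
```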
Run this class and you can see the indexing output.
Afterwards you will find some strange-looking files in the lucene folder; what they are will be covered later.
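If you want to peek at what Lucene wrote without opening the folder manually, a stdlib-only sketch can list the index directory (the path `E:\lucene` is the one used in `Indexer` above):

```java
import java.io.File;

public class IndexDirLister {
    // Return the names of the files Lucene created in the index directory,
    // or an empty array if the directory does not exist.
    public static String[] listIndexFiles(String indexDir) {
        File dir = new File(indexDir);
        String[] names = dir.list();
        return names == null ? new String[0] : names;
    }

    public static void main(String[] args) {
        for (String name : listIndexFiles("E:\\lucene")) {
            System.out.println(name);
        }
    }
}
```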
Create another class:
Searcher
    import java.nio.file.Paths;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class Searcher {
        public static void search(String indexDir, String q) throws Exception {
            Directory dir = FSDirectory.open(Paths.get(indexDir));
            IndexReader reader = DirectoryReader.open(dir);
            IndexSearcher is = new IndexSearcher(reader);
            Analyzer analyzer = new StandardAnalyzer();
            QueryParser parser = new QueryParser("contents", analyzer);
            Query query = parser.parse(q);
            long start = System.currentTimeMillis();
            TopDocs hits = is.search(query, 10);
            long end = System.currentTimeMillis();
            System.out.println("matching " + q + ", total cost " + (end - start) + " milliseconds, found " + hits.totalHits + " records");
            for (ScoreDoc scoreDoc : hits.scoreDocs) {
                Document doc = is.doc(scoreDoc.doc);
                System.out.println(doc.get("fullPath"));
            }
            reader.close();
        }

        public static void main(String[] args) {
            String indexDir = "E:\\lucene";
            // String q = "LICENSE-2.0";
            String q = "Zygmunt Saloni";
            try {
                search(indexDir, q);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
Run this class.
Do not delete the index files generated by the first class. If you are willful and delete them anyway, running the second class will report an error.
Let's try it out.
Compare this with String q = "Zygmunt Saloni": the query behaves differently than you might expect because of word segmentation. The analyzer cuts the text into individual terms rather than treating the phrase as one unit.
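To see what that segmentation means concretely, here is a rough stdlib-only approximation of the basic behavior of StandardAnalyzer (it is not the real analyzer; real Lucene analysis also handles stop words and more, so treat this purely as an illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SimpleTokenizerDemo {
    // Approximate StandardAnalyzer's basic behavior: split text on
    // non-letter/non-digit characters and lowercase each token.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "Zygmunt Saloni" is indexed as two separate terms, not one unit.
        System.out.println(tokenize("Zygmunt Saloni")); // [zygmunt, saloni]
        // Even "LICENSE-2.0" is cut apart at the hyphen and the dot.
        System.out.println(tokenize("LICENSE-2.0")); // [license, 2, 0]
    }
}
```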
Addendum: if you run the second class again, the result will be the same. Try it yourself.
Please indicate the source for reprinting. Thank you.