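A quick experiment to see how Lucene is used.
References: http://www.importnew.com/12715.html (a fairly simple example)
http://www.yiibai.com/lucene/lucene_first_application.html (a more involved example)
There is another example here: http://www.tuicool.com/articles/aqIZNnE
I am using a fairly recent version, 6.2.1. Its documentation is at:
http://lucene.apache.org/core/6_2_1/core/index.html
First, create a Maven project named lucene-demo in IntelliJ. (This mainly follows http://www.importnew.com/12715.html )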
The pom.xml is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.myapp</groupId>
    <artifactId>lucene-demo</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-core -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
            <version>6.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queryparser</artifactId>
            <version>6.2.1</version>
        </dependency>
    </dependencies>
</project>
Create a package, com.myapp.lucene, containing a class LuceneDemo with the following content:
package com.myapp.lucene;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.store.Directory;

import java.io.IOException;

/**
 * Created by baidu on 16/10/20.
 */
public class LuceneDemo {
    // 0. Specify the analyzer for tokenizing text.
    //    The same analyzer should be used for indexing and searching
    static StandardAnalyzer analyzer;
    static Directory index;

    static void prepareDoc() throws IOException {
        // 0. init analyzer
        analyzer = new StandardAnalyzer();

        // 1. create index
        index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "lucence tutorial", "123456");
        addDoc(w, "hi hi hi", "222");
        addDoc(w, "ok LUCENCE", "123");
        w.close();
    }

    static void addDoc(IndexWriter w, String text, String more) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("text", text, Field.Store.YES));
        doc.add(new StringField("more", more, Field.Store.YES));
        w.addDocument(doc);
    }

    static void search(String str) throws ParseException, IOException {
        // 2. query
        Query q = new QueryParser("text", analyzer).parse(str);

        // 3. search
        int listNum = 10;
        IndexReader reader = DirectoryReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        TopScoreDocCollector collector = TopScoreDocCollector.create(listNum);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display
        System.out.printf("Found %d docs.\n", hits.length);
        for (int i = 0; i < hits.length; i++) {
            int docId = hits[i].doc;
            Document doc = searcher.doc(docId);
            System.out.printf("Doc %d: text: %s, more: %s\n", i + 1, doc.get("text"), doc.get("more"));
        }
        reader.close();
    }

    public static void main(String[] args) {
        try {
            prepareDoc();
            search("Lucence");
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParseException e) {
            e.printStackTrace();
        }
    }
}
Then run it. It works:
Found 2 docs.
Doc 1: text: lucence tutorial, more: 123456
Doc 2: text: ok LUCENCE, more: 123

Process finished with exit code 0
Because the index lives in a RAMDirectory, no actual directory or files should have been created on disk.
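For comparison, here is a minimal sketch of what an on-disk index could look like, assuming Lucene 6.2.1 and swapping RAMDirectory for FSDirectory. The class name DiskIndexSketch, the method writeAndCount and the temporary directory are made up for illustration and are not part of the demo above.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of the original demo.
public class DiskIndexSketch {

    // Writes one document into an index stored under indexPath,
    // then reopens the index and returns how many documents it holds.
    static int writeAndCount(Path indexPath) throws IOException {
        try (Directory dir = FSDirectory.open(indexPath)) {
            IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
            try (IndexWriter w = new IndexWriter(dir, config)) {
                Document doc = new Document();
                doc.add(new TextField("text", "lucence tutorial", Field.Store.YES));
                w.addDocument(doc);
            } // closing the writer commits the segment files to disk

            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                return reader.numDocs();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a throwaway temp directory is good enough for an experiment.
        Path indexPath = Files.createTempDirectory("lucene-demo-index");
        System.out.println("Index directory: " + indexPath);
        System.out.println("Documents in index: " + writeAndCount(indexPath));
    }
}
```

After running it, the printed directory should contain Lucene's segment files, which is the difference from the RAMDirectory version.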
A few points in the code and its logic are worth noting:
Use TextField for content that needs to be tokenized, and StringField for content such as an id that should not be tokenized.
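The difference can be seen by looking up raw terms directly. The sketch below assumes Lucene 6.2.1 and uses TermQuery, which skips the analyzer at query time, so it shows what was actually stored in the index. The class FieldTypeSketch, the method countExactTerm and the value "ID-123" are made up for illustration.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

import java.io.IOException;

// Hypothetical helper class, not part of the original demo.
public class FieldTypeSketch {

    // Indexes a single document, then counts the documents whose
    // `field` contains exactly the indexed term `term`.
    static int countExactTerm(String field, String term) throws IOException {
        try (Directory dir = new RAMDirectory()) {
            IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
            try (IndexWriter w = new IndexWriter(dir, config)) {
                Document doc = new Document();
                // Tokenized: StandardAnalyzer splits on whitespace and lowercases.
                doc.add(new TextField("text", "ok LUCENCE", Field.Store.YES));
                // Not tokenized: the whole value is indexed as one term, verbatim.
                doc.add(new StringField("more", "ID-123", Field.Store.YES));
                w.addDocument(doc);
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                return searcher.count(new TermQuery(new Term(field, term)));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countExactTerm("text", "lucence"));    // lowercased token
        System.out.println(countExactTerm("text", "LUCENCE"));    // original casing
        System.out.println(countExactTerm("text", "ok LUCENCE")); // whole string
        System.out.println(countExactTerm("more", "ID-123"));     // whole value
        System.out.println(countExactTerm("more", "id"));         // fragment of the value
    }
}
```

Only the lowercased token in the TextField and the complete, unchanged value in the StringField should match. That is also why the demo's search for "Lucence" finds "ok LUCENCE": QueryParser runs the query text through the same analyzer.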
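While writing the code I hit several compile errors about exceptions: checked exceptions such as IOException and ParseException must either be caught (and wrapped) or declared with throws.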
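The two options look roughly like this. It is a minimal sketch assuming Lucene 6.2.1 with lucene-queryparser on the classpath. The class ExceptionSketch and both method names are made up, and wrapping in IllegalArgumentException is just one possible choice.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;

// Hypothetical helper class, not part of the original demo.
public class ExceptionSketch {

    // Option 1: declare the checked exception and let the caller handle it.
    static Query parseOrThrow(String queryText) throws ParseException {
        return new QueryParser("text", new StandardAnalyzer()).parse(queryText);
    }

    // Option 2: catch the checked exception and rethrow it wrapped in an
    // unchecked one, keeping the original as the cause.
    static Query parseUnchecked(String queryText) {
        try {
            return parseOrThrow(queryText);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse query: " + queryText, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseUnchecked("lucence"));
        try {
            parseUnchecked("("); // unbalanced parenthesis is not valid query syntax
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The demo's main method takes the catch route with two separate catch blocks. Since Java 7 these could also be merged into a single `catch (IOException | ParseException e)`.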
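Some APIs have changed between versions and take different parameters than the referenced tutorials show. I adjusted the code to match what the current version requires; in most cases the calls have simply become shorter.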
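As an illustration of the shorter style, here is a sketch assuming Lucene 6.2.1. The note above does not say which calls had to change, so the comments about older tutorials are an assumption: as far as I recall, Lucene 4.x examples pass a Version constant to StandardAnalyzer and IndexWriterConfig. The sketch also uses IndexSearcher.search(Query, int) in place of an explicit TopScoreDocCollector. The class TopDocsSketch and the method countHits are made up for illustration.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

import java.io.IOException;

// Hypothetical helper class, not part of the original demo.
public class TopDocsSketch {

    // Indexes the three demo texts and returns how many of them
    // match queryText (at most 10 hits are collected).
    static int countHits(String queryText) throws IOException, ParseException {
        // Assumption: 4.x-era tutorials pass a Version constant here; 6.x takes none.
        StandardAnalyzer analyzer = new StandardAnalyzer();
        try (Directory dir = new RAMDirectory()) {
            // Same assumption for IndexWriterConfig: only the analyzer is passed.
            try (IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
                for (String text : new String[] {"lucence tutorial", "hi hi hi", "ok LUCENCE"}) {
                    Document doc = new Document();
                    doc.add(new TextField("text", text, Field.Store.YES));
                    w.addDocument(doc);
                }
            }
            Query q = new QueryParser("text", analyzer).parse(queryText);
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                // search(Query, int) returns the top hits without a collector.
                TopDocs top = new IndexSearcher(reader).search(q, 10);
                return top.scoreDocs.length;
            }
        }
    }

    public static void main(String[] args) throws IOException, ParseException {
        System.out.println("Hits for Lucence: " + countHits("Lucence"));
    }
}
```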
Lucene: Learning and Usage Experiments