Lucene Version: 7.1
Key points for using Lucene
Create a Document and add Fields to it;
Add Documents to the IndexWriter;
Use QueryParser.parse() to build the query;
Use the search() method of IndexSearcher to run the query;
First, the basic process of creating an index
Open a Directory to store the index files. FSDirectory refers to a folder that can be stored in
Scattered Fairy has recently been working on a project that analyzes click-through on our site's search keywords. All of our site's log data is recorded in Hadoop. The initial tasks, and the purpose of this work, are as follows:
(1) Extract our site's search data
(2) Analyze the number of searches in a given period
(3) Analyze the number of clicks on a keyword at a given time
(4) From this data, find searches without clicks, searches with clicks, se
keywords, to assess the quality of our site search and provide a reference for optimizing and improving the search scheme
(6) Use Lucene or Solr indexes to store the analyzed data and provide flexible, powerful retrieval
Scattered Fairy will not go into detail here on how Pig is used to analyze the data; interested readers can leave a message on the public account to ask. Today we mainly look at pig anal
Solr's Chinese word breaker
Chinese word segmentation is not enabled in Solr by default, so we need to configure a Chinese word breaker. The currently available word breakers include smartcn, IK, jeasy, and Paoding. In practice there are two main kinds. One is based on the hidden Markov model (HMM) algorithm of ICTCLAS from the Chinese Academy of Sciences, such as smartcn and ictclas4j. Its advantage is high segmentation accuracy; its disadvantage is that users can not use cus
: http://archive.apache.org/dist/lucene/solr/
Operating environment: Windows 7, Tomcat 6, Solr 4.3, JDK 6
Download the Solr 4.3 package and unzip it to a local folder, such as D:\apache\solr-4.3.0
Find a folder to use as the Solr home folder, such as D:/solrhome
Copy D:\apache\solr
1. Solr installation and configuration:
1.1. Environment configuration: Tomcat 6 + JDK 1.6 + solr-4.7.2. Note: Solr 4.8 and above requires JDK 1.7 or later to compile correctly.
1.2. Solr historical version download: http://archive.apache.org/dist/lucene/solr/
2, the download down the
The Solr server is developed in Java 5 and is built on Lucene full-text search.
To set up Solr, first configure the Java environment by installing the corresponding JDK and Tomcat; this is not covered in detail here.
The following describes Solr 4.10.3, the latest version at the time of writing, in a JDK 1.7 and Tomcat 7 environment.
The specific steps are as follows: 1. To the official website ht
Yesterday we learned about the IndexSearcher construction process for Lucene search (http://blog.csdn.net/wuyinggui10000/article/details/45698667). With a general understanding of Lucene's IndexSearcher and how to create one, we should now learn to use IndexSearcher to search the index. In this section we cover the principles of index searching and write a utility class for index queries based on those principles; Indexsearche
Early experience with Lucene
Preface
I first used Lucene.Net for unstructured data search. I kept saying that I would look into Lucene while learning Java. After more than two years, I have finally started on it.
Development Environment
IntelliJ IDEA 2016, Lucene 6.0, and JDK 1.8
Prerequisites for using Lucene
1. pom.xml
2. Test data. I too
Download the latest version of the Lucene source code from lucene.apache.org (currently 3.0.0). For the IDE I chose Eclipse. I do not know Java, but I very much want to see Lucene's underlying operating mechanism and some of its techniques. The package to download is the one with the src suffix; I mainly want to read the source code, so I did not download the compiled binary package.
From Eclipse's File->ne
Indexer:
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;
import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.io.FileReader;
// From Chapter 1
/**
 * This code is originally written for
 * Erik's
fl: A comma-separated list that specifies the set of fields to return in the document results. The default value is "*", meaning all fields.
defType: Specifies the query parser. Commonly used values: defType=lucene, defType=dismax, defType=edismax
q: The query.
q.alt: Sets the default query used when q is blank. Generally, q.alt is set to *:*.
qf: Query fields, which specifies the fields from which
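As a rough illustration of how these parameters end up in a request, the sketch below assembles a URL-encoded query string using only the Java standard library. The host, the core name `demo`, and the field names `id`, `title`, and `content` are hypothetical placeholders, not values from the original article.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class SolrQueryStringSketch {

    // Joins the parameters as name=value pairs separated by '&',
    // URL-encoding both names and values.
    static String buildQueryString(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            joiner.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
        }
        return joiner.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a standard JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("defType", "edismax");   // which query parser to use
        params.put("q", "lucene");          // the query itself
        params.put("q.alt", "*:*");         // fallback query when q is blank
        params.put("qf", "title content");  // hypothetical fields to search
        params.put("fl", "id,title");       // hypothetical fields to return

        // Hypothetical host and core name, for illustration only.
        System.out.println("http://localhost:8983/solr/demo/select?"
                + buildQueryString(params));
    }
}
```

Note that parameter names are case-sensitive, so `defType` and `q.alt` must be written exactly as shown.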
word breaker
StandardAnalyzer: single-character segmentation, which splits Chinese text character by character. For example, "我爱中国" ("I love China") becomes "我", "爱", "中", "国".
CJKAnalyzer: bigram segmentation, which splits text into overlapping two-character pairs. For example, "我是中国人" ("I am Chinese") becomes "我是", "是中", "中国", "国人".
Neither of these two word breakers meets the requirements.
SmartChineseAnalyzer: good support for Chinese, but poor extensibility; extending the dictionary, adding a stop-word dictionary, and similar customizations are difficult to handle.
2. Third-party Chinese parse
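The single-character and two-character splitting described above can be imitated with a small n-gram function. This is only a sketch of the resulting token shapes on pure Chinese text, written with the Java standard library; it is not how StandardAnalyzer or CJKAnalyzer are actually implemented, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class NGramSegmentationSketch {

    // Returns every run of n consecutive characters in the text.
    // n = 1 gives character-by-character tokens; n = 2 gives overlapping pairs.
    static List<String> ngrams(String text, int n) {
        // Work on code points so characters outside the BMP are not split.
        int[] codePoints = text.codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + n <= codePoints.length; i++) {
            tokens.add(new String(codePoints, i, n));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Character-by-character splitting.
        System.out.println(ngrams("我爱中国", 1));
        // Overlapping two-character splitting.
        System.out.println(ngrams("我是中国人", 2));
    }
}
```

Neither scheme knows where real word boundaries are, which is why pairs such as "是中" appear and why dictionary-based or statistical word breakers are usually preferred for Chinese.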
I previously wrote an installation tutorial for 4.10, which was installed in Tomcat. To install 5.1 I read through the introduction and found that from 5.x onward Solr bundles Jetty, so installation has become much easier.
Now it takes only three steps: download the Solr package, decompress it, and start it. Of course, a JDK environment is required (1.7 or later).
1. Download Solr: http://www.apache.org/
The strings stored in the left half are what we generally call the dictionary. Each string on the left points to a linked list of documents on the right, which is called the inverted list (posting list). The way a Xinhua dictionary is looked up is an everyday example of an inverted index.
How to create an index
For the creation of an index, I have summed up a three-step process: the data to be retrieved (Document), the word breaker (Analyzer), and index building (Indexer), which can be used for easy reference:
Let's take a simp
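To make the dictionary and inverted-list idea concrete, here is a minimal in-memory sketch in plain Java. It is an illustration only, not Lucene's actual data structure: the class name is made up, and splitting on whitespace with lowercasing is an assumed stand-in for a real Analyzer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class InvertedIndexSketch {

    // Dictionary: each term maps to its inverted list of document ids.
    private final Map<String, List<Integer>> postings = new TreeMap<>();
    // The data to be retrieved; a document's id is its position in this list.
    private final List<String> documents = new ArrayList<>();

    // Stand-in for an Analyzer: lowercase, then split on whitespace.
    private static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    // Index building: store the document and record its id under each term.
    int add(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String term : analyze(text)) {
            List<Integer> list = postings.computeIfAbsent(term, k -> new ArrayList<>());
            // Avoid listing the same document twice for a repeated term.
            if (list.isEmpty() || list.get(list.size() - 1) != docId) {
                list.add(docId);
            }
        }
        return docId;
    }

    // Lookup: find the term in the dictionary and return its inverted list.
    List<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptyList());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("Lucene builds an inverted index");
        index.add("Solr is built on Lucene");
        System.out.println(index.search("lucene"));
        System.out.println(index.search("solr"));
    }
}
```

A search never scans the documents themselves; it goes from the term to the list of document ids, which is what makes lookup fast.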
Recently, I improved the search program of Blog Park. Blog Park's search function uses the Lucene.Net search engine. When the search function was first added to Blog Park, Lucene.Net did not support Chinese word segmentation. Later, with help from http://www.cnblogs.com/yuhen/, this problem was solved. (A problem has recently occurred in the site's search program, so Google is being used for the moment.)
Currently, word segmentation is supported in Lucene.Net. I downloaded
First, reference Lucene.Net.dll in the project, and then create an index (index creation page, .aspx):
protected void btnNew_Click(object sender, EventArgs e)
{
IList
String indexDir = ConfigurationManager.AppSettings["indexDir"]; // The path where the index is stored, set in web.config.
Analyzer analyzer = new StandardAnalyzer(global::Lucene.Net.Util.Version.LUCENE_29);
IndexWriter writer = new Ind
Lucene-based case development: collecting the overview pages of Zongheng novels
When reprinting, please indicate the source: http://blog.csdn.net/xiaojimanman/article/details/44851419
http://www.llwjy.com/blogdetail/1b5ae17c513d127838c2e02102b5bb87.html
My personal blog site is now online at www.llwjy.com. Thank you.