1. Create a folder named S on drive C in Windows, and create three text files in it named "1.txt", "2.txt" and "3.txt". The content of 1.txt is as follows:
People's Republic of China
People all over China
2006
You can write whatever you like in "2.txt" and "3.txt". If you want to save effort, just copy the content of 1.txt.
2. Download the Lucene package and put it on the classpath. Index cre
Recently I used Lucene.Net for full-text search, starting with Lucene.Net version 1.9. Querying by keyword worked fine, but once a time-range condition was added to the query, no data was found. Running the Lucene query directly in the Luke tool returned no records either. Yesterday I sent an email to ask RainTrail (http://www
Lucene is a high-performance Java full-text retrieval toolkit that uses the inverted file index structure. This structure and the corresponding generation algorithm are as follows:
0) There are two articles, 1 and 2. The content of article 1 is: Tom lives in Guangzhou, I live in Guangzhou too. The content of article 2 is: He once lived in Shanghai.
1) Because Lucene is based on keyword indexing and query, we need to obtain the keywords o
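To make the inverted file idea concrete, here is a minimal sketch in plain Java (no Lucene dependency) that builds a term-to-article mapping from the two example articles. The class and method names (`InvertedIndexSketch`, `build`, `lookup`) are hypothetical, and the tokenization used here (lowercasing and splitting on non-letters, with no stop-word removal or stemming) is a deliberate simplification rather than Lucene's actual analysis or on-disk index format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class InvertedIndexSketch {

    /** Builds term -> (article number -> number of occurrences in that article). */
    public static Map<String, Map<Integer, Integer>> build(List<String> articles) {
        Map<String, Map<Integer, Integer>> index = new TreeMap<>();
        for (int i = 0; i < articles.size(); i++) {
            int articleNo = i + 1; // articles are numbered from 1, as in the text
            // Simplified analysis (an assumption): lowercase, then split on non-letters.
            String[] terms = articles.get(i).toLowerCase(Locale.ROOT).split("[^a-z]+");
            for (String term : terms) {
                if (term.isEmpty()) {
                    continue; // split can yield a leading empty string
                }
                index.computeIfAbsent(term, t -> new TreeMap<>())
                     .merge(articleNo, 1, Integer::sum);
            }
        }
        return index;
    }

    /** Returns the numbers of the articles that contain the term, in ascending order. */
    public static List<Integer> lookup(Map<String, Map<Integer, Integer>> index, String term) {
        Map<Integer, Integer> postings = index.get(term.toLowerCase(Locale.ROOT));
        return postings == null ? new ArrayList<>() : new ArrayList<>(postings.keySet());
    }

    public static void main(String[] args) {
        List<String> articles = Arrays.asList(
            "Tom lives in Guangzhou, I live in Guangzhou too.",
            "He once lived in Shanghai.");
        Map<String, Map<Integer, Integer>> index = build(articles);

        System.out.println(index.get("guangzhou")); // prints {1=2}
        System.out.println(lookup(index, "in"));    // prints [1, 2]
        System.out.println(lookup(index, "beijing")); // prints []
    }
}
```

A lookup touches only the entry for the queried term instead of scanning every article, which is the point of the inverted structure. Because this sketch does no stemming, "lives", "live" and "lived" stay as three separate terms.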
Lucene is a very good open-source full-text retrieval library. The latest version is Lucene 4.4. I will not cover Lucene's history and development here; if you really want to learn Lucene, you have presumably read up on that already. There are a lot of people who know Lucene or Solr, but there are very few people w
Original link: http://www.cnblogs.com/dewin/archive/2009/11/24/1609905.html
Lucene is a high-performance Java full-text retrieval toolkit that uses an inverted file index structure. The structure and the corresponding generation algorithm are as follows:
0) There are two articles, 1 and 2. The content of article 1 is: Tom lives in Guangzhou, I live in Guangzhou too. The content of article 2 is: He once lived in Shanghai.
1) Since Lucene is based on the keyword ind
Search can be divided into the following steps:
Create a Directory
Create an IndexReader
Create an IndexSearcher based on the IndexReader
Create a search Query
Search with the searcher and return TopDocs
Get the ScoreDoc objects from the TopDocs
Get the specific Document objects from the searcher and the ScoreDoc objects
Get the values you want from the Document object
Here is the example code. Version 3.5: the 3.5 version is relatively simple, only the
millions of web pages to build an index. It is like the table of contents and tabs of a book: readers who want the chapters on a given topic go straight to the relevant page via the table of contents, instead of searching page by page from the first page of the book to the last.
2. Lucene inverted index principle
Lucene is an open-source, high-performance Java full-text search engine toolkit. It is not a complete full-text search engine, but a full-t
, Scattered Fairy has recently been working on a project that analyzes the click-through rate of our site's search keywords. All of our site's log data is recorded in Hadoop. The initial tasks and the purpose of this work are as follows:
(1) Find the site-search data
(2) Analyze the number of searches in a given period
(3) Analyze the number of clicks on a keyword in a given period
(4) From these data, find searches without clicks, searches with clicks, se
keywords, to assess the quality of our site search and provide a reference for optimizing and improving the search scheme
(6) Use Lucene or Solr indexes to store the analyzed data and provide flexible, powerful retrieval
Scattered Fairy will not go into the details of analyzing the data with Pig here; interested readers can leave a message to ask. Today we mainly look at how the results of the Pig analysis are stored in the
Introduction: what is Lucene? Lucene is a library for information retrieval that you can use to add indexing and search functions to your application.
Lucene users do not need in-depth knowledge of full-text search; by learning to use the library's classes, you can implement full-text search for your application.
But never think Lucene
little low-level but not quite as bad as building a custom Lucene query: CustomScoreQuery. When you implement your own Lucene query, you're taking control of two things:
Matching: what documents should be included in the search results
Scoring: what score should be assigned to a document (and therefore what order they should appear in)
Frequently you'll find that existing
1. Basic applications
using System;
using System.Collections.Generic;
using System.Text;
using Lucene.Net;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using
Summary of knowledge points: the differences between Lucene retrieval and database retrieval. Performance: Database: a full table scan of the data, so performance is low; Lucene: first index the data, then search using the index that was built (this adds an index-creation step, but the index is created once and used many times). Relevance sorting: Database: ORDER BY id, sorted in sequence;
I had studied some design patterns before, but without any practice, and although I thought I understood them at the time, I really only half understood them. I personally think that reading books and listening to other people's explanations is not enough: you have to practice, or see the advantages for yourself, before you can understand a pattern more deeply. In the code I found that the decorator pattern was used, and quite cleverly. So I took this case into considerat
Lucene application experience and a comparison of several Chinese word segmenters:
1. Index creation and keyword search run in different systems
If you implement index creation in the back-end system and keyword search in the front-end system, and then deploy both systems under the same application server (such as Tomcat 6.0), the following occurs: (a) clicking Create index in the back end runs normally, but an exception is reported when you
Table of contents
Let the database handle exact matching and use an independent system for fuzzy matching
Data synchronization policy
Result sorting policy
Keyword indexing of search results
Author: chelong Email: chedong at bigfoot.com / chedong at chedong.com
Written: 2003/05 Last updated: 03/16/2005 16:30:45
Copyright notice: You may reprint this document freely, provided you credit the original source and author