Why a search system is needed
As the number of products grows and retrieval requirements become more complex, retrieving information directly from the database can no longer meet the search needs of the product display pages.
Examples (the keyword parameter is URL-encoded UTF-8):
http://search.jd.com/search?keyword=%e8%8b%b9%e6%9e%9c&enc=utf-8
http://www.yougou.com/sr/searchKey.sc?keyword=%E5%A5%B3%E9%9E%8B%E5%A4%A9%E7%BE%8E%E6%84%8F
A search system needs to be introduced at this point.
The most commonly used frameworks for search systems are Solr and Elasticsearch. Both are built on Lucene.
This article demonstrates a search system built with Solr 4.9.0. For how to use the Solr framework itself, see:
http://lucene.apache.org/solr/
http://blog.csdn.net/puma_dong/article/details/38880699
System description
Basic information
The project demonstrates full indexing of product information, master-slave configuration, and a Dubbo interface for search.
The introductory Solr notes basically cover everyday Solr-based search applications. Solr's many other parameters call for deeper study and are best worked out step by step in practice.
The index covers the following basic content:
Product (code, style number, name, price, size code, size name, color, discount, image link, sales);
Category (name, alias, code, pinyin name);
Brand (code, Chinese and English names, aliases, pinyin name, pinyin initials);
Attribute items (attribute values) of the product;
Sorting information: sales, price, discount, etc.
For brands, categories, and the like, the English name needs to be recorded as well.
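The field list above could be turned into a Solr XML update message along the following lines. This is only a sketch using plain JDK code: the field names (id, name, price, brand, sales) are hypothetical and not taken from the project's schema; only the <add><doc><field name="..."> envelope is Solr's standard XML update format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds a Solr XML update message for one product (field names are hypothetical). */
final class ProductDocBuilder {

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Wraps the fields in the <add><doc>...</doc></add> envelope. */
    static String toAddXml(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
              .append(escapeXml(String.valueOf(e.getValue())))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, Object> product = new LinkedHashMap<>();
        product.put("id", "P1001");      // hypothetical product code
        product.put("name", "Running shoe");
        product.put("price", 299.0);
        product.put("brand", "ExampleBrand");
        product.put("sales", 42);
        System.out.println(toAddXml(product));
    }
}
```

The resulting string would be posted to the core's update handler; how the real project sends documents to Solr is not shown in this article.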
The index also needs some management and control functions, such as dirty-word masking and extension dictionaries.
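As one possible reading of dirty-word masking, the sketch below replaces each blocked word in a text with asterisks of the same length. The class name and the word list are hypothetical, matching is literal and case-sensitive, and the article does not say how the project would implement this function.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Masks blocked words in a text (hypothetical helper, not part of the project). */
final class DirtyWordMasker {
    private final Set<String> blocked;

    DirtyWordMasker(String... words) {
        this.blocked = new LinkedHashSet<>(Arrays.asList(words));
    }

    /** Replaces every literal occurrence of a blocked word with '*' characters of the same length. */
    String mask(String text) {
        String result = text;
        for (String word : blocked) {
            if (word.isEmpty()) {
                continue; // an empty word would match everywhere
            }
            char[] stars = new char[word.length()];
            Arrays.fill(stars, '*');
            result = result.replace(word, new String(stars));
        }
        return result;
    }

    public static void main(String[] args) {
        DirtyWordMasker masker = new DirtyWordMasker("fake", "spam");
        System.out.println(masker.mask("cheap fake shoes, no spam"));
    }
}
```

A production version would more likely load the word list from a managed dictionary and apply it during analysis rather than on raw strings.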
To make index building more efficient, some intermediate results may also need to be precomputed, for example each product's sales volume over the last two weeks.
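A precomputed two-week sales figure could look like the following sketch. It assumes Java 8 or later (for java.time) and a hypothetical in-memory map of daily sales; where the project actually reads its sales data from is not described here.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

/** Computes a rolling two-week sales total (hypothetical helper). */
final class SalesWindow {

    /** Sums daily sales for the 14 days ending on, and including, endDate. */
    static long twoWeekSales(Map<LocalDate, Integer> dailySales, LocalDate endDate) {
        LocalDate start = endDate.minusDays(13); // 14 days inclusive
        long total = 0;
        for (Map.Entry<LocalDate, Integer> e : dailySales.entrySet()) {
            LocalDate day = e.getKey();
            if (!day.isBefore(start) && !day.isAfter(endDate)) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<LocalDate, Integer> sales = new TreeMap<>();
        sales.put(LocalDate.of(2014, 8, 17), 100); // outside the window
        sales.put(LocalDate.of(2014, 8, 18), 5);   // first day of the window
        sales.put(LocalDate.of(2014, 8, 31), 7);   // last day of the window
        System.out.println(twoWeekSales(sales, LocalDate.of(2014, 8, 31)));
    }
}
```

Storing this total as an index field lets queries sort by recent sales without recomputing it at search time.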
Note: category aliases, brand aliases, and the like should not be maintained separately in the search system. It is better to raise them as a requirement for the product management system.
This project is only a demonstration prototype. The overall flow works, but complete index creation, the full query interface, and the management and control functions are not all implemented; they are left for when there is enough spare time.
The index is built on a schedule, for example with this crontab entry: */10 * * * * /usr/local/cl/create_index.sh &
Technical framework
The index-building project uses no framework, only basic JDK code. Scheduled tasks use crontab, and task flow control uses Linux shell commands.
The index query interface project still uses Dubbo to expose its interface.
The client uses SolrJ.
Chinese word segmentation uses IK Analyzer 2012FF_hf1.
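The project's client is SolrJ, which is not shown here. To illustrate what a search request looks like at the HTTP level, the sketch below builds a URL for Solr's select handler with plain JDK code. The base URL, the core name "products", and the sort field "sales" are assumptions; q, sort, start, rows, and wt are standard Solr query parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Builds a Solr select URL by hand (illustration only; the project uses SolrJ). */
final class SolrQueryUrl {

    /** URL-encodes a parameter value as UTF-8. */
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Searches the default field for the keyword, sorted descending by sortField. */
    static String build(String baseUrl, String keyword, String sortField, int rows) {
        return baseUrl + "/select"
            + "?q=" + enc(keyword)
            + "&sort=" + enc(sortField + " desc")
            + "&start=0&rows=" + rows
            + "&wt=json";
    }

    public static void main(String[] args) {
        // "\u82f9\u679c" is the keyword from the JD example URL ("apple").
        String url = build("http://localhost:8983/solr/products", "\u82f9\u679c", "sales", 10);
        System.out.println(url);
    }
}
```

With SolrJ the same parameters would be set on a query object instead of being concatenated into a URL.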
Code description
Predecessor project: http://blog.csdn.net/puma_dong/article/details/9854899
Latest source code: git clone [email protected]:pumadong/cl-search.git
Copyright notice: This is an original article by the blog author and may not be reproduced without permission.
Search system: an implementation based on Solr 4.9.0