http://ayueer.spaces.live.com/blog/cns!9e99e1260983291b!1338.entry
Using Solr to build a Chinese search application
I needed to build a small search application. The data source already exists in MySQL, and the previous version searched it with MySQL LIKE queries. This upgrade needs better scalability and performance, plus a few new features.
There were several options:
1. Extend the existing MySQL LIKE approach with MySQL's full-text search. I have no experience with that feature, and after considering performance and other factors it was the first option to be ruled out.
2. Plug into the company's own search mechanism. The problem is that the search page and the search backend are currently too tightly coupled. Inside that framework it is difficult to customize ranking or change how results are displayed, and the code looks a little scary.
3. In the end we decided on Lucene + Solr. Lucene provides a powerful indexing and retrieval API. Solr wraps it well and is easy to use from various languages, so you do not have to implement your own search server on top of the Lucene API. Documents are submitted for indexing and queries are issued through HTTP requests, and results can be returned in XML or JSON, which is very convenient.
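To make the HTTP interface described in option 3 concrete, here is a minimal Python sketch of the query side. It only builds a select URL and parses a JSON response body; it does not contact a server. The base URL, the field names and the sample response are assumptions for illustration, not values from this setup.

```python
import json
from urllib.parse import urlencode

# Assumed base URL; adjust host, port and path to match your deployment.
SOLR_BASE = "http://localhost:8080/solr"


def build_select_url(query, rows=10, start=0, base=SOLR_BASE):
    """Return a Solr select URL that asks for results in JSON (wt=json)."""
    params = {"q": query, "start": start, "rows": rows, "wt": "json"}
    return base + "/select?" + urlencode(params)


def extract_docs(raw_json):
    """Pull the hit count and the documents out of a Solr JSON response body."""
    body = json.loads(raw_json)
    response = body["response"]
    return response["numFound"], response["docs"]


# Hand-written sample shaped like a Solr JSON response (hypothetical data).
SAMPLE = (
    '{"responseHeader": {"status": 0}, '
    '"response": {"numFound": 1, "start": 0, '
    '"docs": [{"id": "42", "title": "搜索"}]}}'
)

if __name__ == "__main__":
    print(build_select_url("title:搜索"))
    print(extract_docs(SAMPLE))
```

In a real client the URL would be fetched with any HTTP library and the response body passed to `extract_docs`.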
Setting up the service:
1. Download Lucene, Solr, and Tomcat and unpack them. Take the Solr archive from Solr's dist directory, put it under tomcat/webapps, and name it solr.war. Copy example/solr from the Solr directory into the current directory, or configure Tomcat to tell it where the Solr directory is. Start Tomcat and open http://localhost:port/solr/admin to confirm that Solr is running.
2. A Chinese word segmentation package is needed. I chose je-analysis.jar from jesoft. Put it in solr/lib and configure solr/conf/schema.xml:
<fieldType name="text_chinese" class="solr.TextField">
<analyzer class="jeasy.analysis.MMAnalyzer"/>
</fieldType>
3. Modify the fields in solr/conf/schema.xml to define the fields you want to search.
4. Adapt the PHP code from the Solr articles on IBM developerWorks: wrap the HTTP requests and the construction of the doc XML, and call it from PHP. For other programs, see the Apache Lucene quick-start guide.
5. That gives us the most basic search framework. What you build on top of it is up to your imagination :)
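Step 4 describes wrapping the HTTP request and the construction of the doc XML in PHP. The same idea can be sketched in Python as follows. This is not the developerWorks code; the update URL and the field names `id` and `title` are assumptions, and the request object is only built, never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed update endpoint; adjust to match your Tomcat port and Solr path.
SOLR_UPDATE_URL = "http://localhost:8080/solr/update"


def build_add_xml(docs):
    """Build a Solr <add> message from a list of {field name: value} dicts."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


def build_update_request(xml_body, url=SOLR_UPDATE_URL):
    """Wrap an XML message in an HTTP POST request (constructed, not sent)."""
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


if __name__ == "__main__":
    # Hypothetical document; the field names must exist in your schema.xml.
    xml_body = build_add_xml([{"id": 1, "title": "a & b"}])
    print(xml_body)
    request = build_update_request(xml_body)
    print(request.get_method(), request.full_url)
```

To actually index, the request would be passed to `urllib.request.urlopen`, followed by a second request whose body is `<commit/>` so that the new documents become searchable.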
References:
1. Search smarter with Apache Solr, Part 1: Essential features and the Solr schema
2. Search smarter with Apache Solr, Part 2: Solr for the enterprise
3. Apache Lucene quick-start guide
4. Use Solr to build your full-text search (My knowledge base)
5. http://jesoft.cn, JE-Analysis MMAnalyzer Chinese word segmentation.
6. Lucene Chinese word segmentation: Paoding Analysis, another excellent word segmenter, open source and pretty good.
7. Solr home page, http://lucene.apache.org/solr/. It has a good wiki covering Tomcat configuration and deployment, as well as advanced features such as faceted search and caching.