Integrating a PHP application with the Solr search engine

Why do you need a search engine? Isn't a plain database enough? If you are building a small website, the database is enough. But when you build a medium or large application, a search engine is the smarter choice. Small sites can of course use Solr as well, to get highly relevant search results.

Imagine you are writing the search feature for an e-commerce site. The most straightforward approach is a database query like this:

SELECT * FROM Products WHERE LOWER(title) LIKE LOWER('%$phrase%') OR LOWER(description) LIKE LOWER('%$phrase%');

This works when the phrase appears verbatim in the title or description, but real data is messier. Take a product titled "Apple iPhone 4G Black 16GB". A search for "iPhone 16G" returns nothing, because the two words are not adjacent in the title. You can replace spaces with % to cope with this:

$phrase = str_replace(' ', '%', $phrase);

What about the query "iPhone 16GB 4G"? The word order has changed, so the pattern no longer matches. You might add a field that stores the words in another order, but what about misspelled words, or synonyms? Designing a good search system this way quickly becomes a challenge.
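The word-order problem can be reproduced without a database. The sketch below emulates a case-insensitive LIKE match with a regular expression; like_match() is a hypothetical helper written for this illustration, and it handles only the % wildcard, not the _ wildcard of real SQL.

```php
<?php
// Emulates a case-insensitive SQL LIKE match so the example runs without a database.
// like_match() is a hypothetical helper, not part of PHP or any library.
function like_match(string $pattern, string $text): bool
{
    // Split on the % wildcard, escape each literal part, then let % match any run of characters.
    $parts = array_map(fn($p) => preg_quote($p, '/'), explode('%', $pattern));
    $regex = '/^' . implode('.*', $parts) . '$/is';
    return preg_match($regex, $text) === 1;
}

$title = 'Apple iPhone 4G Black 16GB';

// Spaces replaced with %, as in the str_replace() call above.
$inOrder  = '%' . str_replace(' ', '%', 'iPhone 16G') . '%';
$reversed = '%' . str_replace(' ', '%', 'iPhone 16GB 4G') . '%';

var_dump(like_match($inOrder, $title));   // bool(true)
var_dump(like_match($reversed, $title));  // bool(false)
```

The second pattern fails only because "4G" comes before "16GB" in the title, which is exactly the case a wildcard pattern cannot express.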


A more sophisticated algorithm is not the key to this kind of problem. Text search consumes resources, and putting that load on the database is never a good idea, because a database cannot be scaled out easily. You can't simply add another instance as you would with a web server or memcached. Scaling a database requires preparation, code changes, configuration, and maintenance downtime; in short, it is expensive. The good news is that Solr is designed to solve exactly this type of problem.

Solr is an enterprise-class search platform based on Apache Lucene. It is fast, stable, well documented, and easy to scale. Solr has far too many features to list in this article, but it is easy to get started with.

First, download the latest version from the official website. Solr is written in Java, so you need a Java runtime environment to run it.

$ cd solr-4.1.0/example/
$ java -jar start.jar

In a few seconds you can see the following message:

2013-03-09 18:47:41.177:INFO:oejs.AbstractConnector:Started SocketConnector@0.0.0.0:8983

Solr has a web interface that listens on port 8983. Open a browser and go to http://localhost:8983/solr/.

In the navigation area on the left you will find "collection1". In Solr, a collection is similar to a database table: it holds the data you can query. Click the collection and select "Query" from its submenu.

The first option, called Request-Handler (qt), has a default value of "/select". Request handlers are predefined query configurations. If you look at the Solr config file, you will see this:

$ vim solr-4.1.0/example/solr/collection1/conf/solrconfig.xml

<requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
        <str name="echoParams">explicit</str>
        <int name="rows">10</int>
        <str name="df">text</str>
    </lst>
</requestHandler>
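One way to think about the defaults block: a parameter sent with the request wins, and the handler default fills in whatever the request leaves out. The sketch below mimics that merge with a plain PHP array; effective_params() is a hypothetical helper for illustration, not something Solr or the PHP client provides.

```php
<?php
// Defaults as declared for the "/select" handler in solrconfig.xml.
$handlerDefaults = [
    'echoParams' => 'explicit',
    'rows'       => 10,
    'df'         => 'text',
];

// effective_params() is a hypothetical helper that mimics how request
// parameters take precedence over handler defaults.
function effective_params(array $requestParams, array $defaults): array
{
    // The + operator keeps the left-hand value when a key exists in both arrays.
    return $requestParams + $defaults;
}

// The request asks for 5 rows; df and echoParams fall back to the defaults.
$params = effective_params(['q' => 'hello', 'rows' => 5], $handlerDefaults);
print_r($params);
```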

The second parameter, q, is the one we are most interested in. Its default value "*:*" matches everything, and if you click "Execute Query" you get something like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">1</int>
        <lst name="params">
            <str name="indent">true</str>
            <str name="q">*:*</str>
            <str name="wt">xml</str>
        </lst>
    </lst>
    <result name="response" numFound="0" start="0"/>
</response>
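The same query can be expressed as a plain HTTP URL, and the XML answer can be read with SimpleXML. The sketch below is only an illustration: build_select_url() and parse_num_found() are hypothetical helpers, the base URL assumes the default example setup with "collection1", and the XML string is a trimmed copy of the response above rather than a live request.

```php
<?php
// Trimmed copy of the response shown above; in a real application this
// string would come from an HTTP request to the /select handler.
$xml = <<<XML
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>
XML;

// Hypothetical helper: builds the URL for a query against the /select handler.
function build_select_url(string $base, string $query): string
{
    $params = ['q' => $query, 'wt' => 'xml', 'indent' => 'true'];
    return rtrim($base, '/') . '/select?' . http_build_query($params);
}

// Hypothetical helper: reads the numFound attribute of the <result> element.
function parse_num_found(string $xml): int
{
    $response = new SimpleXMLElement($xml);
    return (int) $response->result['numFound'];
}

echo build_select_url('http://localhost:8983/solr/collection1', '*:*'), "\n";
echo parse_num_found($xml), "\n";
```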

The index is empty, which is expected at this point. Insert some sample data:

$ cd solr-4.1.0/example/exampledocs/
$ java -jar post.jar monitor.xml
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/update using content-type application/xml.
POSTing file monitor.xml
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/update.

Now go back to the query interface; this time the query returns one document.
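The sample files are written in Solr's XML update format: an add element containing one doc element per document, each with named field elements. A minimal sketch of producing such a payload from PHP follows; build_add_xml() and the sample field values are hypothetical and chosen only for illustration.

```php
<?php
// build_add_xml() is a hypothetical helper that renders documents in
// Solr's XML update format: <add><doc><field name="...">value</field></doc></add>.
function build_add_xml(array $docs): string
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $add = $dom->appendChild($dom->createElement('add'));
    foreach ($docs as $fields) {
        $doc = $add->appendChild($dom->createElement('doc'));
        foreach ($fields as $name => $values) {
            // A multi-valued field is written as one <field> element per value.
            foreach ((array) $values as $value) {
                $field = $dom->createElement('field');
                $field->setAttribute('name', $name);
                // createTextNode() escapes characters such as & and < for us.
                $field->appendChild($dom->createTextNode((string) $value));
                $doc->appendChild($field);
            }
        }
    }
    // Serialize only the <add> element, without the XML declaration.
    return $dom->saveXML($add);
}

echo build_add_xml([
    ['id' => 'doc-1', 'name' => 'Sample monitor', 'cat' => ['electronics', 'monitor']],
]);
```

A payload like this could be sent to the update URL shown in the post.jar output, although the PHP client introduced later in the article hides this step.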

The collection data structure is defined in the schema file.

$ vim solr-4.1.0/example/solr/collection1/conf/schema.xml

The file is heavily commented, so it is easy to tell what each part does. If you modify the schema file, do not delete the field called "text" unless you have a good reason: other fields are copied into it, and the request handlers (including /select, /browse, and others) rely on it.

$ grep text solr-4.1.0/example/solr/collection1/conf/schema.xml | grep copy
<copyField source="cat" dest="text"/>
<copyField source="name" dest="text"/>
<copyField source="manu" dest="text"/>
<copyField source="features" dest="text"/>
<copyField source="includes" dest="text"/>
<copyField source="title" dest="text"/>
<copyField source="author" dest="text"/>
<copyField source="description" dest="text"/>
<copyField source="keywords" dest="text"/>
<copyField source="content" dest="text"/>
<copyField source="content_type" dest="text"/>
<copyField source="resourcename" dest="text"/>
<copyField source="url" dest="text"/>

In a relational database you would avoid duplicating data. Solr is not a database: a number of fields are copied into the single "text" field, which is the default search field (df) of the default request handler.
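To make the effect of the copy rules concrete, the sketch below imitates them on a PHP array. Solr does this internally at index time; apply_copy_fields() is a hypothetical helper, and the source list is only a subset of the rules shown above.

```php
<?php
// Source fields taken from the copyField rules listed above (a subset).
$copySources = ['cat', 'name', 'manu', 'features', 'title', 'description'];

// apply_copy_fields() is a hypothetical helper; Solr performs the real copy at index time.
function apply_copy_fields(array $doc, array $sources, string $dest = 'text'): array
{
    $copied = [];
    foreach ($sources as $source) {
        if (!isset($doc[$source])) {
            continue;
        }
        // Fields may hold one value or several; treat both the same way.
        foreach ((array) $doc[$source] as $value) {
            $copied[] = $value;
        }
    }
    // The destination field collects every copied value.
    $doc[$dest] = $copied;
    return $doc;
}

$doc = apply_copy_fields(
    ['id' => '1', 'title' => 'Hello world', 'description' => 'Example document', 'cat' => ['Foo', 'Bar']],
    $copySources
);
print_r($doc['text']);
```

Because every value ends up in "text", a query against that one field can match a word from the title, the description, or a category.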

Accessing Solr from PHP requires a client. I recommend the extension from PECL: it is fast, its API is clear, and its documentation is good. Note, however, that the extension is currently at version 1.0.2 and does not support Solr 4.x, because the protocols of Solr 3.x and 4.x differ slightly. Don't worry, though: I've made the necessary changes, and you can download the compatible version from https://github.com/lukaszkujawa/php-pecl-solr. I've been using it for some time and it's reliable. It differs from the official release in one respect: the SolrClient constructor takes an additional Solr version parameter. This patch will be published in the official version, so you won't have to worry about compatibility later.

$ git clone https://github.com/lukaszkujawa/php-pecl-solr.git
$ cd php-pecl-solr/
$ phpize
$ whereis php-config
php-config: /usr/bin/php-config /usr/bin/X11/php-config
$ ./configure --with-php-config=/usr/bin/php-config
$ make
$ make install

Add the following line to your php.ini:

extension=solr.so

Then restart the web server:

$ /etc/init.d/apache2 restart
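Before writing any indexing code, it can help to confirm that PHP actually picked up the extension. A minimal check, assuming the extension registers itself under the name "solr"; solr_extension_status() is a hypothetical helper built on PHP's extension_loaded() and phpversion():

```php
<?php
// solr_extension_status() is a hypothetical helper built on PHP's extension_loaded().
function solr_extension_status(): string
{
    if (!extension_loaded('solr')) {
        return 'solr extension is NOT loaded; check php.ini and restart the web server';
    }
    // phpversion() with an extension name returns that extension's version string.
    return 'solr extension loaded, version ' . phpversion('solr');
}

echo solr_extension_status(), "\n";
```

Keep in mind that the command-line PHP and the web server may read different php.ini files, so it is worth running the check in the environment that will serve the application.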

Now you can write PHP to add content to the index.

<?php

$options = array(
    'hostname' => '127.0.0.1',
);

// "4.0" is the extra version parameter for Solr 4.x; it is ignored for other versions.
$client = new SolrClient($options, "4.0");

// Build a document and add it to the index.
$doc = new SolrInputDocument();
$doc->addField('id', 'n');
$doc->addField('title', 'Hello world');
$doc->addField('description', 'Example Document');
$doc->addField('cat', 'Foo');
$doc->addField('cat', 'Bar');

$response = $client->addDocument($doc);
$client->commit();

/* ------------------------------- */

// Search for the document and list the fields to return.
$query = new SolrQuery();
$query->setQuery('hello');
$query->addField('id')
      ->addField('title')
      ->addField('description')
      ->addField('cat');

$queryResponse = $client->query($query);
$response = $queryResponse->getResponse();

print_r($response->response->docs);

If you add more than one document, Solr handles it well; there is no need to commit after each one.

It's valuable to know how Solr works, and you can use it in many projects. One great feature is that it lets you pull all the data you need in a single request. It takes some time to master, but it's worth the effort. Solr has an active community and thorough documentation, and if you are unsure about using it in your project, read through "Solr 3 Enterprise Search Server". Solr not only lets you build a search service quickly, it can also serve as a foundation for data mining.
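One possible way to structure a bulk import is to send documents in batches and commit once at the end. The sketch below is deliberately independent of any client library: index_in_batches() is a hypothetical helper, and the two callbacks are stand-ins for whatever calls your client uses to send documents and to commit.

```php
<?php
// index_in_batches() is a hypothetical helper. $addBatch and $commit stand in
// for whatever client calls the application uses to send documents and commit.
function index_in_batches(array $docs, int $batchSize, callable $addBatch, callable $commit): int
{
    $batches = 0;
    foreach (array_chunk($docs, $batchSize) as $batch) {
        $addBatch($batch);
        $batches++;
    }
    // One commit at the end instead of one per document.
    $commit();
    return $batches;
}

// Stand-in callbacks that only record what happened, so the sketch runs without Solr.
$log = [];
$batches = index_in_batches(
    range(1, 5),
    2,
    function (array $batch) use (&$log) { $log[] = 'add ' . count($batch); },
    function () use (&$log) { $log[] = 'commit'; }
);
print_r($log);
```

With five documents and a batch size of two, the log shows three add calls followed by a single commit.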
