Integrating a PHP Application with the Solr Search Engine



Why do you need a search engine? Isn't a simple database enough? For a small site, the database is enough. But for medium-sized or large applications, a search engine is the smarter choice. A small website can also use Solr to get highly relevant search results.

Imagine you are writing a search query program for an ecommerce website. The most immediate idea is the following database query statement:

SELECT * FROM products
WHERE LOWER(title) LIKE LOWER('%$phrase%')
OR LOWER(description) LIKE LOWER('%$phrase%');

This works fine as long as the exact phrase appears in the title or description. Real data is more complicated, though. Take a product titled "Apple iPhone 4G black 16GB": a search for "iPhone 16GB" returns nothing, because those two words are not adjacent in the title. You can deal with this by replacing each space with %:

$phrase = str_replace(' ', '%', $phrase);

What about "iPhone 16GB 4G"? The word order has changed, so the query no longer matches. You might add another field to handle word order. But then what about misspelled words? What about synonyms? It is hard to come up with a good solution for this kind of search system.
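The word-order problem can be reproduced without a database. The following is a minimal sketch, assuming PHP 7.4 or later; `likePattern` and `likeMatches` are hypothetical helpers introduced only for this illustration, and `likeMatches` merely emulates a case-insensitive SQL LIKE (handling `%` but not `_`).

```php
<?php
// Hypothetical helper: turn a search phrase into a LIKE pattern,
// replacing each space with % as described above.
function likePattern(string $phrase): string
{
    return '%' . str_replace(' ', '%', strtolower(trim($phrase))) . '%';
}

// Emulates a case-insensitive SQL LIKE in PHP (% = any run of characters)
// so the behaviour can be checked without a database.
function likeMatches(string $pattern, string $text): bool
{
    $parts = array_map(
        fn(string $part): string => preg_quote($part, '/'),
        explode('%', $pattern)
    );
    $regex = '/^' . implode('.*', $parts) . '$/is';

    return preg_match($regex, $text) === 1;
}

$title = 'Apple iPhone 4G black 16GB';

var_dump(likeMatches(likePattern('iPhone 16GB'), $title));    // bool(true)
var_dump(likeMatches(likePattern('iPhone 16GB 4G'), $title)); // bool(false)
```

The second search fails only because "4G" comes before "16GB" in the title, which is exactly the limitation a pattern-based query cannot get around.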

Designing a clever algorithm is not the key to solving this type of problem. Text search is resource-intensive, and putting too much load on the database is never a good idea, because a database cannot be scaled easily. You can't simply add another instance, as you can with a web server or memcached. Scaling a database requires preparation, code changes, configuration, and maintenance downtime; in short, it is expensive. The good news is that Solr was built specifically to address this type of problem.

Solr is an enterprise-level search platform based on Apache Lucene. It is fast, stable, well documented, and easy to scale. Solr has so many features that this article cannot list them all. It is also quite easy to set up.

First, download the latest version from the official website. Solr is written in Java, so you need a Java runtime environment to run it.

$ cd solr-4.1.0/example/
$ java -jar start.jar
After a few seconds you should see the following message:
2013-03-09 18:47:41.177:INFO:oejs.AbstractConnector:Started SocketConnector@0.0.0.0:8983

Solr has a web interface that listens on port 8983. Open a browser and go to http://localhost:8983/solr/.

In the navigation area on the left you'll find "collection1". In Solr, a collection is similar to a database table, and you can query its data. Click the collection and select its "Query" submenu.

The first option is called "Request-Handler (qt)" and has the default value "/select". A request handler is a predefined query configuration. If you look at the Solr config file, you'll see this:

$ vim solr-4.1.0/example/solr/collection1/conf/solrconfig.xml
<requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
        <str name="echoParams">explicit</str>
        <int name="rows">10</int>
        <str name="df">text</str>
    </lst>
</requestHandler>
The second option, "q", is the one we are most interested in. Its default value "*:*" matches any content, and if you click "Execute Query" you will get something like the following:
<?xml version="1.0" encoding="UTF-8"?>
<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">1</int>
        <lst name="params">
            <str name="indent">true</str>
            <str name="q">*:*</str>
            <str name="wt">xml</str>
        </lst>
    </lst>
    <result name="response" numFound="0" start="0"/>
</response>
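The same request can also be made outside the admin interface, since the query form simply calls the /select handler over HTTP. The sketch below builds such a URL and reads the hit count from a response body. It assumes the default host and port and the collection1 path used in this example setup; `buildSelectUrl` and `numFound` are hypothetical helper names, and the SimpleXML extension is assumed to be available.

```php
<?php
// Builds a /select URL for the example collection.
// Host, port and collection name are assumptions based on the defaults above.
function buildSelectUrl(string $q, int $rows = 10): string
{
    $params = ['q' => $q, 'wt' => 'xml', 'indent' => 'true', 'rows' => $rows];

    return 'http://localhost:8983/solr/collection1/select?' . http_build_query($params);
}

// Reads the numFound attribute from a Solr XML response body.
function numFound(string $xml): int
{
    $response = new SimpleXMLElement($xml);

    return (int) $response->result['numFound'];
}

// Shortened copy of the response shown above, so no server is needed here.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader"><int name="status">0</int></lst>
  <result name="response" numFound="0" start="0"/>
</response>
XML;

echo buildSelectUrl('*:*'), "\n";
echo numFound($sample), "\n"; // 0
```

Against a running server, the body returned by `file_get_contents(buildSelectUrl('*:*'))` could be passed to `numFound` in the same way.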
The index is empty, but that is not a problem; you just need to insert some sample data.
$ cd solr-4.1.0/example/exampledocs/
$ java -jar post.jar monitor.xml

SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/update using content-type application/xml..
POSTing file monitor.xml
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/update..

Now you can return to the query interface, and this time there will be a document returned.
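Files such as monitor.xml use Solr's XML update format: an add element containing one or more doc elements, each made of named field elements. As a rough sketch, such a message can be generated in PHP with the DOM extension; `buildAddXml` and the field values are hypothetical and chosen only for illustration.

```php
<?php
// Builds a Solr XML update message (<add><doc>...) from field => value pairs.
// An array value produces one <field> element per entry (multi-valued field).
function buildAddXml(array $fields): string
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $add = $dom->appendChild($dom->createElement('add'));
    $doc = $add->appendChild($dom->createElement('doc'));

    foreach ($fields as $name => $values) {
        foreach ((array) $values as $value) {
            $field = $dom->createElement('field');
            $field->setAttribute('name', $name);
            // createTextNode escapes characters such as & and < safely.
            $field->appendChild($dom->createTextNode((string) $value));
            $doc->appendChild($field);
        }
    }

    // Serialize only the root element, without the XML declaration.
    return $dom->saveXML($dom->documentElement);
}

echo buildAddXml(['id' => 'demo-1', 'name' => 'Example monitor', 'cat' => ['foo', 'bar']]), "\n";
```

The resulting string could be saved to a file and posted with post.jar just like the sample documents.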

The collection data structure is defined in the schema file.

$ vim solr-4.1.0/example/solr/collection1/conf/schema.xml
This file is heavily commented, so you can easily tell what each part does. If you modify the schema file, do not delete the field named "text" unless you have a good reason; other fields are copied into it, and request handlers (including /select, /browse, and so on) query it by default.
$ grep text solr-4.1.0/example/solr/collection1/conf/schema.xml | grep copy

<copyField source="cat" dest="text"/>
<copyField source="name" dest="text"/>
<copyField source="manu" dest="text"/>
<copyField source="features" dest="text"/>
<copyField source="includes" dest="text"/>
<copyField source="title" dest="text"/>
<copyField source="author" dest="text"/>
<copyField source="description" dest="text"/>
<copyField source="keywords" dest="text"/>
<copyField source="content" dest="text"/>
<copyField source="content_type" dest="text"/>
<copyField source="resourcename" dest="text"/>
<copyField source="url" dest="text"/>

In a relational database you would not want duplicate data. Solr is not a database, though: the contents of several fields are copied into the "text" field, which is the field the default request handler searches.
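The effect of these copy rules can be pictured with a small PHP sketch. This is a conceptual illustration only and not how Solr implements copyField internally; `applyCopyFields` and the sample document are hypothetical.

```php
<?php
// Conceptual illustration: copy the values of several source fields
// into one catch-all destination field, as the copyField rules do.
function applyCopyFields(array $doc, array $sources, string $dest = 'text'): array
{
    $copied = [];
    foreach ($sources as $source) {
        if (isset($doc[$source])) {
            // A field may hold a single value or several.
            $copied = array_merge($copied, (array) $doc[$source]);
        }
    }
    $doc[$dest] = $copied;

    return $doc;
}

$doc = ['id' => 'demo-1', 'title' => 'Hello World', 'cat' => ['Foo', 'Bar']];

// The "text" field now holds Foo, Bar and Hello World.
print_r(applyCopyFields($doc, ['cat', 'title'])['text']);
```

A query against the single "text" field can then match a word regardless of which source field it originally came from.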

You need a client to access Solr from PHP. I suggest the PECL extension: it is fast, its API is clear, and it is well documented. Note, however, that at the time of writing the extension is at version 1.0.2 and does not support Solr 4.x, because the protocols of Solr 3.x and 4.x differ slightly. But don't worry: I have made the necessary changes, and you can download a compatible version from https://github.com/lukaszkujawa/php-pecl-solr. I have been using it for a while and it is reliable. It differs from the official version in one respect: the SolrClient constructor takes an additional Solr version parameter. This patch should be released in the official version, after which you won't have to worry about compatibility.

$ git clone https://github.com/lukaszkujawa/php-pecl-solr.git
$ cd php-pecl-solr/
$ phpize
$ whereis php-config
php-config: /usr/bin/php-config /usr/bin/X11/php-config
$ ./configure --with-php-config=/usr/bin/php-config
$ make
$ make install

Add the following line to your php.ini:

extension=solr.so

Restart the web server:

$ /etc/init.d/apache2 restart
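Before going further, it can help to confirm that PHP actually picked up the extension. A minimal check, assuming the extension registers itself under the name "solr" (`solrExtensionStatus` is a hypothetical helper):

```php
<?php
// Reports whether the solr extension is available to this PHP runtime.
function solrExtensionStatus(): string
{
    if (!extension_loaded('solr')) {
        return 'solr extension not loaded';
    }

    // phpversion() with an extension name returns that extension's version.
    return 'solr extension loaded, version ' . phpversion('solr');
}

echo solrExtensionStatus(), "\n";
```

Keep in mind that the command-line PHP and the web server may read different php.ini files, so it is worth running the check in the same environment that will serve the application.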

Now you can write PHP to add content to the index.

<?php

$options = array(
    'hostname' => '127.0.0.1',
);

// The second argument "4.0" is for Solr 4.x; omit it for earlier versions.
$client = new SolrClient($options, "4.0");

$doc = new SolrInputDocument();

// The ID value is missing in the source; 100 is a placeholder, any unique value works.
$doc->addField('id', 100);
$doc->addField('title', 'Hello World');
$doc->addField('description', 'Example Document');
// Adding the same field twice makes it multi-valued.
$doc->addField('cat', 'Foo');
$doc->addField('cat', 'Bar');

$response = $client->addDocument($doc);

// Make the new document visible to searches.
$client->commit();

/* ------------------------------- */

$query = new SolrQuery();

$query->setQuery('hello');

// Fields to return for each matching document.
$query->addField('id')
      ->addField('title')
      ->addField('description')
      ->addField('cat');

$queryResponse = $client->query($query);

$response = $queryResponse->getResponse();

print_r($response->response->docs);

If you add more than one document, Solr handles this well when you send them together; there is no need to commit after every document.
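One way to structure this is to send documents in chunks and commit once at the end. The sketch below is only an outline: `indexInBatches` is a hypothetical helper, and the two callables are stand-ins so that it runs without a Solr server. With the PECL client, the send callable could wrap `$client->addDocuments($batch)` and the commit callable `$client->commit()`.

```php
<?php
// Sends documents in chunks and commits once at the end.
// $send and $commit are caller-supplied callables (hypothetical wiring).
function indexInBatches(array $docs, int $batchSize, callable $send, callable $commit): int
{
    $batches = 0;
    foreach (array_chunk($docs, $batchSize) as $batch) {
        $send($batch);
        $batches++;
    }
    $commit();

    return $batches;
}

// Stand-in callables that only record what happened.
$sent = [];
$commits = 0;

$batches = indexInBatches(
    range(1, 5),
    2,
    function (array $batch) use (&$sent) { $sent[] = $batch; },
    function () use (&$commits) { $commits++; }
);

echo "$batches batches, $commits commit\n"; // 3 batches, 1 commit
```

The batch size of 2 is arbitrary; in practice it would be tuned to the size of the documents.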

It is worth knowing how Solr works, because you can use it in many projects. One great feature is that a single request can return all the data you need. It takes a while to master, but it is worth the effort. Solr has an active community and thorough documentation. If you are considering using it in your project, read Apache Solr 3 Enterprise Search Server; it will help you not only to build a search service quickly but also to lay a foundation for data mining.


Article Source: http://www.oschina.net/translate/integrate-php-application-with-solr-search-engine
