1. Software Download
(1). Download apache-solr-3.1.0 (the latest version at the time of writing) from the Apache official website and unzip it to E:/apache-solr-3.1.0.
(2). Download apache-tomcat-6.0.32 from the Apache official website and unzip it to E:/apache-tomcat-6.0.32.
2. Install Solr on Tomcat
(1). Edit E:/apache-tomcat-6.0.32/conf/server.xml and add URIEncoding="UTF-8" to the Connector for port 8080, so that it reads:
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443" URIEncoding="UTF-8"/>
(2). Save the following content to E:/apache-tomcat-6.0.32/conf/Catalina/localhost/solr.xml. If this directory does not exist, create it yourself.
<Context docBase="E:/apache-solr-3.1.0/dist/apache-solr-3.1.0.war" reloadable="true">
    <Environment name="solr/home" type="java.lang.String" value="E:/apache-solr-3.1.0/example/solr" override="true"/>
</Context>
The directory E:/apache-solr-3.1.0/example/solr can be used as a template for configuring Solr. It is used as the Solr home in the rest of this article.
If you copy example/solr to another directory (such as C:/soft/solr), you need to edit $SOLR_HOME/conf/solrconfig.xml and find the dataDir setting.
dataDir is the index storage directory. The default value is <dataDir>${solr.data.dir:./solr}</dataDir>, which is a relative path. Change it to the full path: <dataDir>${solr.data.dir:C:/soft/solr/data}</dataDir>
(3). Start Tomcat and open http://localhost:8080/solr/admin/ in a browser. If the Solr admin page is displayed, the configuration is successful.
3. Configuration file
(1). schema.xml is located under E:/apache-solr-3.1.0/example/solr/conf. This file plays the role of a data table definition: it defines the data types of the data added to the index. The current schema.xml is the official example and is not easy to understand, so replace its content with the following, which defines your own data types. Then prepare a few matching data files and run java -jar post.jar *.xml to build the index for users to query.
<?xml version="1.0" encoding="UTF-8"?>

<schema name="example" version="1.1">

<types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
        <analyzer>
            <tokenizer class="solr.CJKTokenizerFactory"/>
        </analyzer>
    </fieldType>
</types>

<fields>
    <field name="id" type="sint" indexed="true" stored="true" required="true"/>
    <field name="user" type="string" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="content" type="text" indexed="true" stored="true"/>
    <field name="timestamp" type="date" indexed="true" stored="true" default="NOW"/>
    <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
</fields>
<uniqueKey>id</uniqueKey>
<defaultSearchField>text</defaultSearchField>
<solrQueryParser defaultOperator="AND"/>
<copyField source="title" dest="text"/>
<copyField source="content" dest="text"/>
</schema>
(a). First, define fieldType child nodes in the types node, with attributes such as name, class, and positionIncrementGap. name is the name of the fieldType. class names the class that defines the behavior of this type; the solr. prefix is shorthand, so solr.TextField resolves to a class in the org.apache.solr.schema package, while tokenizers and filters come from the org.apache.solr.analysis package. The most important part of a fieldType definition is the analyzer used to index and query data of this type, which covers word segmentation and filtering. In this example, the text fieldType uses the CJK tokenizer that ships with Solr (solr.CJKTokenizerFactory).
(b). Next, define the specific fields (similar to the columns of a database table) in the fields node. A field definition includes name, type (one of the fieldTypes defined above), indexed, stored, multiValued, and so on.
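To make the relationship between field definitions and documents concrete, here is a minimal Python sketch that reads the field list from a trimmed copy of the schema above and checks a document against it. The helper names load_fields and check_document are hypothetical and are not part of Solr. Solr performs its own validation when data is submitted, so this only illustrates how name and required are used.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the schema from this article (only the parts needed here).
SCHEMA_XML = """<schema name="example" version="1.1">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="sint" class="solr.SortableIntField"/>
    <fieldType name="date" class="solr.DateField"/>
    <fieldType name="text" class="solr.TextField" positionIncrementGap="100"/>
  </types>
  <fields>
    <field name="id" type="sint" indexed="true" stored="true" required="true"/>
    <field name="user" type="string" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="content" type="text" indexed="true" stored="true"/>
    <field name="timestamp" type="date" indexed="true" stored="true" default="NOW"/>
    <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
  </fields>
</schema>"""


def load_fields(schema_xml):
    """Return {field name: attribute dict} for every <field> in the schema."""
    root = ET.fromstring(schema_xml)
    return {f.get("name"): dict(f.attrib) for f in root.iter("field")}


def check_document(doc, fields):
    """Return a list of problems found in doc (a plain dict of field values)."""
    problems = []
    # Every field in the document must be declared in the schema.
    for name in doc:
        if name not in fields:
            problems.append("unknown field: " + name)
    # Every field marked required="true" must be present in the document.
    for name, attrs in fields.items():
        if attrs.get("required") == "true" and name not in doc:
            problems.append("missing required field: " + name)
    return problems


FIELDS = load_fields(SCHEMA_XML)
print(check_document({"id": "1", "user": "chenlb"}, FIELDS))
print(check_document({"user": "chenlb", "author": "x"}, FIELDS))
```

The first call reports no problems. The second reports an undeclared field (author) and the missing required id.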
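(2). Restart Tomcat. The following error is reported:
org.apache.solr.common.SolrException: QueryElevationComponent requires the schema to have a uniqueKeyField implemented using StrField
The error occurs because the uniqueKey field id is defined above as sint rather than as a StrField-based type, and the solrconfig.xml shipped with this Solr version enables the QueryElevationComponent, which requires a StrField unique key. Earlier versions did not report this error. To solve the problem, delete the following two nodes (the elevation component and its request handler) from solrconfig.xml: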
<!-- a search component that enables you to configure the top results for
     a given query regardless of the normal lucene scoring. -->
<searchComponent name="elevator" class="solr.QueryElevationComponent">
    <!-- pick a fieldType to analyze queries -->
    <str name="queryFieldType">string</str>
    <str name="config-file">elevate.xml</str>
</searchComponent>

<!-- a request handler utilizing the elevator component -->
<requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
        <str name="echoParams">explicit</str>
    </lst>
    <arr name="last-components">
        <str>elevator</str>
    </arr>
</requestHandler>
Manually create two XML data files in E:/apache-solr-3.1.0/example/exampledocs and save them as demo-doc1.xml and demo-doc2.xml. The contents of these files must match the data structure defined in schema.xml. demo-doc1.xml is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<add>
    <doc>
        <field name="id">1</field>
        <field name="user">chenlb</field>
        <field name="title">SOLR application speech</field>
        <field name="content">the first section describes how to submit data to the server for indexing, here we have some data, such as the server. You can try to find it.</field>
    </doc>
</add>
demo-doc2.xml:
<?xml version="1.0" encoding="UTF-8"?>
<add>
    <doc>
        <field name="id">2</field>
        <field name="user">Bory.Chan</field>
        <field name="title">Search Engine</field>
        <field name="content">the search server has a lot of data.</field>
        <field name="timestamp">2009-02-18T00:00:00Z</field>
    </doc>
    <doc>
        <field name="id">3</field>
        <field name="user">Other</field>
        <field name="title">what is this</field>
        <field name="content">what kind of sports do you like? Basketball?</field>
        <field name="timestamp">2009-02-18T12:33:05.123Z</field>
    </doc>
</add>
Windows saves text files in ANSI encoding by default. Note that these two files must be saved as UTF-8; otherwise an error is reported when they are submitted for indexing.
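One way to avoid the encoding problem is to generate the data files from a script instead of saving them from a text editor. The following is a minimal Python sketch using only the standard library. The helper names write_solr_doc and is_utf8 are hypothetical, and the demo writes to a temporary directory rather than to exampledocs.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_solr_doc(path, docs):
    """Write a Solr <add> file for a list of {field name: value} dicts, as UTF-8."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    # encoding="UTF-8" makes ElementTree emit UTF-8 bytes plus an XML declaration.
    ET.ElementTree(add).write(path, encoding="UTF-8", xml_declaration=True)


def is_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    with open(path, "rb") as fh:
        data = fh.read()
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Demo in a temporary directory so nothing is written next to the real example docs.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo-doc1.xml")
write_solr_doc(demo_path, [{"id": 1, "user": "chenlb", "title": "搜索引擎"}])
print(is_utf8(demo_path))
```

is_utf8 can also be run against files saved by hand, to check them before submitting.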
To submit the data for indexing, go to E:/apache-solr-3.1.0/example/exampledocs and run:
E:/apache-solr-3.1.0/example/exampledocs>java -Durl=http://localhost:8080/solr/update -Dcommit=yes -jar post.jar demo-doc*.xml
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8080/solr/update..
SimplePostTool: POSTing file demo-doc1.xml
SimplePostTool: POSTing file demo-doc2.xml
SimplePostTool: COMMITting Solr index changes..
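post.jar sends each XML file to the update URL over HTTP. As a rough illustration of the same idea, the Python sketch below builds an equivalent POST request with the standard library. The function names are hypothetical. The Content-Type value and the use of a <commit/> message are assumptions based on Solr's XML update format, not a copy of how post.jar is implemented. post_file is defined but not called, because it needs a running Solr instance.

```python
import urllib.request

UPDATE_URL = "http://localhost:8080/solr/update"  # same URL as -Durl above


def build_update_request(xml_bytes, url=UPDATE_URL):
    """Build (but do not send) a POST request carrying XML for the update handler."""
    return urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def post_file(path, url=UPDATE_URL):
    """POST one XML file, then a <commit/>; needs a running Solr to succeed."""
    with open(path, "rb") as fh:
        body = fh.read()
    for payload in (body, b"<commit/>"):
        with urllib.request.urlopen(build_update_request(payload, url)) as resp:
            resp.read()


# Only build the request here; nothing is sent over the network.
req = build_update_request(b"<add><doc><field name='id'>1</field></doc></add>")
print(req.get_method(), req.full_url)
```

Unlike post.jar, which commits once after all files have been posted, this sketch commits after every file. That is simpler but less efficient for many files.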
Now look in the E:/apache-solr-3.1.0/example/solr/data/index directory. It contains files similar to those generated when Lucene creates an index.
View search results:
Search for user = Bory.Chan: http://localhost:8080/solr/select/?q=user%3ABory.Chan&version=2.2&start=0&rows=10&indent=on
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
        <str name="indent">on</str>
        <str name="start">0</str>
        <str name="q">user:Bory.Chan</str>
        <str name="rows">10</str>
        <str name="version">2.2</str>
    </lst>
</lst>
<result name="response" numFound="1" start="0">
    <doc>
        <str name="content">the search server has a lot of data.</str>
        <int name="id">2</int>
        <date name="timestamp">2009-02-18T00:00:00Z</date>
        <str name="title">Search Engine</str>
        <str name="user">Bory.Chan</str>
    </doc>
</result>
</response>
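The query URL and the XML response can also be handled from a program. The sketch below, in Python with the standard library only, builds the same select URL and extracts numFound and the stored fields from a response. The function names are hypothetical, and SAMPLE is a shortened version of the response shown above, not live output.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SELECT_URL = "http://localhost:8080/solr/select/"


def build_query_url(q, start=0, rows=10):
    """Return the select URL for query q with the same parameters as above."""
    params = {"q": q, "version": "2.2", "start": start, "rows": rows, "indent": "on"}
    # urlencode percent-encodes reserved characters, e.g. ":" becomes "%3A".
    return SELECT_URL + "?" + urlencode(params)


def parse_response(xml_text):
    """Return (numFound, list of {field name: text}) from a Solr XML response."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    # Single-valued fields only; multi-valued <arr> fields would need extra handling.
    docs = [{f.get("name"): f.text for f in doc} for doc in result.findall("doc")]
    return int(result.get("numFound")), docs


SAMPLE = """<response>
<lst name="responseHeader"><int name="status">0</int></lst>
<result name="response" numFound="1" start="0">
  <doc>
    <int name="id">2</int>
    <str name="title">Search Engine</str>
    <str name="user">Bory.Chan</str>
  </doc>
</result>
</response>"""

print(build_query_url("user:Bory.Chan"))
print(parse_response(SAMPLE))
```

Fetching the URL with urllib.request.urlopen and passing the body to parse_response would complete the round trip against a running server.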
After this simple example, you should have a basic understanding of Solr. Next, we will look at how to add Chinese word segmentation and how to build your own application server with Solr.
Reference links:
1. http://blog.chenlb.com/2009/05/apache-solr-quick-start-and-demo.html
2. http://lianj-lee.iteye.com/blog/424693
This article makes some changes on the basis of my predecessors' work, and I will continue to credit the source in the future. This is a matter of respect for the fruits of others' work!
Tailism can be said to be a kind of plagiarism. Please respect originality!