Earlier articles in this series imported data into Solr for word segmentation and indexing either by loading local XML files or by filling in XML directly on the admin page. In practice, however, the data often comes from a database. This article therefore takes MySQL as an example and walks through the setup in more detail, using Solr's "DataImport" feature.
1. In conf\solrconfig.xml, add a request handler to enable the data import function:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
2. Add a data source file, data-config.xml, in the conf\ directory, with the following content:
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://172.0.0.1:3306/cmntadmin" user="root" password=""/>
  <document name="content">
    <entity name="node" query="select id,username,creator from forbiduser">
      <field column="id" name="id" />
      <field column="username" name="name" />
      <field column="creator" name="contents" />
    </entity>
  </document>
</dataConfig>
This file configures the data source. The content of the entity comes from the result set of the query. Each field maps one queried column: "column" is the database column name, and "name" must match a field name defined in schema.xml.
3. Write schema.xml
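Because a "name" that is missing from schema.xml is easy to overlook, a small check can help before the first import. The following is only a sketch: it uses trimmed copies of the two files from this article, the function name unmapped_fields is made up for illustration, and it ignores dynamicField rules.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two files from this article; in practice you would
# read conf/data-config.xml and conf/schema.xml from disk instead.
DATA_CONFIG = """
<dataConfig>
  <document name="content">
    <entity name="node" query="select id,username,creator from forbiduser">
      <field column="id" name="id"/>
      <field column="username" name="name"/>
      <field column="creator" name="contents"/>
    </entity>
  </document>
</dataConfig>
"""

SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string"/>
    <field name="name" type="text_general"/>
    <field name="contents" type="text_ik"/>
  </fields>
</schema>
"""


def unmapped_fields(data_config_xml, schema_xml):
    """Return data-config field names that schema.xml does not declare.

    Hypothetical helper for illustration; dynamicField rules are not considered.
    """
    schema_names = {f.get("name") for f in ET.fromstring(schema_xml).iter("field")}
    config_names = [f.get("name") for f in ET.fromstring(data_config_xml).iter("field")]
    return [name for name in config_names if name not in schema_names]


missing = unmapped_fields(DATA_CONFIG, SCHEMA)
print(missing or "every mapped field is declared in schema.xml")
```

If the list is not empty, either add the missing field to schema.xml or change the "name" attribute in data-config.xml.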
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.5">
<fields>
  <!-- If you remove this field, you must _also_ disable the update log in solrconfig.xml
       or Solr won't start. _version_ and update log are required for SolrCloud -->
  <field name="_version_" type="long" indexed="true" stored="true"/>
  <!-- Points to the root document of a block of nested documents. Required for nested
       document support, may be removed otherwise -->
  <field name="_root_" type="string" indexed="true" stored="false"/>
  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
  <field name="name" type="text_general" indexed="true" stored="true"/>
  <field name="contents" type="text_ik" indexed="true" stored="true"/>
</fields>
<!-- Field to use to determine and enforce document uniqueness.
     Unless this field is marked with required="false", it will be a required field -->
<uniqueKey>id</uniqueKey>
<!-- DEPRECATED: The defaultSearchField is consulted by various query parsers when
     parsing a query string that isn't explicit about the field. Machine (non-user)
     generated queries are best made explicit, or they can use the "df" request parameter
     which takes precedence over this.
     Note: Un-commenting defaultSearchField will be insufficient if your request handler in
     solrconfig.xml defines "df", which takes precedence. That would need to be removed. -->
<defaultSearchField>contents</defaultSearchField>
<copyField source="name" dest="contents"/>
<solrQueryParser defaultOperator="OR"/>
<types>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
  <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
      <!-- in this example, we will only use synonyms at query time
      <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
      -->
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
  <fieldType name="text_ik" class="solr.TextField">
    <analyzer class="org.wltea.analyzer.lucene.IKAnalyzer"/>
  </fieldType>
</types>
</schema>
Important settings in schema.xml:
defaultSearchField names the field that is searched when a query does not specify one. Together with copyField, it lets a single query match values from several fields (with the settings here, an unqualified query searches contents, which also receives the value of name):
<defaultSearchField>contents</defaultSearchField>
copyField copies the value of one field into another. Here the value of name is copied into the default search field, so a search also matches against name:
<copyField source="name" dest="contents"/>
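To see the effect of these two settings from the client side, compare a query with no field against a query that names one. The sketch below only builds the URLs. It assumes a single-core Solr at the base URL used in this article with the standard /select handler; the search term "admin" and the helper select_url are made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: a single-core Solr served at this base URL with the standard
# /select handler; adjust both to match your deployment.
SOLR_BASE = "http://localhost:8080/solr"


def select_url(query, base=SOLR_BASE):
    """Build a /select URL for the given query string (hypothetical helper)."""
    return base + "/select?" + urlencode({"q": query, "wt": "json"})


# No field given: the default search field (contents) is used, and because
# of the copyField rule it also receives the values copied from name.
print(select_url("admin"))

# Field given explicitly: only the name field is searched.
print(select_url("name:admin"))
```

Opening both URLs in a browser after the import is a quick way to compare what each query matches.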
4. Add the required JAR files
Because this article uses MySQL as the data source, the MySQL driver (mysql-connector.jar) is required. The DataImport function also needs solr-dataimporthandler-4.7.2.jar and solr-dataimporthandler-extras-4.7.2.jar; these two do not need to be downloaded separately because they are already in the \dist directory of the Solr distribution.
Copy these three JAR files into the lib directory of the Solr web application under Tomcat (webapps\solr\WEB-INF\lib).
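If you repeat this setup on several machines, the copy step can be scripted. This is a minimal sketch, not part of Solr: copy_jars is a made-up helper, and the paths in the commented usage lines are placeholders for your own Solr and Tomcat locations.

```python
import shutil
from pathlib import Path


def copy_jars(jar_paths, lib_dir):
    """Copy each jar into lib_dir and return the file names that were copied."""
    lib_dir = Path(lib_dir)
    lib_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for jar in map(Path, jar_paths):
        if not jar.is_file():
            # Fail early so a missing driver is noticed before Tomcat restarts.
            raise FileNotFoundError(f"missing jar: {jar}")
        shutil.copy2(jar, lib_dir / jar.name)
        copied.append(jar.name)
    return copied


# Usage with placeholder paths (left commented out; adjust before running):
# copy_jars(
#     [
#         "mysql-connector.jar",
#         "solr-4.7.2/dist/solr-dataimporthandler-4.7.2.jar",
#         "solr-4.7.2/dist/solr-dataimporthandler-extras-4.7.2.jar",
#     ],
#     "tomcat/webapps/solr/WEB-INF/lib",
# )
```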
5. Create an index
Restart Tomcat.
a) Trigger a full import (a complete rebuild of the index) through a URL:
http://localhost:8080/solr/dataimport?command=full-import
b) Use the "DataImport" module on the admin page:
[Image: Diweb.png]
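The URL approach is also easy to script, for example from a scheduled job. The sketch below builds DataImportHandler URLs from the base URL used in this article; the helper names are made up, and the calls that need a running Solr are left commented out. The clean and commit parameters and the status command belong to DataImportHandler, but check the response format against your own Solr version.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from this article; change host and port for your setup.
DATAIMPORT_URL = "http://localhost:8080/solr/dataimport"


def dataimport_url(command, **params):
    """Build a DataImportHandler URL for the given command (hypothetical helper)."""
    query = {"command": command, "wt": "json"}
    query.update(params)
    return DATAIMPORT_URL + "?" + urlencode(query)


def call(url):
    """Send the request and return the decoded JSON response."""
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


print(dataimport_url("full-import", clean="true", commit="true"))
print(dataimport_url("status"))

# These lines need a running Solr, so they are left commented out:
# call(dataimport_url("full-import", clean="true", commit="true"))
# print(call(dataimport_url("status")).get("status"))
```

The import runs in the background, so a script would normally request the status URL repeatedly until the handler reports that it is no longer busy.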
This article is from the "Flying snail" blog; please keep this source link when reposting: http://flyingsnail.blog.51cto.com/5341669/1575075
Solr (4): Using a MySQL database as the index data source