Full-Text Search Engine Solr Series: Solr Core Concepts and Configuration Files

Last Update:2016-01-11 Source: Internet

Author: User



Document

A document is the basic unit of indexing and search in Solr. It resembles a record in a relational database table and contains one or more fields, each with a name and a text value. A field's value can be stored in the index when the field is indexed, so that it can be returned in search results. A document should usually contain an ID field that uniquely identifies it. For example:

<doc>
  <field name="id">company123</field>
  <field name="companycity">Atlanta</field>
  <field name="companystate">Georgia</field>
  <field name="companyname">Code Monkeys R Us, LLC</field>
  <field name="companydescription">we write lots of code</field>
  <field name="lastmodified">2013-06-01T15:26:37Z</field>
</doc>
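To index a document like the one above through Solr's XML update format, the <doc> element is wrapped in an <add> element. The following is a minimal Python sketch that builds such a message from a dictionary. The helper name build_add_message is hypothetical, and the sketch only builds the XML text; sending it to a Solr update endpoint is left out.

```python
import xml.etree.ElementTree as ET


def build_add_message(fields):
    """Wrap one document's fields in an <add><doc>...</doc></add> message.

    build_add_message is a hypothetical helper name used for illustration.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = str(value)
    # ElementTree escapes characters such as & and < in field values.
    return ET.tostring(add, encoding="unicode")


company = {
    "id": "company123",
    "companycity": "Atlanta",
    "companystate": "Georgia",
    "companyname": "Code Monkeys R Us, LLC",
    "companydescription": "we write lots of code",
    "lastmodified": "2013-06-01T15:26:37Z",
}

message = build_add_message(company)
print(message)
```

Building the message with an XML library rather than string concatenation keeps field values correctly escaped.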

Schema

The schema in Solr is similar to a table structure in a relational database. It is stored in the conf directory as schema.xml, and a schema must be specified when documents are added to the index. The schema file contains three main parts: fields (field), field types (fieldType), and the unique key (uniqueKey).

Field type (fieldType): defines the type of the fields in the XML documents added to the index, such as int, string, or date.
Field (field): the name of a field as it is added to the index.
Unique key (uniqueKey): the field that uniquely identifies a document; it is used when updating and deleting documents.

For example:

<schema name="example" version="1.5">
  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
  <field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
  <uniqueKey>id</uniqueKey>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</schema>
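The three parts of a schema can be read back with any XML parser. The sketch below uses Python's standard library to list the fields, field types, and unique key of a shortened copy of the example schema, and to check that every field refers to a declared type. The function names summarize_schema and undefined_types are hypothetical, and the check is only an illustration, not something Solr requires you to run.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example schema (analyzers omitted for brevity).
SCHEMA_XML = """
<schema name="example" version="1.5">
  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
  <field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
  <uniqueKey>id</uniqueKey>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100"/>
</schema>
"""


def summarize_schema(xml_text):
    """Return (fields, field_types, unique_key) found in a schema document."""
    root = ET.fromstring(xml_text.strip())
    fields = {f.get("name"): f.get("type") for f in root.iter("field")}
    field_types = {t.get("name"): t.get("class") for t in root.iter("fieldType")}
    unique_key = root.findtext("uniqueKey")
    return fields, field_types, unique_key


def undefined_types(xml_text):
    """List type names used by a field but not declared as a fieldType."""
    fields, field_types, _ = summarize_schema(xml_text)
    return sorted({t for t in fields.values() if t not in field_types})


fields, field_types, unique_key = summarize_schema(SCHEMA_XML)
print("fields:", fields)
print("field types:", field_types)
print("unique key:", unique_key)
print("undefined types:", undefined_types(SCHEMA_XML))
```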

Field

In Solr, a field is the basic unit that makes up a document, and it corresponds to a column in a database table. A field definition is metadata that includes the field's name, its type, and how the field's value is handled. For example:

<field name="name" type="text_general" indexed="true" stored="true"/>

indexed: indexed="true" means that Solr processes the field and adds it to the index; only indexed fields can be searched.
stored: stored="true" means that the field value is kept in the index as a copy of the original content and can be returned by the search component. Storing long text in the index is not recommended, for performance reasons.
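The difference between the two attributes can be pictured with a toy in-memory index. The sketch below is a simplified model written for this article, not Solr's implementation: the class ToyIndex and its whitespace tokenization are assumptions. Indexed fields feed an inverted index that makes them searchable, while stored fields keep a copy of the original value that can be returned with a hit.

```python
from collections import defaultdict


class ToyIndex:
    """Toy model of indexed vs. stored fields (not how Solr is implemented)."""

    def __init__(self, field_options):
        # field_options maps a field name to {"indexed": bool, "stored": bool}.
        self.field_options = field_options
        self.postings = defaultdict(set)  # (field, term) -> set of doc ids
        self.stored = {}                  # doc id -> {field: original value}

    def add(self, doc_id, doc):
        self.stored[doc_id] = {}
        for name, value in doc.items():
            options = self.field_options[name]
            if options["indexed"]:
                # Searchable: record each lowercased term for this field.
                for term in value.lower().split():
                    self.postings[(name, term)].add(doc_id)
            if options["stored"]:
                # Returnable: keep a copy of the original value.
                self.stored[doc_id][name] = value

    def search(self, field, term):
        doc_ids = sorted(self.postings.get((field, term.lower()), set()))
        return [self.stored[doc_id] for doc_id in doc_ids]


index = ToyIndex({
    "id": {"indexed": True, "stored": True},
    "companyname": {"indexed": True, "stored": True},
    "companydescription": {"indexed": True, "stored": False},
    "lastmodified": {"indexed": False, "stored": True},
})
index.add("company123", {
    "id": "company123",
    "companyname": "Code Monkeys R Us, LLC",
    "companydescription": "we write lots of code",
    "lastmodified": "2013-06-01T15:26:37Z",
})

# The description is searchable but is not part of the returned document.
print(index.search("companydescription", "code"))
# lastmodified is returned with hits but cannot be searched.
print(index.search("lastmodified", "2013-06-01T15:26:37Z"))
```

In this model a long description field can be searched without its full text being copied into the stored part of the index, which is the trade-off the stored attribute controls.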

Field Type

Each field in Solr has a corresponding field type, such as float, long, double, date, or text. Solr provides a rich set of field types, and you can also define custom types that suit your data, for example:

<fieldType name="text_cn_stopword" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="org.wltea.analyzer.lucene.IKAnalyzerSolrFactory" useSmart="false"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.wltea.analyzer.lucene.IKAnalyzerSolrFactory" useSmart="true"/>
  </analyzer>
</fieldType>
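A text field type can declare one analyzer for indexing and another for querying, as both field type definitions in this article do. The sketch below imitates the shape of the text_general chains from the Schema section in plain Python: a tokenizer followed by an ordered list of filters. It is a rough model under stated assumptions. The regular-expression tokenizer is far simpler than Solr's tokenizers, and the STOPWORDS and SYNONYMS values are made-up stand-ins for stopwords.txt and synonyms.txt.

```python
import re

STOPWORDS = {"a", "an", "the", "of"}     # made-up stand-in for stopwords.txt
SYNONYMS = {"tv": ["tv", "television"]}  # made-up stand-in for synonyms.txt


def tokenize(text):
    """Split text into word tokens (a crude stand-in for a real tokenizer)."""
    return re.findall(r"\w+", text)


def stop_filter(tokens):
    """Drop stop words, ignoring case."""
    return [t for t in tokens if t.lower() not in STOPWORDS]


def synonym_filter(tokens):
    """Expand each token that has synonyms into all of its synonyms."""
    expanded = []
    for t in tokens:
        expanded.extend(SYNONYMS.get(t.lower(), [t]))
    return expanded


def lowercase_filter(tokens):
    return [t.lower() for t in tokens]


def analyze(text, filters):
    """Run the tokenizer, then each filter in order."""
    tokens = tokenize(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


# Same filter order as the index and query analyzers of text_general.
INDEX_CHAIN = [stop_filter, lowercase_filter]
QUERY_CHAIN = [stop_filter, synonym_filter, lowercase_filter]

print(analyze("The History of TV", INDEX_CHAIN))
print(analyze("the TV", QUERY_CHAIN))
```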

Solrconfig

If the schema defines Solr's data model, then solrconfig.xml is Solr's runtime configuration: it defines how Solr handles requests such as indexing, highlighting, and searching, and it also specifies the cache policy. Commonly used elements include:

Specify the index data path

<dataDir>${solr.data.dir:./solr/data}</dataDir>

Cache parameters

<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
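The size attribute caps how many entries a cache may hold. With a least-recently-used (LRU) policy, the entry that has gone unused the longest is evicted once that cap is exceeded. The following Python sketch illustrates the policy only; it is not Solr's cache code, and the class name and keys are made up for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal least-recently-used cache illustrating a size limit."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.entries:
            return None
        # A hit makes the entry the most recently used one.
        self.entries.move_to_end(key)
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.size:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)


cache = LRUCache(size=2)
cache.put("q=atlanta", ["company123"])
cache.put("q=georgia", ["company123"])
cache.get("q=atlanta")                   # q=atlanta is now the most recent
cache.put("q=code", ["company123"])      # evicts q=georgia
print(list(cache.entries))
```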

Request handler

The request handler receives the HTTP request, performs the search, and returns the response. For example, a query request handler:

<requestHandler name="/query" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="wt">json</str>
    <str name="indent">true</str>
    <str name="df">text</str>
  </lst>
</requestHandler>

Each request handler includes a set of configurable search parameters, such as wt, indent, and df.
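Values listed under defaults apply only when the request itself does not supply the parameter. The Python sketch below models that merge and builds a URL for the /query handler. The host, port, and core name (localhost:8983 and companies) are assumptions for illustration, and the helper names are hypothetical; the sketch does not contact a Solr server.

```python
from urllib.parse import urlencode

# Defaults copied from the /query handler definition above.
HANDLER_DEFAULTS = {
    "echoParams": "explicit",
    "wt": "json",
    "indent": "true",
    "df": "text",
}


def effective_params(request_params, defaults=HANDLER_DEFAULTS):
    """Model how request parameters override the handler's defaults."""
    merged = dict(defaults)
    merged.update(request_params)
    return merged


def query_url(base_url, request_params):
    """Build the URL a client would request from the /query handler."""
    return base_url + "/query?" + urlencode(request_params)


# Assumed core URL; adjust to your own installation.
BASE_URL = "http://localhost:8983/solr/companies"

request_params = {"q": "code monkeys", "wt": "xml"}
print(query_url(BASE_URL, request_params))
print(effective_params(request_params))
```

Here the request asks for wt=xml, so the response format changes, while indent and df keep the values configured in the handler.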


This article is an English version of an article originally published in Chinese on aliyun.com.
