Solr Summary Document

Source: Internet
Author: User
Tags solr

I. Overview

It has been a while since I did preliminary research with Solr. Recently, for work reasons, I have been studying Spark on Hadoop, so Solr is on hold for now. Here I summarize how I used and understood Solr during that earlier period. My knowledge is still superficial rather than proficient, and this summary will also be handy if I switch back to indexing work in the future.

The article covers three areas: installing and using Solr, the Solr/Lucene source code structure, and the theoretical basis of indexing. It will be completed gradually.

II. Solr Installation and Use

Solr is a Lucene-based open source search platform. It indexes multiple types of data (PDF, TXT, etc.), provides full-text indexing and search, and is extensible, supporting distributed indexing and searching.

Solr is written in Java and runs as a full-text search service inside a servlet container such as Jetty or Tomcat. Solr provides a RESTful interface, so a program written in any language can communicate with it.
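Because the interface is plain HTTP, a query is just a URL. The following is a minimal sketch in Python that only builds such a URL. It assumes a Solr instance at http://localhost:8983/solr and a core named mycore; both are placeholders for illustration, not values from this article, and build_select_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10):
    """Return the URL of a Solr select request that asks for JSON results."""
    # q is the query string, rows limits the number of hits, wt selects the response format.
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


# "mycore" and the host are assumptions; replace them with your own values.
url = build_select_url("http://localhost:8983/solr", "mycore", "title:solr")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching documents as JSON if such a core exists.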

Solr Installation

Solr requires a JRE on the machine. Before downloading Solr, check the minimum JRE version it requires; you can view the JRE version of the current system with the command: java -version. Downloading and configuring the JRE is not discussed here. Stand-alone testing of Solr requires only:

1, Download Solr from the official website;

2, Decompress Solr.

Once decompressed, Solr can be run for testing. If you are adding Solr to a product, however, you can install and start it as a service with a script:

1, tar xzf solr-x.y.z.tgz solr-x.y.z/bin/install_solr_service.sh --strip-components=2

2, ./install_solr_service.sh solr-x.y.z.tgz

At this point you can use the command service solr status/start/stop to view, start, or stop the Solr service.

The install_solr_service.sh script can only be used on Linux-like systems. You can run:

./install_solr_service.sh -help

to view the parameters that the script accepts. For example:

./install_solr_service.sh solr-x.y.z.tgz -i /opt -d /var/solr -u solr -s solr -p 8983
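To make the meaning of each option easier to see, here is a small Python sketch that assembles the same command line from named parameters. The function name install_command and its parameter names are my own labels for illustration; only the flags themselves come from the script, and the defaults simply repeat the example above.

```python
import shlex


def install_command(archive, install_dir="/opt", data_dir="/var/solr",
                    user="solr", service="solr", port=8983):
    """Build the argument list for install_solr_service.sh."""
    return [
        "./install_solr_service.sh", archive,
        "-i", install_dir,   # directory Solr is installed into
        "-d", data_dir,      # directory for writable files (index, logs, PID file)
        "-u", user,          # system user that runs Solr
        "-s", service,       # service name, also the name of the init script
        "-p", str(port),     # port Solr listens on
    ]


command = install_command("solr-x.y.z.tgz")
print(shlex.join(command))
```

The list form can be passed to subprocess.run on a Linux machine where the script has been extracted; the sketch itself only prints the command.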

After installing Solr with the script, look in the /etc/init.d/ directory. It will contain a script named servicename (servicename is the value passed with -s when running the install script; the default is solr), which is why the Solr service can be started with the service command. The script contains the following:

SOLR_INSTALL_DIR=/opt/solr

SOLR_ENV=/etc/default/solr.in.sh

RUNAS=solr

SOLR_INSTALL_DIR can be set with ./install_solr_service.sh -i /your/path/to/install; SOLR_ENV is explained below.

The SOLR_ENV script mentioned above contains, for example:

SOLR_PID_DIR=/var/solr

SOLR_HOME=/var/solr/data

LOG4J_PROPS=/var/solr/log4j.properties

SOLR_LOGS_DIR=/var/solr/logs

Of the first two parameters, SOLR_PID_DIR is the directory that holds the PID file of the running Solr process. Pay particular attention to the second parameter, SOLR_HOME: it is the directory that holds both the index files and the cores, including the index files we generate, the core configuration files, and so on. Its location follows from the -d option of ./install_solr_service.sh; with the default -d /var/solr, SOLR_HOME is /var/solr/data.

The last two parameters specify the logging configuration file and the log directory. I have not studied them carefully, so they are not covered for now.
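When checking an installation, it can help to read these settings programmatically. The sketch below parses text in the simple KEY=VALUE form shown above into a dictionary. It is only an illustration: the sample text is copied from this article (with variable names in upper case), the helper name parse_env_file is hypothetical, and a real solr.in.sh is a shell script that may contain constructs this parser ignores.

```python
def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double quotes in the real file.
        settings[key.strip()] = value.strip().strip('"')
    return settings


# Sample content taken from the values listed in this article.
SAMPLE = """\
SOLR_PID_DIR=/var/solr
SOLR_HOME=/var/solr/data
LOG4J_PROPS=/var/solr/log4j.properties
SOLR_LOGS_DIR=/var/solr/logs
"""

settings = parse_env_file(SAMPLE)
print(settings["SOLR_HOME"])
```

To inspect a live system, the same function could be given the contents of /etc/default/solr.in.sh instead of the sample text.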

Solr Start-up

After that, you can start Solr directly with the command: service servicename start. The servicename in this article uses the default value solr.

If you unzipped the Solr installation package directly and did not install the service, start Solr with ./bin/solr start instead. The rest of this article, however, assumes the service approach.

After you start Solr, the current status can be queried with service solr status. Alternatively, enter the following directly in the browser:

http://ip:port/solr/

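to view the Solr interface. Here ip is the IP address of the machine running Solr (localhost if it is the local machine), and port defaults to 8983; if you passed the -p parameter when running the script, it is the value specified with -p.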
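The same check can be done from a program. The following Python sketch builds the address described above and tests whether anything answers there. The helper names solr_admin_url and is_solr_reachable are hypothetical, and an HTTP 200 response is only a rough sign that Solr is up, not a full health check.

```python
import urllib.error
import urllib.request


def solr_admin_url(host="localhost", port=8983):
    """Return the address of the Solr web interface for the given host and port."""
    return f"http://{host}:{port}/solr/"


def is_solr_reachable(url, timeout=3.0):
    """Return True if the URL answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


print(solr_admin_url())
```

Calling is_solr_reachable(solr_admin_url()) on the machine running Solr should return True once the service has started; if a different port was given with -p, pass it as the port argument.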

Now that Solr has started, how do we index documents? To follow along, we will first deploy the Solr source code in Eclipse.

Solr source code deployed in Eclipse

1. Download the Solr source code and unzip it

2. Download Ant

3. Run ant eclipse in the directory extracted from the Solr archive

4. Import the Solr project into Eclipse

5. Modify the path

6. Configure the run parameters

7. Because Solr ships with Jetty, you can compile and run it directly as a Java application

Solr Sample Code

Solr Configuration File Changes

In a project we often have to define the schema: which attributes need to be indexed, which need to be displayed, which tokenizer a given attribute uses, and what sorting information is used when searching. In Solr, this information lives in the managed-schema file. A few Solr terms need to be explained here:

Document: each document we process is converted into a Solr document. A Solr document corresponds to one of the documents we processed and describes it; physically, it is a collection of fields.

Field: an attribute. For example, a document may contain author, name, content, time, and other attributes. Fields can have different data types: the author is a string, the time is a long integer. The field type specifies the data type of a field, how values of that type are tokenized, and how they are indexed.
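To make the two terms concrete, a Solr document can be pictured as a dictionary whose keys are field names. The sketch below builds, but does not send, an HTTP request that would add one such document through Solr's JSON update handler. The core name mycore, the field names, and the field values are all made up for illustration, and build_update_request is a hypothetical helper; the fields would have to exist in the schema of a real core.

```python
import json
import urllib.request


def build_update_request(base_url, core, documents):
    """Build a POST request that sends a list of documents to the update handler."""
    body = json.dumps(documents).encode("utf-8")
    return urllib.request.Request(
        # commit=true asks Solr to make the documents searchable right away.
        f"{base_url.rstrip('/')}/{core}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# One document = one collection of fields. All names and values are examples.
doc = {
    "id": "doc-1",
    "author": "User",                  # string field
    "name": "Solr Summary Document",   # string or text field
    "time": 1500000000000,             # long field (milliseconds, arbitrary value)
}

request = build_update_request("http://localhost:8983/solr", "mycore", [doc])
print(request.full_url)
```

Passing the request to urllib.request.urlopen would send it to a running Solr instance.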

Solr configures fields and field types in the schema file, named managed-schema or schema.xml. Another of my articles gives a brief introduction to schema design, so I will not dwell on it here. In addition, if we are running SolrCloud (distributed indexing), these two files do not exist on the local file system, but we can view our configuration through the web page.

The official website describes how to change the configuration: the schema exposes a RESTful interface through which we can modify it. When the schema changes, however, the question arises of rebuilding the index for the documents that were already indexed; I have not experimented with this yet.
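As a rough illustration of that interface, the Python sketch below builds a request for the Schema API's add-field command. It assumes a core named mycore that uses the managed schema and a field type called string, as found in Solr's sample configurations; the helper name build_add_field_request and the field name author are examples only, and the request is constructed but not sent.

```python
import json
import urllib.request


def build_add_field_request(base_url, core, name, field_type,
                            indexed=True, stored=True):
    """Build a POST request that asks the Schema API to add one field."""
    payload = {
        "add-field": {
            "name": name,
            "type": field_type,
            "indexed": indexed,  # whether the field can be searched
            "stored": stored,    # whether the original value can be returned
        }
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{core}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_add_field_request(
    "http://localhost:8983/solr", "mycore", "author", "string"
)
print(request.data.decode("utf-8"))
```

Adding a field this way does not touch documents that are already indexed, which is why the reindexing question mentioned above still has to be handled separately.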

In the installation package, there are several schema examples with the following structure:

SolrCloud Start-up

SolrCloud has a built-in ZooKeeper for storing configuration, electing leaders, etc., but it is best to download and install a separate ZooKeeper, so that the failure of a SolrCloud node does not affect the entire cluster.

III. Solr Program Structure

Solr built-in Jetty

Solr program tracking

Lucene main classes

IV. Index Technology

Word-document matrix

Inverted index
