SOLR Study 2

Source: Internet
Author: User
Tags: time zones, solr

1: The time problem in Solr
The time shown in Solr is eight hours behind our local time by default, because Solr stores dates in UTC while our time zone is UTC+8.

Browsing the index in Solr's web admin page therefore shows times that are eight hours early.
When the index is accessed through Java code, however, the times come back correct, so for now it is enough just to be aware of this behavior.
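The offset can be reproduced in plain Java: the same instant rendered in UTC (what Solr stores and displays) and in UTC+8 (our local zone) differs by exactly eight hours. This is only an illustration of the time-zone behavior, not Solr API code:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class SolrTimeDemo {
    // Render the same instant in a given time zone.
    static String format(Date d, String tz) {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        f.setTimeZone(TimeZone.getTimeZone(tz));
        return f.format(d);
    }

    public static void main(String[] args) {
        Date instant = new Date(0L); // the epoch, for a deterministic example
        // Solr shows the UTC value; a UTC+8 reader sees it as "eight hours less".
        System.out.println("UTC:   " + format(instant, "UTC"));   // 1970-01-01 00:00:00
        System.out.println("UTC+8: " + format(instant, "GMT+8")); // 1970-01-01 08:00:00
    }
}
```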


You can give this time field a default value. Adding default="NOW" to the last_modified field makes Solr fill it with the current time when a document omits it:
<field name="last_modified" type="date" indexed="true" stored="true" default="NOW"/>


2: Master-slave in Solr
Master-slave replication mainly addresses high query concurrency; an Nginx proxy in front of the slaves can provide load balancing.
A master-slave setup has exactly one master node and any number of slave nodes.
Data may only be added on the master; the slaves normally serve queries, and they periodically synchronize the index from the master.

Manually configuring master and slave nodes:

Specific configuration:
Master node: 192.168.1.170
vi /usr/local/solr-4.10.4/example/solr/collection1/conf/solrconfig.xml
Find the replication handler section (around line 1240) and uncomment the master configuration:
<lst name="master">
  <!-- replicateAfter controls when slaves may sync: only after the master commits, and after it starts up. -->
  <str name="replicateAfter">commit</str>
  <str name="replicateAfter">startup</str>
  <str name="confFiles">schema.xml,stopwords.txt</str>
</lst>

Slave node: 192.168.1.171
vi /usr/local/solr-4.10.4/example/solr/collection1/conf/solrconfig.xml
Find the same section (around line 1240), uncomment the slave configuration, and replace the placeholder master hostname with the real one:
<lst name="slave">
  <!-- The master's address. The core name (collection1) is implied by default; if the slave core has a different name, it must be specified explicitly. -->
  <str name="masterUrl">http://192.168.1.170:8983/solr</str>
  <!-- How often the slave polls the master for changes. -->
  <str name="pollInterval">00:00:60</str>
</lst>
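The pollInterval value is an HH:mm:ss interval (here, every 60 seconds). As a quick sanity check on the format, a small stand-alone helper (not part of Solr) converts such a value to milliseconds:

```java
public class PollInterval {
    // Parse an HH:mm:ss interval (the format pollInterval uses) into milliseconds.
    static long toMillis(String interval) {
        String[] parts = interval.split(":");
        long hours = Long.parseLong(parts[0]);
        long minutes = Long.parseLong(parts[1]);
        long seconds = Long.parseLong(parts[2]);
        return ((hours * 60 + minutes) * 60 + seconds) * 1000L;
    }

    public static void main(String[] args) {
        System.out.println(toMillis("00:00:60")); // 60000 ms, i.e. poll every 60 seconds
    }
}
```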


Start the master node, then start the slave node. Data added on the master can then be queried from the slave.


Dynamically configured master/slave:
vi /usr/local/solr-4.10.4/example/solr/collection1/conf/solrconfig.xml
Find the same section (around line 1240) and change it to the following, which reads the node's role from system properties:
<lst name="${master:master}">
  <str name="replicateAfter">commit</str>
  <str name="replicateAfter">startup</str>
  <str name="confFiles">schema.xml,stopwords.txt</str>
</lst>

<lst name="${slave:slave}">
  <str name="masterUrl">http://${masterUrl}:8983/solr</str>
  <str name="pollInterval">00:00:60</str>
</lst>
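The ${master:master} placeholder means: use the master system property if it is set, otherwise default to the literal name master; starting with -Dmaster=disabled therefore renames the <lst> so Solr ignores it. The fallback rule is the same as Java's own system-property lookup, sketched here purely as an illustration:

```java
public class PropertyDefault {
    // Mirrors Solr's ${name:default} substitution: read a system property,
    // falling back to the given default when it is not set.
    static String resolve(String name, String defaultValue) {
        return System.getProperty(name, defaultValue);
    }

    public static void main(String[] args) {
        // With no -Dmaster flag, the <lst> keeps the name "master" and stays active.
        System.out.println(resolve("master", "master"));
        // Starting with -Dmaster=disabled renames the <lst>, deactivating it.
        System.setProperty("master", "disabled");
        System.out.println(resolve("master", "master"));
    }
}
```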

Make a copy of the solr-4.10.4 directory for the other machine.

Start the master node on 192.168.1.170:
java -Dslave=disabled -DmasterUrl="" -jar start.jar
Start the slave node on 192.168.1.171:
java -Dmaster=disabled -DmasterUrl=192.168.1.170 -jar start.jar


3: The Solr master-slave replication process
How replication works

The master is not aware of the slaves. Each slave periodically polls the master for the current index version; when it discovers a newer version, it starts a replication cycle. The steps are as follows:
1. The slave issues a filelist command to fetch the list of index files. The command returns each file's metadata (size, lastModified, and so on).
2. The slave checks which of these files it already has locally, then downloads the missing ones (using the filecontent command). If a connection fails, the download terminates; it is retried up to 5 times and abandoned if it still fails.
3. Files are downloaded into a temporary directory, so an error in the middle of a download does not corrupt the slave's index.
4. The ReplicationHandler executes a commit, and the new index is loaded.
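Steps 1-3 boil down to a set difference between the master's file list and the slave's. A toy sketch with made-up segment-file names (not real Solr code) shows the idea:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class ReplicationDiff {
    // Step 2 in miniature: given the master's file list and the files already
    // present on the slave, compute which files still need to be downloaded.
    static Set<String> missingFiles(Set<String> masterFiles, Set<String> localFiles) {
        Set<String> missing = new LinkedHashSet<>(masterFiles);
        missing.removeAll(localFiles);
        return missing;
    }

    public static void main(String[] args) {
        Set<String> master = new LinkedHashSet<>(Arrays.asList("_0.fdt", "_0.fdx", "_1.fdt", "segments_2"));
        Set<String> local = new HashSet<>(Arrays.asList("_0.fdt", "_0.fdx", "segments_1"));
        // Only the files the slave lacks are fetched (into a temp dir in real Solr).
        System.out.println(missingFiles(master, local)); // [_1.fdt, segments_2]
    }
}
```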


4: SolrCloud
Why SolrCloud exists:
1) The high-concurrency problem
2) An index too large for a single node to store

Setting up SolrCloud
SolrCloud relies on ZooKeeper, so ZooKeeper must be configured and started first; assume it is running on 192.168.1.170.
The cluster is planned as follows:
The SolrCloud contains two shards, each with two nodes, one leader and one replica.
Four Solr services are therefore required.
Here we launch all 4 Solr instances on one machine to simulate the cluster.
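The reason two shards split the index is that each document is routed to exactly one shard by hashing its unique key. The sketch below uses a plain hashCode modulo purely for illustration; real SolrCloud routes on a MurmurHash range of the uniqueKey:

```java
public class ShardRouting {
    // Simplified routing: map a document id onto one of numShards shards.
    // (SolrCloud actually hashes the uniqueKey into a MurmurHash range.)
    static int shardFor(String docId, int numShards) {
        return Math.floorMod(docId.hashCode(), numShards);
    }

    public static void main(String[] args) {
        int numShards = 2;
        for (String id : new String[] {"doc-1", "doc-2", "doc-3", "doc-4"}) {
            // Each id always lands on the same shard, so lookups stay deterministic.
            System.out.println(id + " -> shard" + (shardFor(id, numShards) + 1));
        }
    }
}
```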

Specific steps
It is recommended to start from a freshly unzipped Solr.
cd solr-4.10.4
cp -r example node1
cp -r example node2
cp -r example node3
cp -r example node4


cd node1
java -DzkHost=192.168.1.170:2181 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -jar start.jar
Note: This start command specifies the ZooKeeper address, the number of shards, and the configuration to upload.
The other nodes do not need to specify the shard count or configuration again when they start.

cd node2
java -DzkHost=192.168.1.170:2181 -Djetty.port=7574 -jar start.jar


cd node3
java -DzkHost=192.168.1.170:2181 -Djetty.port=8984 -jar start.jar

cd node4
java -DzkHost=192.168.1.170:2181 -Djetty.port=7575 -jar start.jar

Once this is done, you can see the SolrCloud cluster layout in the Solr admin UI:
http://192.168.1.170:8983/solr/#/~cloud

-Djetty.port: the port Jetty listens on
-DzkHost: the ZooKeeper address
-DnumShards=2: the number of shards
-Dbootstrap_confdir=./solr/collection1/conf: the configuration directory to upload to ZooKeeper
-Dcollection.configName=myconf: the name under which the configuration is stored



Note: If the cluster layout shown in the cloud view does not match what the startup commands specified, earlier configuration information is still stored in ZooKeeper.
In that case, delete the old cluster state and re-run the commands to create the cluster:
zookeeper/bin/zkCli.sh

rmr /clusterstate.json



5: Using Java code to operate SolrCloud
Here you cannot address an individual node;
you specify the ZooKeeper address instead, and the client discovers the cluster from it.
The code is as follows:
String zkHost = "192.168.1.170:2181,192.168.1.171:2181";
CloudSolrServer server = new CloudSolrServer(zkHost);
You must tell the client which collection to use by default:
server.setDefaultCollection("collection1");
SolrQuery query = new SolrQuery();
query.set("q", "*:*");
QueryResponse response = server.query(query);
SolrDocumentList results = response.getResults();


If all nodes of one shard in the cluster are down, then for the cluster to keep answering queries from the surviving shards you need to add the following code:
query.set("distrib", false); // disable distributed search: only the node that receives the request is queried

——————————————————————————————————————————————————————————————————


1: How Solr commits
Soft commit: the data is committed to memory, where it becomes searchable.
Use this when each commit covers a small amount of data:
server.commit(true, true, true); // waitFlush, waitSearcher, softCommit = true
Hard commit: the data is flushed to disk, where it becomes searchable.
Use this for bulk commits:
server.commit();

Auto commit
Automatic soft commit: disabled by default
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>
Automatic hard commit: enabled by default, executing every 15 seconds
<autoCommit>
  <maxDocs>1000</maxDocs>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

Note: Although a soft commit only puts the data in memory, the data is not lost when Solr stops,
because Solr writes each document to its transaction log before committing; on restart, Solr replays the data from that log.
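That recovery can be pictured as replaying a log: every update is appended to the log before it is applied, so the in-memory state can be rebuilt after a crash. Below is a toy, in-memory stand-in for Solr's transaction log, not the real implementation:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TlogReplay {
    // Replaying the logged updates in order rebuilds the in-memory index
    // state that a crash before a hard commit would otherwise lose.
    static Map<String, String> replay(List<String[]> tlog) {
        Map<String, String> index = new HashMap<>();
        for (String[] update : tlog) {
            index.put(update[0], update[1]); // id -> document body; later entries win
        }
        return index;
    }

    public static void main(String[] args) {
        List<String[]> tlog = new ArrayList<>();
        tlog.add(new String[] {"1", "first doc"});
        tlog.add(new String[] {"2", "second doc"});
        tlog.add(new String[] {"1", "first doc, updated"});
        // Simulated restart: the log alone restores the latest state of each doc.
        System.out.println(replay(tlog));
    }
}
```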

A detailed explanation of automatic commits can be found in point 1 of the Solr optimization notes.


2: Solr performance optimization
Detailed explanations of each point can be found in the Solr optimization notes.
1: Tune the auto-commit configuration
2: Set each field's stored and indexed properties appropriately
3: Use the optimize feature. Note: although optimizing improves query speed, running it frequently is not recommended,
because it is expensive; run it on a regular schedule instead.
4: Tune the JVM memory settings
5: Put Solr and the application on the same server where possible
6: Use a higher (less verbose) log output level

3: Integrating Solr and HBase
The two are integrated mainly to combine Solr's query strengths with HBase's fast lookups by row key.

The concrete project is based on the "Solr+hbase original" sources that I provide; the transformation is built on top of them, mainly the two methods of the SolrUtils utility class.

For the Solr configuration to do beforehand, refer to "Solr+hbase integrated.txt".

Finally, the code completed during the class will be packaged as "solr+hbase complete".
