Handling an Elasticsearch disk-space-full fault

Source: Internet
Author: User

Yesterday a customer complained that the ES data directory had reached 100% usage. My first thought was: the monitoring team has an 80% disk-usage alarm, so why wasn't this caught? It turned out the monitoring team had received the alarm several days earlier and ignored it. I was stunned, and honestly had the urge to curse: this is the production network, not a test environment, and I could hardly believe I had such an irresponsible colleague. I logged on to the servers and saw the worst disk at 100% and several others at 99%. A cluster that had been running for only two months was out of space. There should not have been that much data; I had dug myself a big pit by storing all the data on a single disk.
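The 80% alarm that was ignored is simple to implement. Below is a minimal sketch of such a check; the 80% threshold and the /data01, /data02 paths are assumptions taken from this story, not from any real monitoring system:

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return the used fraction of the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def check_alarms(paths, threshold=80.0):
    """Return the paths whose disk usage exceeds the alarm threshold."""
    return [p for p in paths if disk_usage_percent(p) > threshold]


# Example: alert on the (hypothetical) ES data directories.
for p in check_alarms(["/"]):
    print(f"ALARM: {p} is above 80% disk usage")
```

Run from cron every few minutes, a check like this would have flagged the problem weeks before the disk hit 100%.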


Two options came to mind at the time:


Option 1: Combine the other disks into a large LVM volume group and present them as one big disk. The benefit is that it is easy to operate, but the drawback is that data storage stays centralized, which would hurt subsequent insert and query performance, so I kept this option as a last resort.
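For reference, option 1 would look roughly like the following. This is only a sketch of the LVM approach; the device names /dev/sdb and /dev/sdc and the volume/mount names are hypothetical, and this was not the option ultimately taken:

```shell
# Create physical volumes on the spare disks (hypothetical devices).
pvcreate /dev/sdb /dev/sdc
# Group them into a single volume group.
vgcreate vg_esdata /dev/sdb /dev/sdc
# Carve one logical volume spanning all free space in the group.
lvcreate -l 100%FREE -n lv_esdata vg_esdata
# Make a filesystem and mount it as the single large ES data disk.
mkfs.xfs /dev/vg_esdata/lv_esdata
mount /dev/vg_esdata/lv_esdata /data
```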

Option 2: Configure multiple disks via path.data. My knowledge of ES was not very deep at the time; I had only built clusters and used them in simple ways. As far as I understood it, this option should be feasible, but the customer asked for evidence of feasibility, so I built a test ES cluster and collected the evidence below. (This is the option that eventually solved the problem.)

1. Stop the upstream application, the ES cluster, and the scheduled jobs.
2. Back up the existing ES data to the new disk (about 1 hour for 500 GB), to allow a rollback.
3. Modify the ES configuration file so that path.data lists multiple disks.
4. ES automatically syncs data files onto the other disks' space (completed in about 1 hour).
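Step 3 amounts to an elasticsearch.yml change along these lines (a sketch; the directory names follow the /data01, /data02 layout used in the tests below):

```yaml
# elasticsearch.yml: list every data disk. ES places shards across
# these paths; an individual shard lives on one path, it is not striped.
path.data:
  - /data01/es5/data
  - /data02/es5/data
```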


Test steps
Test 1: Start only one node, with a single directory, /data1, configured as the data path in the ES configuration file. Add data to ES; the data exists only in that node's configured directory.
The client indexes a document:
curl -XPUT 'datanode10:9200/customer/external/1?pretty' -d '
{
  "name": "John Doe"
}'
The data seen through the head plugin is as follows:

The conclusion: all the shards are on one machine, and the replica shards are in the unassigned state (with a single node there is no second node to hold replica copies).

Test 2: Start only one node, but modify the ES configuration file so the data directory lists two disks, /data01 and /data02, then start ES. ES automatically syncs data across the two data directories, and after a new index is added, ES automatically balances it across both directories.

The figure is like the one above, except that the data shards are now spread across the two data directories.
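The balancing seen in test 2 can be illustrated with a toy model: when several data paths are configured, each new shard can simply be placed on the path with the most free space. This is only an illustration of the idea, not ES's actual allocation code, and the paths and sizes are made up:

```python
def pick_path(free_bytes: dict) -> str:
    """Pick the data path with the most free space (toy model of
    spreading shards across multiple path.data directories)."""
    return max(free_bytes, key=free_bytes.get)


def place_shards(num_shards: int, free_bytes: dict, shard_size: int):
    """Greedily assign shards to paths, updating free space as we go."""
    placement = {}
    free = dict(free_bytes)
    for shard in range(num_shards):
        path = pick_path(free)
        placement[shard] = path
        free[path] -= shard_size
    return placement


# With equal free space, 5 shards end up spread over both directories,
# matching the shard layout observed in the tree listings below.
print(place_shards(5, {"/data01": 100, "/data02": 100}, shard_size=10))
```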

On datanode10, look at the two directories /data01 and /data02; note the directory structure below carefully:

[root@datanode10 nodes]# tree /data01/es5/data/nodes/0/indices/
/data01/es5/data/nodes/0/indices/
├── _2nxgrdbqwqetalngh_xaq
│   ├── 0
│   │   ├── index
│   │   │   ├── segments_2
│   │   │   └── write.lock
│   │   ├── _state
│   │   │   └── state-1.st
│   │   └── translog
│   │       ├── translog-1.ckp
│   │       ├── translog-1.tlog
│   │       ├── translog-2.tlog
│   │       └── translog.ckp
│   ├── 1
│   │   ├── index
│   │   │   ├── segments_2
│   │   │   └── write.lock
│   │   ├── _state
│   │   │   └── state-1.st
│   │   └── translog
│   │       ├── translog-1.ckp
│   │       ├── translog-1.tlog
│   │       ├── translog-2.tlog
│   │       └── translog.ckp
│   ├── 2
│   │   ├── index
│   │   │   ├── segments_2
│   │   │   └── write.lock
│   │   ├── _state
│   │   │   └── state-1.st
│   │   └── translog
│   │       ├── translog-1.ckp
│   │       ├── translog-1.tlog
│   │       ├── translog-2.tlog
│   │       └── translog.ckp
│   ├── 3
│   │   ├── index
│   │   │   ├── _0.cfe
│   │   │   ├── _0.cfs
│   │   │   ├── _0.si
│   │   │   ├── segments_4
│   │   │   └── write.lock
│   │   ├── _state
│   │   │   └── state-1.st
│   │   └── translog
│   │       ├── translog-2.ckp
│   │       ├── translog-2.tlog
│   │       ├── translog-3.tlog
│   │       └── translog.ckp
│   ├── 4
│   │   ├── index
│   │   │   ├── segments_2
│   │   │   └── write.lock
│   │   ├── _state
│   │   │   └── state-1.st
│   │   └── translog
│   │       ├── translog-1.ckp
│   │       ├── translog-1.tlog
│   │       ├── translog-2.tlog
│   │       └── translog.ckp
│   └── _state
│       └── state-12.st
└── zytkimacr4ckb53vg4xynw
    ├── 0
    │   ├── index
    │   │   ├── segments_2
    │   │   └── write.lock
    │   ├── _state
    │   │   └── state-0.st
    │   └── translog
    │       ├── translog-1.tlog
    │       └── translog.ckp
    ├── 3
    │   ├── index
    │   │   ├── _0.cfe
    │   │   ├── _0.cfs
    │   │   ├── _0.si
    │   │   ├── segments_1
    │   │   └── write.lock
    │   ├── _state
    │   │   └── state-0.st
    │   └── translog
    │       ├── translog-1.tlog
    │       └── translog.ckp
    └── _state
        └── state-7.st

directories, files

[root@datanode10 nodes]#

[root@datanode10 nodes]# tree /data02/es5/data/nodes/0/indices/
/data02/es5/data/nodes/0/indices/
├── _2nxgrdbqwqetalngh_xaq
│   └── _state
│       └── state-12.st
└── zytkimacr4ckb53vg4xynw
    ├── 1
    │   ├── index
    │   │   ├── segments_1
    │   │   └── write.lock
    │   ├── _state
    │   │   └── state-0.st
    │   └── translog
    │       ├── translog-1.tlog
    │       └── translog.ckp
    ├── 2
    │   ├── index
    │   │   ├── segments_2
    │   │   └── write.lock
    │   ├── _state
    │   │   └── state-0.st
    │   └── translog
    │       ├── translog-1.tlog
    │       └── translog.ckp
    ├── 4
    │   ├── index
    │   │   ├── segments_2
    │   │   └── write.lock
    │   ├── _state
    │   │   └── state-0.st
    │   └── translog
    │       ├── translog-1.tlog
    │       └── translog.ckp
    └── _state
        └── state-7.st

directories, files


Test 3: Start 2 nodes. Remove /data02 from the previous node's ES configuration, add a second ES machine whose configuration points the data directory at /data1, and start ES. The two existing indexes automatically sync so that both machines hold data under /data1.

