1. Preface
When it comes to deletion, the basic idea is delete, which covers deleting documents and deleting indexes. For deleting historical data, the basic idea is to delete the data that matches a given condition with _delete_by_query.
In practice, I found:
- After documents are deleted, disk space does not shrink immediately; it actually grows.
- Is there a better approach than a scheduled task plus delete_by_query?
2. Common delete operations
2.1 Delete a single document
DELETE /twitter/_doc/1
2.2 Delete a document that satisfies a given condition
POST twitter/_delete_by_query
{
  "query": {
    "match": {
      "message": "some message"
    }
  }
}
Note: Version conflicts may occur during a bulk deletion. To make the deletion carry on past conflicts, use conflicts=proceed:
POST twitter/_doc/_delete_by_query?conflicts=proceed
{
  "query": {
    "match_all": {}
  }
}
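The two requests above can be wrapped in a small script. This is a minimal sketch, not a definitive implementation: the ES_URL variable, its localhost default, and the function names are assumptions made for illustration, and the query value is inserted verbatim, so it must not contain double quotes.

```shell
#!/bin/sh
# Sketch only: ES_URL and the helper names are illustrative assumptions.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the request body for a match query on one field.
# Values are inserted verbatim, so they must not contain double quotes.
match_query_body() {
    field=$1
    text=$2
    printf '{"query":{"match":{"%s":"%s"}}}\n' "$field" "$text"
}

# Print the _delete_by_query URL for an index, continuing past version conflicts.
delete_by_query_url() {
    index=$1
    printf '%s/%s/_delete_by_query?conflicts=proceed\n' "$ES_URL" "$index"
}

# Send the request. Only defined here, never called, so loading this file
# does not touch any cluster.
delete_matching() {
    curl -s -H 'Content-Type: application/json' \
         -XPOST "$(delete_by_query_url "$1")" \
         -d "$(match_query_body "$2" "$3")"
}
```

Calling delete_matching twitter message 'some message' would send the same request as section 2.2, with conflicts=proceed added. Printing the URL and body first is a cheap way to check what will be deleted.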
2.3 Delete a single index
DELETE /twitter
2.4 Delete all indexes
DELETE /_all
or
DELETE /*
Deleting all indexes is a very risky operation, so be careful.
3. What happens in the background when a document is deleted
The response returned after performing a deletion:
{
  "_index": "test_index",
  "_type": "test_type",
  "_id": "All",
  "_version": 2,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 2,
  "_primary_term":
}
Interpretation:
Every document in an index is versioned.
When deleting a document, you can specify a version to make sure that the document you are deleting is the one you intend to delete and that it has not changed in the meantime.
Every write operation performed on a document, including a deletion, increments its version.
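A version-checked delete could look like the sketch below. It assumes a 6.x-era cluster, where the delete API accepts a version query parameter; newer Elasticsearch releases use if_seq_no and if_primary_term for the same purpose. The ES_URL variable, its default, and the function name are illustrative assumptions.

```shell
#!/bin/sh
# Sketch only: the helper name and the default host are illustrative assumptions.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print a delete URL that carries the version we expect the document to have.
# If the stored version differs, the cluster rejects the delete with a
# version conflict instead of removing a document that has since changed.
versioned_delete_url() {
    index=$1
    id=$2
    version=$3
    printf '%s/%s/_doc/%s?version=%s\n' "$ES_URL" "$index" "$id" "$version"
}

# Usage (not executed here):
#   curl -s -XDELETE "$(versioned_delete_url twitter 1 2)"
```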
When the deletion actually takes effect:
Deleting a document doesn't immediately remove the document from disk; it just marks it as deleted. Elasticsearch will clean up deleted documents in the background as you continue to index more data.
4. The difference between deleting an index and deleting a document
1) Deleting an index frees disk space immediately; there is no "mark as deleted" logic.
2) Deleting a document works like writing a new document and marking the old one as deleted. Whether the disk space is released depends on whether the old and new documents are in the same segment file, so a background segment merge in ES may physically delete the old document while merging segment files.
However, a shard may have hundreds of segment files, so there is a good chance that the old and new documents live in different segments and cannot be physically deleted. To free space manually, the only option is to run a force merge periodically with max_num_segments set to 1.
POST /_forcemerge
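The max_num_segments setting mentioned above is passed as a query parameter on the force merge request. A minimal sketch follows; the ES_URL variable, its default, and the function name are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: the helper name and the default host are illustrative assumptions.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print a _forcemerge URL. With no index argument it targets all indexes;
# the segment count defaults to 1, as suggested in the text above.
forcemerge_url() {
    index=$1
    segments=${2:-1}
    if [ -n "$index" ]; then
        printf '%s/%s/_forcemerge?max_num_segments=%s\n' "$ES_URL" "$index" "$segments"
    else
        printf '%s/_forcemerge?max_num_segments=%s\n' "$ES_URL" "$segments"
    fi
}

# Usage (not executed here):
#   curl -s -XPOST "$(forcemerge_url 'logstash_*')"
```

Restricting the merge to one index pattern keeps the operation smaller than merging every index at once.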
5. How to keep only the last 100 days of data
With the above in mind, the task of keeping only the last 100 days of data breaks down into:
1) Use delete_by_query to select and delete the data older than 100 days;
2) Run a forcemerge to free disk space manually.
The deletion script is as follows:
#!/bin/sh
curl -H 'Content-Type: application/json' -d '{
  "query": {
    "range": {
      "pt": {
        "lt": "now-100d",
        "format": "epoch_millis"
      }
    }
  }
}
' -XPOST "http://192.168.1.101:9200/logstash_*/_delete_by_query?conflicts=proceed"
The merge script is as follows:
#!/bin/sh
curl -XPOST 'http://192.168.1.101:9200/_forcemerge?only_expunge_deletes=true&max_num_segments=1'
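The two scripts can also be combined into one job with a dry-run switch, so that the requests can be inspected before anything is deleted. This is a sketch under stated assumptions: the ES_URL, INDEX_PATTERN, and DRY_RUN variables and all function names are hypothetical, and the merge step passes only only_expunge_deletes=true, because some Elasticsearch versions do not accept it together with max_num_segments. Adjust it to match your cluster.

```shell
#!/bin/sh
# Sketch only: ES_URL, INDEX_PATTERN, DRY_RUN and the function names are
# illustrative assumptions; "pt" is the timestamp field from the script above.
ES_URL="${ES_URL:-http://192.168.1.101:9200}"
INDEX_PATTERN="${INDEX_PATTERN:-logstash_*}"
DRY_RUN="${DRY_RUN:-1}"

# Print a range query matching documents older than the given number of days.
older_than_body() {
    days=$1
    printf '{"query":{"range":{"pt":{"lt":"now-%sd","format":"epoch_millis"}}}}\n' "$days"
}

# Send a POST, or only print it when DRY_RUN=1 (the default).
post() {
    url=$1
    body=$2
    if [ "$DRY_RUN" = "1" ]; then
        printf 'POST %s\n' "$url"
        if [ -n "$body" ]; then
            printf '%s\n' "$body"
        fi
    elif [ -n "$body" ]; then
        curl -s -H 'Content-Type: application/json' -XPOST "$url" -d "$body"
    else
        curl -s -XPOST "$url"
    fi
}

# Step 1: delete old documents. Step 2: expunge the deleted documents.
retain_days() {
    days=$1
    post "$ES_URL/$INDEX_PATTERN/_delete_by_query?conflicts=proceed" \
         "$(older_than_body "$days")" || return 1
    post "$ES_URL/_forcemerge?only_expunge_deletes=true" ""
}
```

Running retain_days 100 with the default DRY_RUN=1 only prints the two requests. Setting DRY_RUN=0 sends them, for example from a daily cron entry.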
6. Is there a more general method?
Yes: use Curator, the tool from the official ES website.
6.1 Curator introduction
Main purpose: planning and managing ES indexes. It supports common operations such as create, delete, merge, reindex, and snapshot.
6.2 Curator website address
http://t.cn/RuwN0oM
Git address: https://github.com/elastic/curator
6.3 Curator installation guide
Address: http://t.cn/RuwCkBD
Note:
Blog tutorials on Curator are everywhere, but the old and new versions of Curator differ considerably, so deploy it by following the latest manual on the official website.
The command-line syntax of the old version is not supported by the new version.
6.4 Curator command-line operation
$ curator --help
Usage: curator [OPTIONS] ACTION_FILE
  Curator for Elasticsearch indices.
  See http://elastic.co/guide/en/elasticsearch/client/curator/current
Options:
  --config PATH  Path to configuration file. Default: ~/.curator/curator.yml
  --dry-run      Do not perform any changes.
  --version      Show the version and exit.
  --help         Show this message and exit.
Core:
- Configuration file config.yml: configures the ES address to connect to, the logging configuration, the log level, and so on;
- Action file action.yml: configures the actions to perform (they can be batched) and how indexes are matched (prefix match, regular-expression match, and so on).
6.5 Curator use cases
Most importantly:
Taking deletion alone as an example: Curator can easily delete indexes older than X days, provided that the index names follow a specific naming pattern, for example one index per day named like logstash_2018.04.05.
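To see why the naming pattern matters, here is a minimal sketch that picks out old indexes purely from the date in their names. It only imitates the idea and is not Curator's implementation; the function name and its arguments are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: selects indexes by the date embedded in their names.
# This is not Curator's implementation; all names are illustrative.

# Read index names from stdin and print those whose date suffix is earlier
# than the cutoff. Expects names like logstash_2018.04.05 and a cutoff
# like 2018.04.10. Names that do not follow the pattern are skipped.
indices_before() {
    prefix=$1
    cutoff=$(printf '%s' "$2" | tr -d '.')
    while IFS= read -r name; do
        case $name in
            "$prefix"[0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9])
                # 2018.04.05 becomes 20180405, so dates compare as integers.
                stamp=$(printf '%s' "${name#"$prefix"}" | tr -d '.')
                if [ "$stamp" -lt "$cutoff" ]; then
                    printf '%s\n' "$name"
                fi
                ;;
        esac
    done
}
```

For example, printf '%s\n' logstash_2018.04.05 logstash_2018.04.12 twitter | indices_before logstash_ 2018.04.10 prints only logstash_2018.04.05. An index whose name does not carry the date cannot be selected this way.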
The naming pattern must correspond to the timestring under delete_indices in action.yml.
7. Summary
- Refer to the latest documentation on the official website; documentation for older versions can easily mislead.
- Practice more; do not stop at merely knowing.
- Medcl: the new ES 6.3 release has an Index Lifecycle Management feature that makes it easy to manage index retention periods.
2018-04-22 14:51, written at home by the bedside
Author: Ming Yi World
When reposting, please credit the source. Original address:
https://blog.csdn.net/laoyang360/article/details/80038930