Elasticsearch: Basic Operations from Python

Last Update:2018-07-24 Source: Internet

Author: User

Tags bulk insert unique id

Environment:

Python: 2.7
ES client package: elasticsearch (elasticsearch-py)
Elasticsearch: 5.5.1 / 6.0.1
Operating system: Windows 10 / CentOS 7

This article summarizes the basic CRUD operations in ES. Many Python clients for ES exist, for example pyelasticsearch, ESClient, elasticutils, pyes, rawes, and Surfiki Refine. The author has only used one client at work, the elasticsearch package (elasticsearch-py) installed by the command below, so this article covers only that client; details on the others can be found on the official website.
Installation command: pip install elasticsearch

The client does not expose many interfaces. The discussion below is split into two categories: single operations and bulk operations.

Single Operations

Insert
create: you must specify the index, type, id, and document body.
index: more flexible than create. The id is optional: if it is specified, the document gets that id; if not, a globally unique id is generated automatically and assigned to the document.
For example:

from elasticsearch import Elasticsearch

body = {'name': 'Lucy', 'sex': 'female', 'age': 10}  # the age value was lost in the source; 10 is a placeholder
es = Elasticsearch(['localhost:9200'])
es.index(index='indexname', doc_type='typeName', body=body, id=None)
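The difference between supplying an id and letting Elasticsearch generate one can be sketched as follows. This is a minimal sketch, not the author's code: index_kwargs and save are hypothetical helper names, and es is assumed to be a client created as above.

```python
def index_kwargs(index, doc_type, body, doc_id=None):
    """Build keyword arguments for es.index(); 'id' is included only when given."""
    kwargs = {'index': index, 'doc_type': doc_type, 'body': body}
    if doc_id is not None:
        kwargs['id'] = doc_id
    return kwargs


def save(es, index, doc_type, body, doc_id=None):
    """Index one document and return the id reported in the response.

    `es` is assumed to be an Elasticsearch client instance.
    """
    response = es.index(**index_kwargs(index, doc_type, body, doc_id))
    return response['_id']


lucy = {'name': 'Lucy', 'sex': 'female'}
auto_id = index_kwargs('indexname', 'typeName', lucy)
fixed_id = index_kwargs('indexname', 'typeName', lucy, doc_id='1')
print(sorted(auto_id))   # no 'id' key, so Elasticsearch generates one
print(sorted(fixed_id))  # includes the 'id' key
```

Calling save(es, 'indexname', 'typeName', lucy) against a running cluster would then return the generated id, which is needed later for get, update, or delete.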

Remove
delete: deletes the document with the specified index, type, and id.

es.delete(index='indexname', doc_type='typeName', id='idValue')

Find
get: retrieves the document with the specified index, type, and id.

es.get(index='indexname', doc_type='typeName', id='idValue')
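The get call returns a dictionary of metadata with the stored document under the _source key. A minimal sketch of unpacking it is shown below; source_or_none is a hypothetical helper, and the sample dictionary is hand-written to resemble a response rather than captured from a server. Note that by default the client raises a not-found error for a missing id instead of returning found: False, so the second branch only applies when that error is suppressed.

```python
def source_or_none(response):
    """Return the stored document from a get() response, or None if absent."""
    if response.get('found'):
        return response.get('_source')
    return None


# Hand-written sample shaped like a get() response; not real server output.
sample = {
    '_index': 'indexname',
    '_type': 'typeName',
    '_id': 'idValue',
    '_version': 1,
    'found': True,
    '_source': {'name': 'Lucy', 'sex': 'female'},
}

found_doc = source_or_none(sample)
missing_doc = source_or_none({'found': False})
print(found_doc)
print(missing_doc)
```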

Update
update: updates the document with the specified index, type, and id. The fields to change are wrapped in a 'doc' key.

es.update(index='indexname', doc_type='typeName', id='idValue', body={'doc': {'age': 12}})  # 'age': 12 is an example field to update
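The body passed to update is not the bare fields: for a partial update they sit under a doc key, the same shape the bulk example later in this article uses. A minimal sketch, where partial_update_body is a hypothetical helper and the upsert flag is an optional extra that the original article does not cover:

```python
def partial_update_body(fields, upsert=False):
    """Wrap the changed fields in the 'doc' key that update() expects."""
    if not fields:
        raise ValueError('no fields to update')
    body = {'doc': dict(fields)}
    if upsert:
        # doc_as_upsert asks Elasticsearch to create the document from 'doc'
        # when the id does not exist yet.
        body['doc_as_upsert'] = True
    return body


update_body = partial_update_body({'age': 12})
upsert_body = partial_update_body({'age': 12}, upsert=True)
print(update_body)

# With a client `es` created as above, the call would be:
# es.update(index='indexname', doc_type='typeName', id='idValue', body=update_body)
```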

Bulk Operations

Conditional Query
search: queries all documents that match the criteria. It takes no id parameter, and index, doc_type, and body may each be None.
The body must follow the query DSL (Domain Specific Language) format.

query = {'query': {'match_all': {}}}  # find all documents

query = {'query': {'term': {'name': 'jack'}}}  # find all documents whose name is jack

query = {'query': {'range': {'age': {'gt': 11}}}}  # find all documents with age greater than 11

all_doc = es.search(index='indexname', doc_type='typeName', body=query)

print(all_doc['hits']['hits'][0])  # the first matching document
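Indexing [0] on the hits fails with an IndexError when nothing matches, so it can help to pull the documents out in one place. The following is a minimal sketch with hypothetical helper names (term_query, range_query, sources); the sample response is hand-written to mirror the hits structure used above, not taken from a real cluster.

```python
RANGE_OPERATORS = {'gt', 'gte', 'lt', 'lte'}


def term_query(field, value):
    """Build a search body that matches an exact term in one field."""
    return {'query': {'term': {field: value}}}


def range_query(field, **bounds):
    """Build a search body with range bounds such as gt=11 or lte=20."""
    unknown = set(bounds) - RANGE_OPERATORS
    if unknown:
        raise ValueError('unsupported range operators: %s' % sorted(unknown))
    return {'query': {'range': {field: bounds}}}


def sources(response):
    """Return the matching documents; an empty list when nothing matched."""
    return [hit['_source'] for hit in response['hits']['hits']]


# Hand-written sample shaped like a search() response.
sample = {'hits': {'total': 1, 'hits': [
    {'_id': '1', '_source': {'name': 'jack', 'age': 12}},
]}}

older_than_11 = range_query('age', gt=11)
matched = sources(sample)
nothing = sources({'hits': {'total': 0, 'hits': []}})
print(older_than_11)
print(matched)
```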

Conditional Deletion
delete_by_query: deletes all documents that match the criteria. The query must follow the DSL format.

query = {'query': {'match': {'sex': 'female'}}}  # delete all documents whose sex is female

query = {'query': {'range': {'age': {'lt': 11}}}}  # delete all documents with age less than 11

es.delete_by_query(index='indexname', body=query, doc_type='typeName')
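The two queries above are alternatives, since the second assignment replaces the first. If both conditions should apply to the same delete, one option is to combine them in a bool query. This is a sketch under that assumption, not something the original article shows; all_of is a hypothetical helper.

```python
def all_of(*clauses):
    """Combine query clauses so that a document must match every one."""
    if not clauses:
        raise ValueError('at least one clause is required')
    return {'query': {'bool': {'filter': list(clauses)}}}


young_women = all_of(
    {'match': {'sex': 'female'}},
    {'range': {'age': {'lt': 11}}},
)
print(young_women)

# With a client `es` created as above, the call would be:
# response = es.delete_by_query(index='indexname', doc_type='typeName', body=young_women)
# response['deleted'] then reports how many documents were removed.
```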

Conditional Update
update_by_query: updates all documents that match the criteria. The query is written the same way as for search and delete_by_query.
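The article gives no example for update_by_query, so the following is only a sketch of what a request body might look like. It assumes the change is expressed as a Painless script next to the query; update_by_query_body is a hypothetical helper. The key holding the script text is believed to be inline on ES 5.5 and source on 6.x, so it is left as a parameter; check the documentation for the version in use.

```python
def update_by_query_body(query, script_source, params=None, script_key='source'):
    """Combine a {'query': ...} body with a script describing the change."""
    script = {script_key: script_source, 'lang': 'painless'}
    if params:
        script['params'] = dict(params)
    body = {'script': script}
    body.update(query)  # adds the 'query' key
    return body


jack = {'query': {'term': {'name': 'jack'}}}
set_age = update_by_query_body(jack, 'ctx._source.age = params.age', {'age': 12})
set_age_55 = update_by_query_body(jack, 'ctx._source.age = params.age',
                                  {'age': 12}, script_key='inline')
print(set_age)

# With a client `es` created as above, the call would be:
# es.update_by_query(index='indexname', doc_type='typeName', body=set_age)
```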

Bulk insert, delete, and update
bulk: this is the method this article focuses on. All the methods above are simple, but bulk cost the author a lot of time when first learning it. It performs multiple operations in a single request, which can greatly reduce overhead for batch workloads. A single bulk request is also not limited to one kind of operation: inserts, deletes, and updates can be mixed in the same request.
Note, however, that each operation has a fixed format, and the request succeeds only if every operation fully conforms to it. Without further ado, here is the code:

# Example 1: insert four documents; each {'index': {}} header applies to the document that follows it
doc = [
    {'index': {}},
    {'name': 'jackaaa', 'age': 2000, 'sex': 'female', 'address': u'Beijing'},  # age was garbled in the source; 2000 is a placeholder
    {'index': {}},
    {'name': 'jackbbb', 'age': 3000, 'sex': 'male', 'address': u'Shanghai'},
    {'index': {}},
    {'name': 'jackccc', 'age': 4000, 'sex': 'female', 'address': u'Guangzhou'},
    {'index': {}},
    {'name': 'jackddd', 'age': 1000, 'sex': 'male', 'address': u'Shenzhen'},
]

# Example 2: index, delete, create, and update in one request
# (the three age values were garbled in the source; 10, 20, and 30 are placeholders)
doc = [
    {'index': {'_index': 'indexname', '_type': 'typeName', '_id': 'idValue'}},
    {'name': 'jack', 'sex': 'male', 'age': 10},
    {'delete': {'_index': 'indexname', '_type': 'typeName', '_id': 'idValue'}},
    {'create': {'_index': 'indexname', '_type': 'typeName', '_id': 'idValue'}},
    {'name': 'lucy', 'sex': 'female', 'age': 20},
    {'update': {'_index': 'indexname', '_type': 'typeName', '_id': 'idValue'}},
    {'doc': {'age': 30}},
]

# Either list is sent the same way
es.bulk(index='indexname', doc_type='typeName', body=doc)

The two examples above show that in a bulk request each operation must be preceded by a header line for its type (e.g. {'index': {}}, {'delete': {...}}, ...); otherwise the request fails with TransportError(u'illegal_argument_exception').
In practice, much of the work is constructing this list of dictionaries. Consider the following scenario:
To bulk insert a batch of data as in the first example, a simple approach is to build the list from an existing dataset by interleaving a header with each document. A useful Python trick here is slice assignment with [::2] and [1::2], which fills the even and odd positions of the list. Details are in the author's blog post "Python programming tips".
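The header rule can be checked locally before a request is sent. The sketch below is an illustration, not part of the client library: check_bulk_body and is_valid are hypothetical helpers, and they assume that index, create, and update headers are each followed by one document line while delete has none, as in the examples above.

```python
ACTIONS_WITH_SOURCE = ('index', 'create', 'update')


def check_bulk_body(lines):
    """Return the number of operations in a bulk body; raise ValueError if malformed."""
    count = 0
    i = 0
    while i < len(lines):
        header = lines[i]
        if not isinstance(header, dict) or len(header) != 1:
            raise ValueError('line %d is not an action header' % i)
        action = next(iter(header))
        if action == 'delete':
            i += 1  # delete carries no document line
        elif action in ACTIONS_WITH_SOURCE:
            if i + 1 >= len(lines):
                raise ValueError('%s at line %d has no document line' % (action, i))
            i += 2  # skip the header and its document
        else:
            raise ValueError('unknown action %r at line %d' % (action, i))
        count += 1
    return count


def is_valid(lines):
    """Boolean wrapper around check_bulk_body."""
    try:
        check_bulk_body(lines)
        return True
    except ValueError:
        return False


mixed = [
    {'index': {'_id': '1'}},
    {'name': 'jack'},
    {'delete': {'_id': '2'}},
]
operation_count = check_bulk_body(mixed)
print(operation_count)
print(is_valid([{'name': 'jack'}]))  # a document without a header
```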
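The slicing trick can be sketched as follows. The author's own version is in the blog post mentioned above and is not reproduced here; interleave is a hypothetical helper that assumes every document gets the same header.

```python
import copy


def interleave(docs, header=None):
    """Return a bulk body with a header line placed before every document."""
    if header is None:
        header = {'index': {}}
    docs = list(docs)
    body = [None] * (2 * len(docs))
    body[::2] = [copy.deepcopy(header) for _ in docs]  # even positions: headers
    body[1::2] = docs                                  # odd positions: documents
    return body


people = [{'name': 'jackaaa'}, {'name': 'jackbbb'}]
bulk_body = interleave(people)
print(bulk_body)

# With a client `es` created as above, the call would be:
# es.bulk(index='indexname', doc_type='typeName', body=bulk_body)
```

Each header is copied so that changing one of them later, for example to add an _id, does not affect the others.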

A complete example for this article is available on the author's GitHub.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
