Elasticsearch: basic operations from Python

Source: Internet
Author: User
Tags: bulk insert, unique id
Environment dependencies:

Python: 2.7
ES client package: pyelasticsearch
Elasticsearch: 5.5.1/6.0.1
Operating system: Windows 10/CentOS 7

This article summarizes the basic CRUD operations against Elasticsearch (ES) from Python. Many Python clients for ES exist, e.g. pyelasticsearch, ESClient, elasticutils, pyes, rawes, and Surfiki Refine. I have only used pyelasticsearch at work, so this article covers that one; for the others, see the official website.
Installation command for the client package: pip install elasticsearch
Note that this command installs the official elasticsearch package, whose Elasticsearch class is what the examples below use.

The client does not expose many interfaces. The discussion below is split into two categories: single operations and bulk operations.

Single Operations

Insert
Create: you must specify the index, type, ID, and the body of the document to store.
Index: more flexible than create. The ID is optional: if given, the document gets that ID; if omitted, a globally unique ID is generated automatically and assigned to the document.
e.g.

from elasticsearch import Elasticsearch

body = {'name': 'lucy', 'sex': 'female', 'age': 10}  # the age value is missing in the source; 10 is a placeholder
es = Elasticsearch(['localhost:9200'])
es.index(index='indexname', doc_type='typeName', body=body, id=None)
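The difference between create and index can be wrapped in a small helper. This is only a sketch: save_document is a hypothetical name, es is assumed to be an Elasticsearch client object from the official elasticsearch package, and the keyword arguments follow that client's 5.x/6.x API, where doc_type is still accepted.

```python
def save_document(es, index, doc_type, body, doc_id=None):
    """Store one document; hypothetical helper, not part of the client library."""
    if doc_id is None:
        # No ID supplied: index() lets Elasticsearch generate a unique one.
        return es.index(index=index, doc_type=doc_type, body=body)
    # ID supplied: create() is rejected if a document with that ID already exists.
    return es.create(index=index, doc_type=doc_type, id=doc_id, body=body)
```

Calling save_document(es, 'indexname', 'typeName', body) would take the index path, while passing doc_id='idValue' would take the create path and refuse to overwrite an existing document.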

Remove
Delete: deletes the document with the specified index, type, and ID.

es.delete(index='indexname', doc_type='typeName', id='idValue')

Find
Get: retrieves the document with the specified index, type, and ID.

es.get(index='indexname', doc_type='typeName', id='idValue')

Update
Update: updates the document with the specified index, type, and ID.

es.update(index='indexname', doc_type='typeName', id='idValue', body={'doc': {'field': 'new value'}})  # put the fields to update under 'doc'

Bulk Operations

Conditional Query
Search: returns all documents that match the query. It takes no id parameter, and index, doc_type, and body may all be None.
The body must follow the query DSL (Domain Specific Language) format.

query = {'query': {'match_all': {}}}  # find all documents

query = {'query': {'term': {'name': 'jack'}}}  # find all documents whose name is jack

query = {'query': {'range': {'age': {'gt': 11}}}}  # find all documents with age greater than 11

allDoc = es.search(index='indexname', doc_type='typeName', body=query)

print(allDoc['hits']['hits'][0])  # the first matching document
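A search response is a nested dictionary: the matching documents sit under hits.hits, and each hit carries its stored fields under _source. The sketch below pulls those out. Here sample_response is a hand-written stand-in for a real response, trimmed to the relevant keys, and extract_sources is a hypothetical helper name.

```python
def extract_sources(response):
    """Return the _source dictionary of every hit in a search response."""
    return [hit['_source'] for hit in response['hits']['hits']]


# Hand-written stand-in for the value returned by es.search(...).
sample_response = {
    'hits': {
        'total': 2,
        'hits': [
            {'_index': 'indexname', '_type': 'typeName', '_id': '1',
             '_source': {'name': 'jack', 'age': 12}},
            {'_index': 'indexname', '_type': 'typeName', '_id': '2',
             '_source': {'name': 'lucy', 'age': 15}},
        ],
    },
}

names = [doc['name'] for doc in extract_sources(sample_response)]
```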

Conditional Deletion
delete_by_query: deletes all documents that match the query; the query must follow the DSL format.

query = {'query': {'match': {'sex': 'female'}}}  # delete all documents whose sex is female

query = {'query': {'range': {'age': {'lt': 11}}}}  # delete all documents with age less than 11

es.delete_by_query(index='indexname', body=query, doc_type='typeName')

Conditional Update
update_by_query: updates all documents that match the query; usage is the same as for delete_by_query and search.
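The article gives no example for update_by_query, so here is a minimal sketch of a possible request body. It assumes the change is expressed as a Painless script. The inline key is the spelling used by ES 5.x (later releases prefer source), and the build_age_update name and the values are hypothetical.

```python
def build_age_update(query, new_age):
    """Build an update_by_query body that sets 'age' on every matching document."""
    return {
        'query': query,
        'script': {
            'lang': 'painless',
            # 'inline' is the ES 5.x key; newer releases prefer 'source'.
            'inline': 'ctx._source.age = params.age',
            'params': {'age': new_age},
        },
    }


body = build_age_update({'term': {'name': 'jack'}}, 12)
# With a live client, assuming the same index and type as above:
# es.update_by_query(index='indexname', doc_type='typeName', body=body)
```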

Bulk insert, delete, update
Bulk: the bulk method deserves a closer look. All the methods above are simple, but bulk cost me a lot of time when I first used it. It performs many operations in a single request, which greatly reduces overhead for batch work. Moreover, one bulk request is not limited to inserts only or deletes only: it can mix insert, delete, and update operations.
Note, however, that each operation has a fixed format, and the request succeeds only if the body conforms to it exactly. Without further ado, here is the code:

# First example: insert only, IDs generated automatically
doc = [
    {'index': {}},
    {'name': 'jackaaa', 'age': 2000, 'sex': 'female', 'address': u'Beijing'},  # age garbled in the source; 2000 is a placeholder
    {'index': {}},
    {'name': 'jackbbb', 'age': 3000, 'sex': 'male', 'address': u'Shanghai'},
    {'index': {}},
    {'name': 'jackccc', 'age': 4000, 'sex': 'female', 'address': u'Guangzhou'},
    {'index': {}},
    {'name': 'jackddd', 'age': 1000, 'sex': 'male', 'address': u'Shenzhen'},
]

# Second example: mixed operations
# (the age values here were lost in the source; the numbers are placeholders)
doc = [
    {'index': {'_index': 'indexname', '_type': 'typeName', '_id': 'idValue'}},
    {'name': 'jack', 'sex': 'male', 'age': 10},
    {'delete': {'_index': 'indexname', '_type': 'typeName', '_id': 'idValue'}},
    {'create': {'_index': 'indexname', '_type': 'typeName', '_id': 'idValue'}},
    {'name': 'lucy', 'sex': 'female', 'age': 20},
    {'update': {'_index': 'indexname', '_type': 'typeName', '_id': 'idValue'}},
    {'doc': {'age': 100}},
]
es.bulk(index='indexname', doc_type='typeName', body=doc)

The two examples above show that in a bulk request every operation must be preceded by a header line for its operation type (e.g. {'index': {}}, {'delete': {...}}, ...); otherwise the request fails with TransportError(u'illegal_argument_exception').
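Because a malformed body is only rejected once it reaches the server, it can help to check the header and payload pairing locally first. The following is a rough sketch of such a check, not something the client library provides: check_bulk_body is a hypothetical name, and it only checks that each header is followed by a payload where one is required (delete takes none).

```python
def check_bulk_body(actions):
    """Raise ValueError if header and payload lines are not paired up correctly."""
    # Which operation types must be followed by a payload line.
    needs_payload = {'index': True, 'create': True, 'update': True, 'delete': False}
    position = 0
    while position < len(actions):
        header = actions[position]
        keys = list(header)
        if len(keys) != 1 or keys[0] not in needs_payload:
            raise ValueError('line %d is not an action header: %r' % (position, header))
        position += 1
        if needs_payload[keys[0]]:
            if position >= len(actions):
                raise ValueError('%s header has no payload line' % keys[0])
            position += 1  # skip the payload that belongs to this header
    return True
```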
In practice, you often have to build such a list of dictionaries specifically for the bulk call. Consider the following scenario:
To bulk insert a batch of data, as in the first example above, a simple workaround starting from an existing dataset is to build the required list of dictionaries by interleaving two lists at odd and even positions. A handy Python trick for this is slicing with [::2] and [1::2]. Details are in my blog post: Python programming tips.
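The slicing trick can be sketched as follows. This is one possible reading of the odd-even merge and not necessarily the code from the blog post mentioned above. The to_bulk_body name and the sample records are hypothetical.

```python
def to_bulk_body(documents, action='index'):
    """Interleave one action header with every document using slice assignment."""
    body = [None] * (2 * len(documents))
    body[::2] = [{action: {}} for _ in documents]  # even positions: header lines
    body[1::2] = documents                         # odd positions: the documents
    return body


people = [
    {'name': 'jackaaa', 'sex': 'female'},
    {'name': 'jackbbb', 'sex': 'male'},
]
bulk_body = to_bulk_body(people)
# With a live client: es.bulk(index='indexname', doc_type='typeName', body=bulk_body)
```

Each header is built as a fresh dictionary, so later edits to one header (for example adding an _id) do not leak into the others.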

A complete example for this article is available on my GitHub.
 
