Elasticsearch data migration to InfluxDB (Python)
Requirement: part of the data in Elasticsearch needs to be migrated to InfluxDB.
Migrations from MySQL or InfluxDB into Elasticsearch are easy to find; migrations from Elasticsearch into InfluxDB are not. The data being migrated here is real-time traffic data, and a time-series database such as InfluxDB handles this kind of data noticeably better.
Solution: fetch the data from Elasticsearch in bulk. There are two schemes: 1. from/size paging 2. scroll (similar to a database cursor). The script uses the second, scroll-based scheme: it issues an Elasticsearch query, then loops by passing the returned scroll_id back to the scroll API and writes each batch into InfluxDB. A sketch of the first scheme is shown below for comparison, followed by the full script.
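For comparison, a minimal sketch of the first scheme (from/size paging) might look like the following. The index name, query body, and connection URL are taken from the script below; the offset loop itself is an illustrative assumption, not the author's code. Deep paging with from/size gets expensive and is capped by Elasticsearch for large offsets, which is why the script uses scroll instead.

# from/size paging sketch (scheme 1), assuming the same index and query as the script below
from elasticsearch import Elasticsearch

es = Elasticsearch("http://192.168.121.33:9202/", timeout=120)
offset = 0
page_size = 100
while True:
    res = es.search(index='pipefilter_meters*',
                    body={"query": {"match_all": {}}, "from": offset, "size": page_size})
    hits = res['hits']['hits']
    if not hits:
        break
    for hit in hits:
        pass  # process hit['_source'] here
    offset += page_size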
#!/usr/bin/env python
#coding=utf-8
import sys
import json
import datetime
import elasticsearch
from influxdb import InfluxDBClient


# connect to Elasticsearch
class Es(object):
    @classmethod
    def connect_host(cls):
        url = "http://192.168.121.33:9202/"
        es = elasticsearch.Elasticsearch(url, timeout=120)
        return es

es = Es.connect_host()

# connect to InfluxDB
client = InfluxDBClient(host="192.168.121.33", port=8086,
                        username='admin', password='admin', database='ESL')
client.create_database('ESL')

# DSL query body
data = {
    "query": {
        "match_all": {}
    },
    "size": 100
}

# fields to keep in the filtered response
return_fields = [
    '_scroll_id',
    'hits.hits._source.resource_id',
    'hits.hits._source.timestamp',
    'hits.hits._source.counter_volume',
]

# search_type="scan" returns a _scroll_id, which is passed to es.scroll to fetch the data
res = es.search(index='pipefilter_meters*', doc_type='canaledge.flow.bytes',
                body=data, search_type="scan", scroll="10m")
scroll_id = res['_scroll_id']
response = es.scroll(scroll_id=scroll_id, scroll="10m", filter_path=return_fields)
scroll_id = response['_scroll_id']   # get the second scroll_id
hits = response['hits']['hits']
in_data = []

while len(hits) > 0:
    for i in hits:
        res_id = i['_source']['resource_id']
        r_id, r_type = res_id.split(':')
        datas = {
            "measurement": "es_net",
            "tags": {
                "resource_id": r_id,
                "type": r_type
            },
            "time": i['_source']['timestamp'],
            "fields": {
                "counter_volume": i['_source']['counter_volume']
            }
        }
        in_data.append(datas)

    # write this batch into InfluxDB
    client.write_points(in_data)
    in_data = []   # reset the list to empty after each batch

    # fetch the next batch with the last scroll_id
    response = es.scroll(scroll_id=scroll_id, scroll="10m", filter_path=return_fields)
    # debugging
    #if not response.get('hits'):
    #    print response
    #    sys.exit(1)
    #else:
    hits = response['hits']['hits']
    scroll_id = response["_scroll_id"]   # get the next scroll_id
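After the script finishes, the write can be spot-checked from the same InfluxDBClient. This is a usage sketch rather than part of the original script; the database name ESL and measurement name es_net come from the script above, and the COUNT query is only one way to verify the result.

# quick sanity check of the migrated points (illustrative, not part of the original script)
from influxdb import InfluxDBClient

client = InfluxDBClient(host="192.168.121.33", port=8086,
                        username='admin', password='admin', database='ESL')
result = client.query('SELECT COUNT("counter_volume") FROM "es_net"')
print(list(result.get_points()))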
This article is from the "Rusty Old Gun" technology blog; please contact the author before reproducing it.