While working on container monitoring recently, I ran into InfluxDB. After two days of wrestling with it, I have a rough grasp of its routines, so I'm writing this record as a memo.
The summary is as follows:
InfluxDB is written in Go.
By default, InfluxDB associates a newly created database with the autogen RP (retention policy), which means the data is retained forever.
Differences between monitoring and logging
Monitoring watches the physical health of a service (is it still alive? is it sick? are the metrics normal?).
Log collection, by contrast, analyzes the mental state of a service (a resume/diary of the service).
How to build a monitoring system
Ref: 1190000011082379
Suppose you wanted to build a monitoring system that records the CPU idle rate every minute. How would you do it?
1. Set up a database to hold the data.
2. Write a script that fetches the relevant CPU data, attaches a timestamp, and saves it to the database.
3. Create a scheduled task that runs the script once a minute.
4. Write a simple program that queries the data from the database and plots it as a chart using the timestamps.
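As an illustration of the collection-script step, here is a minimal, hypothetical Python sketch that parses the aggregate `cpu` line of Linux's /proc/stat and builds one timestamped record. Note the counters in /proc/stat are cumulative since boot, so a real per-minute script would diff two samples before writing the point to the database; the record layout here is just an assumption for illustration.

```python
import time


def parse_cpu_idle(stat_line):
    """Return the idle fraction from the aggregate 'cpu' line of /proc/stat.

    The columns after the 'cpu' label are user, nice, system, idle, iowait, ...;
    idle is the 4th numeric column.
    """
    counters = [int(x) for x in stat_line.split()[1:]]
    return counters[3] / sum(counters)


def cpu_idle_point():
    """Build one timestamped record, ready to be written to the database."""
    with open("/proc/stat") as f:
        idle = parse_cpu_idle(f.readline())
    return {
        "measurement": "cpu",
        "fields": {"idle": idle},
        "time": int(time.time() * 1e9),  # unix-nano timestamp
    }
```

A cron entry such as `* * * * * /usr/local/bin/record_cpu.py` would then cover the "once a minute" step.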
Telegraf (collector) + InfluxDB (storage) + Grafana (display). Grafana's routine is basically similar to Kibana's: set aggregation rules on top of query conditions, display them on a suitable chart, and combine several charts into a dashboard. Users familiar with Kibana should find it very easy to get started. In addition, Grafana's visualization is much better than Kibana's, and version 4+ integrates alerting.
Grafana Host Monitoring:
Host monitoring was previously done with Metricbeat, at the process level.
Monitoring comparison: InfluxDB vs Prometheus
Feature comparison
Reference: http://gitbook.cn/books/59395d3d5863cf478e6b50ba/index.html
InfluxDB builds on familiar concepts: its query syntax resembles SQL and its engine evolved from LSM trees, so the learning curve is relatively low. InfluxDB supports floats, integers, strings, and booleans; Prometheus currently supports only floats. InfluxDB's time precision is nanoseconds; Prometheus's is milliseconds. InfluxDB is only a database, whereas Prometheus provides a complete monitoring solution (although InfluxDB offers a full monitoring stack as well). InfluxDB supports fewer math functions than Prometheus, but what it has is sufficient for current use. InfluxDB supports event logs; Prometheus does not.
Note: the comparison above is against Prometheus v1; Prometheus now has a v2 that is reportedly even more formidable than InfluxDB. Also note that InfluxDB's clustering solution is closed source.
Features and characteristics of InfluxDB
Chinese translation of the official InfluxDB documentation (it's great): https://jasper-zhang1.gitbooks.io/influxdb/content/ and https://jasper-zhang1.gitbooks.io/influxdb/content/concepts/key_concepts.html
Reference: http://www.ttlsa.com/monitor-safe/monitor/distributed-time-series-database-influxdb/
- InfluxDB has three main features:
1. Time Series: flexible use of time-related functions (such as max, min, sum, etc.)
2. Metrics: real-time computation over large volumes of data
3. Events: support for arbitrary event data; in other words, we can operate on the data of any kind of event
- InfluxDB features (reference: http://dbaplus.cn/news-73-1291-1.html):
Schemaless: any number of columns; no special dependencies, nearly out-of-the-box (whereas, e.g., ElasticSearch needs Java); built-in data expiry; built-in permission management, fine-grained down to the "table" level; native HTTP support with a built-in HTTP API; a powerful SQL-like syntax supporting min, max, sum, count, mean, median and a whole series of functions, convenient for statistics.
InfluxDB Best Practices
1. Log in, create a database, and run queries
Reference: https://jasper-zhang1.gitbooks.io/influxdb/content/Introduction/getting_start.html
influx -precision rfc3339   # -precision sets the format/precision of timestamps shown in query results
CREATE DATABASE mydb
SHOW DATABASES
USE mydb
INSERT cpu,host=serverA,region=us_west value=0.64
SELECT "host", "region", "value" FROM "cpu"
INSERT temperature,machine=unit42,type=assembly external=25,internal=37
SELECT * FROM "temperature"
SELECT * FROM /.*/ LIMIT 1
SELECT * FROM "cpu_load_short"
SELECT * FROM "cpu_load_short" WHERE "value" > 0.9
2. Understand InfluxDB's basic concepts
Reference: http://dbaplus.cn/news-73-1291-1.html
Nouns in InfluxDB | Concepts in a traditional database
------------------|-----------------------------------
Database          | Database
Measurement       | Table in the database
Points            | A row of data in the table
Unique concepts in InfluxDB
A point corresponds to a row of data in a traditional database, as shown in the table above. A point is made up of a timestamp (time), data (fields), and labels (tags).
Line-protocol format
<measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]
INSERT temperature,machine=unit42,type=assembly external=25,internal=37
More examples:
cpu,host=serverA,region=us_west value=0.64
payment,device=mobile,product=Notepad,method=credit billed=33,licenses=3i 1434067467100293230
stock,symbol=AAPL bid=127.46,ask=127.48
temperature,machine=unit42,type=assembly external=25,internal=37 1434067467000000000
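To make the format concrete, here is a simplified, illustrative Python helper (not part of any official client; escaping is reduced to commas, spaces, and equals signs, and tag/field keys are emitted in sorted order) that assembles a line-protocol string from a measurement, tags, fields, and an optional unix-nano timestamp:

```python
def to_line_protocol(measurement, tags=None, fields=None, timestamp=None):
    """Assemble a (simplified) InfluxDB line-protocol string."""
    def esc(s):
        # minimal escaping of commas, spaces and equals signs
        return str(s).replace(",", r"\,").replace(" ", r"\ ").replace("=", r"\=")

    parts = [esc(measurement)]
    for key in sorted(tags or {}):
        parts.append("{}={}".format(esc(key), esc(tags[key])))
    line = ",".join(parts)

    field_parts = []
    for key in sorted(fields or {}):
        value = fields[key]
        if isinstance(value, bool):          # check bool before int: bool is an int subclass
            encoded = "true" if value else "false"
        elif isinstance(value, int):
            encoded = "{}i".format(value)    # integer fields carry an 'i' suffix
        elif isinstance(value, str):
            encoded = '"{}"'.format(value)   # string fields are double-quoted
        else:
            encoded = repr(value)            # floats
        field_parts.append("{}={}".format(esc(key), encoded))
    line += " " + ",".join(field_parts)

    if timestamp is not None:
        line += " {}".format(timestamp)      # unix-nano timestamp is optional
    return line
```

For example, `to_line_protocol("payment", {"device": "mobile"}, {"licenses": 3}, 1434067467100293230)` reproduces the `payment,device=mobile licenses=3i 1434067467100293230` shape shown above.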
Tag: indexed. In the examples above, location and server are tag keys, and us and host1 are tag values. Tags are optional, but it is best to add them when writing data, because tags are indexed. Tag values can only be strings.
Field: in the example above, temperature is the field key and 82 is the field value. Field values are what get displayed; the supported types are floats, integers, strings, and booleans.
Timestamp: RFC3339 UTC format, precise to nanoseconds by default; optional.
Series: the collection of data that shares the same measurement, tag set, and retention policy. Understanding this concept is crucial, because series metadata is kept in memory; too many series will cause an OOM.
Retention Policy: a retention policy sets how long data is kept and how many replicas exist in the cluster. The default configuration is the autogen RP: data retained forever, one replica. These settings can be changed when creating a database.
Continuous Query: CQs are pre-configured queries that are executed automatically and periodically, writing their results into a specified measurement. This feature is mainly used for data aggregation; see the CQ documentation for details.
Shard: stores the data for a fixed time interval. Each directory corresponds to one shard, and the directory name is the shard id. Every shard has its own cache, WAL, TSM files, and compactor. The goal is to use time to quickly locate the resources relevant to a query, speeding up queries, and also making later bulk deletion of data very simple and efficient.
3. Hands-on practice: understanding point, measurement, series, the field set, and the (indexed) tag set
Insert the following data into the library:
Property              | Value
----------------------|--------------------------
library name (database) | my_database
measurement           | census
field keys            | butterflies and honeybees
tag keys              | location and scientist
name: census
----------------------------------------------------------------
time                  butterflies  honeybees  location  scientist
2015-08-18T00:00:00Z  12           23         1         langstroth
2015-08-18T00:00:00Z  1            30         1         perpetua
2015-08-18T00:06:00Z  11           28         1         langstroth
2015-08-18T00:06:00Z  3            28         1         perpetua
2015-08-18T05:54:00Z  2            11         2         langstroth
2015-08-18T06:00:00Z  1            10         2         langstroth
2015-08-18T06:06:00Z  8            23         2         perpetua
2015-08-18T06:12:00Z  7            22         2         perpetua
The INSERT statements are as follows:
INSERT census,location=1,scientist=langstroth butterflies=12,honeybees=23
INSERT census,location=1,scientist=perpetua butterflies=1,honeybees=30
INSERT census,location=1,scientist=langstroth butterflies=11,honeybees=28
INSERT census,location=1,scientist=perpetua butterflies=3,honeybees=28
INSERT census,location=2,scientist=langstroth butterflies=2,honeybees=11
INSERT census,location=2,scientist=langstroth butterflies=1,honeybees=10
INSERT census,location=2,scientist=perpetua butterflies=8,honeybees=23
INSERT census,location=2,scientist=perpetua butterflies=7,honeybees=22
- The two scripts below generate the data, simulating inserts at a fixed interval with randomly assigned values:
$ cat fake_data.sh
#!/bin/bash
arr=(
    'INSERT orders,website=30 phone=10'
    'INSERT orders,website=39 phone=12'
    'INSERT orders,website=56 phone=11'
)
for ((i=0; i<${#arr[*]}; i++)); do
    /usr/bin/influx -database 'my_food' -execute "${arr[i]}"
    sleep 10
done
$ cat data.sh
#!/bin/bash
function rand() {
    min=$1
    max=$(($2 - $min + 1))
    num=$(date +%s%N)
    echo $(($num % $max + $min))
}
while :; do
    /usr/bin/influx -database 'my_database' -execute "INSERT census,location=2,scientist=perpetua butterflies=$(rand 1 50),honeybees=$(rand 1 50)"
    sleep 2
done
A field value is your actual data; it can be a string, a float, an integer, or a boolean. Because InfluxDB is a time-series database, a field value is always associated with a timestamp. The field values in the example are:
12 23
1 30
11 28
3 28
2 11
1 10
8 23
7 22
In the data above, each pairing of field keys and field values makes up a field set; the sample data contains eight field sets:
butterflies = 12, honeybees = 23
butterflies = 1, honeybees = 30
butterflies = 11, honeybees = 28
butterflies = 3, honeybees = 28
butterflies = 2, honeybees = 11
butterflies = 1, honeybees = 10
butterflies = 8, honeybees = 23
butterflies = 7, honeybees = 22
Note that fields are not indexed. A query that filters on a field value must scan every value that matches the other conditions, so such queries perform much worse than queries on tags (described below).
In the data above, a tag set is a distinct combination of tag keys and tag values; there are four tag sets in the sample data:
location = 1, scientist = langstroth
location = 2, scientist = langstroth
location = 1, scientist = perpetua
location = 2, scientist = perpetua
Now that you're familiar with measurements, tag sets, and retention policies, it's time to talk about series. In InfluxDB, a series is the collection of data that shares a common retention policy, measurement, and tag set. The data above consists of four series, one per tag set (all with the autogen retention policy and the census measurement).
Understanding series is essential when designing your data schema and working with data inside InfluxDB. Finally, a point is the field set of a single series at a single timestamp. For example, this is a point:
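The series count can be checked mechanically. This small illustrative snippet keys each point by (retention policy, measurement, tag set), mirroring the definition above, and confirms that the census data contains four series even though there are more rows than that:

```python
def series_of(point, retention_policy="autogen"):
    # A series is identified by retention policy + measurement + tag set
    return (retention_policy, point["measurement"], frozenset(point["tags"].items()))


census_points = [
    {"measurement": "census", "tags": {"location": "1", "scientist": "langstroth"}},
    {"measurement": "census", "tags": {"location": "1", "scientist": "perpetua"}},
    {"measurement": "census", "tags": {"location": "2", "scientist": "langstroth"}},
    {"measurement": "census", "tags": {"location": "2", "scientist": "perpetua"}},
    # a repeated tag set at a later timestamp does NOT create a new series
    {"measurement": "census", "tags": {"location": "2", "scientist": "perpetua"}},
]
distinct_series = {series_of(p) for p in census_points}
```

This is also why tag cardinality matters: every new tag-value combination creates another series, and series metadata lives in memory.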
name: census
----------------------------------------------------------------
time                  butterflies  honeybees  location  scientist
2015-08-18T00:00:00Z  1            30         1         perpetua
The series this point belongs to has the retention policy autogen, the measurement census, and the tag set location = 1, scientist = perpetua. The point's timestamp is 2015-08-18T00:00:00Z.
Downsampling data: understanding CQ and RP
A continuous query (CQ) is an InfluxQL query that runs automatically and periodically within a database. A CQ needs a function in its SELECT statement and must include a GROUP BY time() clause.
A retention policy (RP) is part of InfluxDB's data schema and describes how long InfluxDB keeps data. InfluxDB compares the server's local timestamp with the timestamps of your data and removes data older than the DURATION you set in the RP. A single database can have multiple RPs, but each data point belongs to exactly one RP.
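Conceptually, an RP's expiration behaves like the following toy Python function (in reality InfluxDB drops whole shards rather than individual points; this is just the retention rule in miniature, with made-up sample values):

```python
def enforce_retention(points, duration_ns, now_ns):
    """Keep only points whose age, relative to 'now', is within the RP's DURATION."""
    return [p for p in points if now_ns - p["time"] <= duration_ns]


HOUR_NS = 3600 * 10**9          # DURATION 1h, expressed in nanoseconds
now = 1_500_000_000 * 10**9     # a hypothetical 'current' unix-nano timestamp

points = [
    {"time": now - 10 * 10**9, "value": 30},   # 10 seconds old: kept
    {"time": now - 2 * HOUR_NS, "value": 56},  # 2 hours old: dropped by a 1h RP
]
kept = enforce_retention(points, HOUR_NS, now)
```

Under a `DURATION 1h` policy only the 10-second-old point survives.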
Example data: database food_data, measurement orders
name: orders
------------------------------------
time                  phone  website
2016-05-10T23:18:00Z  10     30
2016-05-10T23:18:10Z  12     39
2016-05-10T23:18:20Z  11     56
Goal:
Automatically delete raw 2-second-interval data older than 1h      --> done with an RP
Automatically delete 30-second-interval data older than 5min      --> done with an RP
Automatically aggregate the 2-second data into 30-second intervals --> done with a CQ
Insert data every 2 seconds (see the fake-data script above), then run:
CREATE DATABASE food_data
CREATE RETENTION POLICY "a_hour" ON "food_data" DURATION 1h REPLICATION 1 DEFAULT
CREATE RETENTION POLICY "a_week" ON "food_data" DURATION 1w REPLICATION 1
CREATE CONTINUOUS QUERY "cq_10s" ON "food_data" BEGIN SELECT mean("website") AS "mean_website",mean("phone") AS "mean_phone" INTO "a_week"."downsampled_orders" FROM "orders" GROUP BY time(10s) END
When you create a database in step 1, InfluxDB automatically generates an RP named autogen and makes it the database's default; autogen retains data forever. After running the commands above, a_hour replaces autogen as the default RP for food_data.
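What the CQ computes can be pictured in plain Python: group (timestamp, value) samples into fixed 10-second buckets and take the mean of each bucket. A hypothetical sketch with made-up sample values:

```python
def downsample_mean(samples, interval_ns):
    """Average (unix-nano timestamp, value) samples over fixed time buckets,
    mimicking SELECT mean(...) ... GROUP BY time(10s)."""
    buckets = {}
    for ts, value in samples:
        bucket_start = ts - ts % interval_ns   # floor the timestamp to its bucket
        buckets.setdefault(bucket_start, []).append(value)
    return {start: sum(vals) / len(vals) for start, vals in buckets.items()}


SECOND_NS = 10**9
website = [(0, 30), (2 * SECOND_NS, 39), (4 * SECOND_NS, 56), (12 * SECOND_NS, 10)]
means = downsample_mean(website, 10 * SECOND_NS)
```

The first three samples fall in the [0s, 10s) bucket and are averaged; the last lands alone in [10s, 20s). The real CQ does this continuously and writes the results into "a_week"."downsampled_orders".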
Verify:
select * from "a_week"."downsampled_orders";
select * from "orders";
Influxdb Data Aggregation
Reference
# Measurement names can be matched with a regex
select * from /.*/ limit 1
# Query all data in a measurement
select * from cpu_idle
# Query data with values greater than 200
select * from response_times where value > 200
# Query data containing a given string
select * from user_events where url_base = 'friends#show'
# Roughly equivalent to
select line from log_lines where line =~ /[email protected]/
# Aggregate in 30m intervals, over the time range since yesterday, where the hostname is server1
select mean(value) from cpu_idle group by time(30m) where time > now() - 1d and hostName = 'server1'
select column_one from foo where time > now() - 1h limit 1000;
select reqtime, url from web9999.httpd where reqtime > 2.5;
select reqtime, url from web9999.httpd where time > now() - 1h limit 1000;
# Match URLs containing "login", here those starting with /login/
select reqtime, url from web9999.httpd where url =~ /^\/login\//;
# You can also merge data across measurements
select reqtime, url from web9999.httpd merge web0001.httpd;
Influxdb Backup Recovery
Reference: http://stedolan.github.io/jq/
#!/bin/bash

function usage() {
    echo -e >&2 "usage: $0 [dump|restore] DATABASE [options...]
\t-u username\t(default: root)
\t-p password\t(default: root)
\t-h host\t\t(default: localhost:8086)
\t-s\t\t(use HTTPS)"
}

function parse_options() {
    if [ "$#" -lt 2 ]; then usage; exit 1; fi
    username=root
    password=root
    host=localhost:8086
    https=0
    shift
    database=$1
    shift
    while getopts u:p:h:s opts; do
        case "${opts}" in
            u) username="${OPTARG}" ;;
            p) password="${OPTARG}" ;;
            h) host="${OPTARG}" ;;
            s) https=1 ;;
            ?) usage; exit 1 ;;
        esac
    done
    if [ "${https}" -eq 1 ]; then
        scheme="https"
    else
        scheme="http"
    fi
}

function dump() {
    parse_options "$@"
    curl -s -k -G "${scheme}://${host}/db/${database}/series?u=${username}&p=${password}&chunked=true" \
        --data-urlencode "q=select * from /.*/" | jq . -c -M
    exit
}

function restore() {
    parse_options "$@"
    while read -r line; do
        echo >&2 "Writing..."
        curl -X POST -d "[${line}]" "${scheme}://${host}/db/${database}/series?u=${username}&p=${password}"
    done
    exit
}

case "$1" in
    dump) dump "$@" ;;
    restore) restore "$@" ;;
    *) echo >&2 "usage: $0 [dump|restore] ..." ; exit 1 ;;
esac
Python calls Influxdb to implement data additions and deletions
utils/db.py
# -*- coding: utf-8 -*-
from influxdb import InfluxDBClient


def get_db_connection():
    db_conn = InfluxDBClient(host="192.168.x.x", database="pachongdb")
    return db_conn
main.py
#!/home/ansible/.venv/bin/python
# -*- coding: utf-8 -*-
from influxdb.exceptions import InfluxDBClientError, InfluxDBServerError

from utils import db


def insert_success_point_2db():
    db_conn = db.get_db_connection()
    # Write a success record; by convention the success field value is 1
    success_point = [{
        "measurement": "wake",
        "tags": {
            "isp": "mobile",
            "region": "shanghai",
        },
        "fields": {
            "mobile": 159123456xx,  # phone number masked in the original
            "success": 1,
        },
    }]
    try:
        db_conn.write_points(success_point)
    except InfluxDBClientError as e:
        print("influxdb client error: {0}".format(e))
    except InfluxDBServerError as e:
        print("influxdb server error: {0}".format(e))
    except Exception as e:
        print("influxdb error: {0}".format(e))
    finally:
        if db_conn is not None:
            db_conn.close()


def insert_fail_point_2db():
    db_conn = db.get_db_connection()
    # Write a failure record; by convention the fail field value is 0
    fail_point = [{
        "measurement": "wake",
        "tags": {
            "isp": "mobile",
            "region": "shanghai",
        },
        "fields": {
            "mobile": 1591234xxxx,  # phone number masked in the original
            "fail": 0,
        },
    }]
    try:
        db_conn.write_points(fail_point)
    except InfluxDBClientError as e:
        print("influxdb client error: {0}".format(e))
    except InfluxDBServerError as e:
        print("influxdb server error: {0}".format(e))
    except Exception as e:
        print("influxdb error: {0}".format(e))
    finally:
        if db_conn is not None:
            db_conn.close()


def main():
    insert_success_point_2db()
    insert_fail_point_2db()


if __name__ == '__main__':
    main()
requirements.txt
certifi==2017.11.5
influxdb==5.0.0
[Svc]influxdb Best Practice-monitoring comparison