Background
At present, time-series big data is mostly stored and processed in relational databases, but the inherent limitations of relational databases prevent them from storing and querying such data efficiently. Time-series big data solutions use specialized storage methods so that large volumes of data can be stored efficiently and processed quickly, which makes them an important technology for mass data processing. This approach greatly improves the ability to process time-related data: the storage space required is reported to be about half that of a relational database, and query speed is greatly improved. Time-series functions offer better query performance than relational databases, and Informix TimeSeries is well suited to IoT analytics applications.
Definition
A time series database is mainly used to process time-stamped data (data that changes over time, that is, data serialized by time). Time-stamped data is also called time series data.
Latest Time Series Database rankings:
Features & classification:
- Specifically optimized for processing time series data
- This type of data is sorted by time
- Because the data is typically large (so sharding and scaling are important) or logically complex (large aggregations, fetches, drill-downs), relational databases often have difficulty handling it
- Time series data can be divided into categories by two attributes
  - High frequency, low retention period (data acquisition, real-time display)
  - Low frequency, high retention period (data presentation, analysis)
  - Regular interval (data acquisition)
  - Irregular interval (event-driven)
- Several prerequisites for time series data
  - A single data point is not important on its own
  - Data is rarely updated or deleted (deletion happens only when expired data is purged), and new data is always the most recent in time
  - If the same data appears multiple times, it is treated as the same data point
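The prerequisites above can be illustrated with a small in-memory sketch. It is not how InfluxDB or Elasticsearch store data; the class name `TinySeries` and its methods are hypothetical, chosen only to show that a point is identified by its series and timestamp, that a repeated write of the same point is not stored twice, and that deletion happens only when data expires.

```python
from typing import Dict, List, Tuple


class TinySeries:
    """Hypothetical in-memory store illustrating the prerequisites above."""

    def __init__(self, retention_seconds: int) -> None:
        self.retention_seconds = retention_seconds
        # A point is identified by (series name, timestamp), so writing the
        # same point twice leaves a single entry.
        self._points: Dict[Tuple[str, int], float] = {}

    def write(self, series: str, timestamp: int, value: float) -> None:
        self._points[(series, timestamp)] = value

    def expire(self, now: int) -> int:
        """Drop points older than the retention period; return how many."""
        cutoff = now - self.retention_seconds
        stale = [key for key in self._points if key[1] < cutoff]
        for key in stale:
            del self._points[key]
        return len(stale)

    def read(self, series: str) -> List[Tuple[int, float]]:
        """Return the points of one series sorted by time."""
        return sorted(
            (ts, value)
            for (name, ts), value in self._points.items()
            if name == series
        )


store = TinySeries(retention_seconds=60)
store.write("cpu_load_short", 100, 0.64)
store.write("cpu_load_short", 100, 0.64)  # same point again: still one entry
store.write("cpu_load_short", 130, 0.70)
print(store.read("cpu_load_short"))
removed = store.expire(now=170)  # cutoff is 110, so the point at 100 expires
print(removed, store.read("cpu_load_short"))
```

There is no update or single-point delete method on purpose: in this model, writes only add points and removal happens in bulk by age.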
Key comparison of time series databases
InfluxDB | Elasticsearch
Popular (ranked first among time series databases) | Popular (ranked first among search engines)
High availability requires a paid license | Clustered high availability is easy to set up and free
High single-node write performance | Low single-node write performance
Simple query syntax, powerful functions | Simple query syntax, powerful functions (weaker than InfluxDB)
Back end designed as a time series database, fast writes | Not designed as a time series database; back end stores documents, slow writes
In short: use InfluxDB for high-frequency data with a low retention period, and Elasticsearch for low-frequency data with a high retention period.
Introduction to other time series databases:
How to use
Querying and writing data:
- Both InfluxDB and Elasticsearch expose REST-style APIs
- Data is written with HTTP POST and read with HTTP GET; Elasticsearch also uses HTTP PUT, DELETE, and so on
- Data can be written in JSON format; InfluxDB also supports its line protocol
- JSON adds parsing cost, so the input format should be kept as simple as possible
- Elasticsearch is usually paired with Logstash, and InfluxDB with Telegraf
Taking InfluxDB as an example, here is how to insert and query data:
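To make the point about parsing cost and simple input formats concrete, here is a minimal sketch that renders one point in InfluxDB line protocol and as JSON, then compares their sizes. The helper `to_line_protocol` is hypothetical and handles only float fields and basic escaping; the JSON layout is an illustrative assumption, not a payload format required by either database.

```python
import json
from typing import Dict


def _escape_key(text: str) -> str:
    # Line protocol requires commas, equals signs and spaces in tag keys,
    # tag values and field keys to be backslash-escaped.
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(measurement: str, tags: Dict[str, str],
                     fields: Dict[str, float], timestamp_ns: int) -> str:
    """Render one point as: measurement,tag=v field=v timestamp."""
    # Measurement names need commas and spaces escaped.
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    tag_part = "".join(
        f",{_escape_key(k)}={_escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape_key(k)}={float(v)}" for k, v in fields.items()
    )
    return f"{name}{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "cpu_load_short", {"region": "us-west"}, {"value": 0.64}, 1434055562000000000
)
# Assumed JSON shape, used here only for the size comparison.
as_json = json.dumps({
    "measurement": "cpu_load_short",
    "tags": {"region": "us-west"},
    "fields": {"value": 0.64},
    "time": 1434055562000000000,
})
print(line)
print(len(line), len(as_json))
```

The line protocol form carries the same information without quoting or nesting, which is why it is cheaper to produce and parse for high-frequency writes.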
HTTP API for InfluxDB
Create a database
$ curl -i -XPOST http://192.168.32.31:8086/query --data-urlencode "q=CREATE DATABASE mydb"
HTTP/1.1 200 OK
Connection: close
Content-Type: application/json
Request-Id: 42a1f30c-5900-11e6-8003-000000000000
X-Influxdb-Version: 0.13.0
Date: Tue, ...
Content-Length: ...
{"results":[{}]}
Write data
$ curl -i -XPOST 'http://192.168.32.31:8086/write?db=mydb' --data-binary 'cpu_load_short,region=us-west value=0.64 1434055562000000000'
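For comparison with the query calls, a write in InfluxDB 0.x goes to the `/write` endpoint, with the target database in the query string and line protocol in the request body. The sketch below only builds the request with Python's standard library and does not send it. The point written (`cpu_load_short`, `region=us-west`, `0.64`) is inferred from the query result in the next step, and the function name `build_write_request` is hypothetical.

```python
import urllib.parse
import urllib.request


def build_write_request(base_url: str, db: str, line: str) -> urllib.request.Request:
    """Build (but do not send) a POST to InfluxDB's /write endpoint."""
    url = f"{base_url}/write?{urllib.parse.urlencode({'db': db})}"
    return urllib.request.Request(url, data=line.encode("utf-8"), method="POST")


request = build_write_request(
    "http://192.168.32.31:8086",
    "mydb",
    # Timestamp is in nanoseconds and corresponds to 2015-06-11T20:46:02Z.
    "cpu_load_short,region=us-west value=0.64 1434055562000000000",
)
print(request.get_method(), request.full_url)
# To actually send it against a running server:
# urllib.request.urlopen(request)
```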
Query the data that was written
$ curl -G 'http://192.168.32.31:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
{
    "results": [
        {
            "series": [
                {
                    "name": "cpu_load_short",
                    "columns": ["time", "value"],
                    "values": [["2015-06-11T20:46:02Z", 0.64]]
                }
            ]
        }
    ]
}
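The same query can be prepared and its response consumed from a program. The following is a minimal sketch using only Python's standard library; the function names `build_query_url` and `rows_from_response` are hypothetical, and the sample payload is the response body from the curl call above rather than a live result.

```python
import json
import urllib.parse
from typing import Any, Dict, List


def build_query_url(base_url: str, db: str, query: str) -> str:
    """Build the GET URL for InfluxDB's /query endpoint."""
    params = urllib.parse.urlencode({"db": db, "q": query})
    return f"{base_url}/query?{params}"


def rows_from_response(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a /query response into one dict per returned row."""
    rows = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            for values in series["values"]:
                # Pair each column name with the value in the same position.
                row = dict(zip(series["columns"], values))
                row["measurement"] = series["name"]
                rows.append(row)
    return rows


query = "SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
url = build_query_url("http://192.168.32.31:8086", "mydb", query)
# Response body copied from the curl output above.
sample = json.loads(
    '{"results": [{"series": [{"name": "cpu_load_short",'
    ' "columns": ["time", "value"],'
    ' "values": [["2015-06-11T20:46:02Z", 0.64]]}]}]}'
)
rows = rows_from_response(sample)
print(url)
print(rows)
```

Because the response separates `columns` from `values`, zipping them per row is usually the first step before handing the data to analysis or plotting code.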
Introduction to Telegraf and Logstash:
Two architectures commonly used with these time series databases are described below:
1. Log collection: logs are captured and stored in InfluxDB, then queried visually in Grafana.
2. Database monitoring: performance metrics are collected from relational databases and analyzed to show the databases' operational status, which simplifies monitoring and management.
Visualization of data
There are many options for visualizing the data. Kibana, the recommended choice in the ELK stack, is the more convenient option with Elasticsearch, while Grafana can be used with InfluxDB.
Grafana currently supports the following data sources:
- Elasticsearch
- InfluxDB
- Prometheus
- Graphite
- OpenTSDB
- CloudWatch
Installing Grafana
Installing Grafana is simple. Taking Debian as an example:
Execute the commands:
$ wget https://grafanarel.s3.amazonaws.com/builds/grafana_2.6.0_amd64.deb
$ sudo dpkg -i grafana_2.6.0_amd64.deb
Start the server:
$ sudo service grafana-server start
Once configured, Grafana can be used to visualize the data. The details are not covered here; Grafana and Kibana will be discussed in separate articles.
Summary
This article gives a brief overview of time series databases: it introduces their characteristics, compares them with traditional databases using InfluxDB as the example, and shows how to use InfluxDB. Finally, it describes architectures that use time series databases for logging and monitoring, with Grafana for visual querying, analysis, and monitoring. Article address: https://www.cnblogs.com/wenBlog/p/8297100.html
Time Series Database Overview