Time Series Database Overview

Background

At present, time-series big data is mostly stored and processed in relational databases, but the inherent limitations of relational databases make it hard to store and query this data efficiently. Time-series big data solutions use specialized storage methods so that large volumes of data can be stored efficiently and processed quickly, which makes them an important technology for mass data processing. This approach greatly improves the handling of time-related data: storage space is about half that of a relational database, and queries are much faster. Time-series functionality offers better query performance than a relational database; Informix TimeSeries, for example, is well suited to IoT analytics applications.

Definition

A time series database is mainly used to process time-stamped data (data that changes over time, that is, data serialized by time). Time-stamped data is also called time series data.

Latest Time Series Database rankings:

Features & classification:

Specifically optimized for processing time series data

This type of data is sorted by time
Because the data volume is typically large (so sharding and scalability matter) or the logic is complex (large aggregations, fetches, drill-downs), relational databases often struggle to handle it

Time series data can be classified in two ways.

By attribute:

High frequency, short retention period (data acquisition, real-time display)
Low frequency, long retention period (data presentation, analysis)

By frequency:

Regular interval (data acquisition)
Irregular interval (event-driven)
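The regular versus irregular distinction can be checked from the timestamps alone. The following is a minimal sketch, not taken from any particular database: the function name `classify_interval` and the `tolerance` threshold are assumptions made for illustration.

```python
from statistics import median


def classify_interval(timestamps, tolerance=0.05):
    """Label a series 'regular' or 'irregular' from its timestamps (seconds).

    `tolerance` is a hypothetical threshold: the largest allowed relative
    deviation of any gap from the median gap.
    """
    if len(timestamps) < 3:
        raise ValueError("need at least three timestamps")
    ordered = sorted(timestamps)
    # Gaps between consecutive points, in seconds.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    typical = median(gaps)
    if typical <= 0:
        return "irregular"
    worst = max(abs(gap - typical) for gap in gaps) / typical
    return "regular" if worst <= tolerance else "irregular"


sampled = [0, 10, 20, 30, 40]   # e.g. a sensor polled every 10 seconds
events = [0, 3, 47, 48, 120]    # e.g. event-driven records

print(classify_interval(sampled))
print(classify_interval(events))
```

Polled data such as `sampled` has constant gaps, while event-driven data such as `events` does not, which is why the two kinds are often stored and queried differently.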

Several premises of time series data

A single data point is not important on its own
Data is rarely updated or deleted (deletion happens only when data expires), and new data is always the most recent data by time
The same data point appearing multiple times is treated as a single data point
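These premises can be illustrated with a small in-memory store. This is only a sketch under assumptions: the class `SeriesStore`, its method names, and the use of seconds for timestamps are hypothetical and do not reflect how any real time series database is implemented.

```python
class SeriesStore:
    """In-memory store for one series, keyed by timestamp (seconds)."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._points = {}  # timestamp -> value

    def write(self, timestamp, value):
        # Writing the same timestamp again overwrites the earlier point,
        # so a repeated point counts as one piece of data.
        self._points[timestamp] = value

    def expire(self, now):
        # Deletion only happens here, when points age out of retention.
        cutoff = now - self.retention_seconds
        expired = [t for t in self._points if t < cutoff]
        for t in expired:
            del self._points[t]
        return len(expired)

    def points(self):
        # Points are always returned in time order.
        return sorted(self._points.items())


store = SeriesStore(retention_seconds=60)
store.write(100, 0.64)
store.write(100, 0.64)   # duplicate of the previous point
store.write(130, 0.70)
print(store.points())
print(store.expire(now=170))
```

There is no update or delete method on purpose: the only way data leaves the store is through `expire`.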

Key comparison of time series databases

InfluxDB	Elasticsearch
Popular (ranked first among TSDBs)	Popular (ranked first among search engines)
High availability is a paid feature	Cluster high availability is easy to set up and free
High single-node write performance	Low single-node write performance
Simple query syntax, powerful functions	Simple query syntax, powerful functions (weaker than InfluxDB)
Back end designed as a time series database; fast writes	Not designed as a time series database; back-end storage is document-structured; slow writes

This suggests using InfluxDB for high-frequency, short-retention data and Elasticsearch (ES) for low-frequency, long-retention data.

Introduction to other time series databases:

How to use

Querying and writing data:

Both InfluxDB and ES expose REST-style APIs
Data is written via HTTP POST and read via HTTP GET; ES also uses HTTP PUT, DELETE, and so on
Written data can be in JSON format; InfluxDB also supports the line protocol
The JSON format adds parsing cost, so keep the input data format as simple as possible
ES is usually paired with Logstash, and InfluxDB with Telegraf
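To make the line protocol concrete, here is a minimal sketch of turning one point into a line of the form `measurement,tag=value field=value timestamp`. The helper names (`to_line`, `_escape`, `_format_field`) are hypothetical, and the escaping covers only the common cases (commas, spaces, equals signs, and quotes in string fields).

```python
def _escape(text, chars):
    # Backslash-escape each special character in `text`.
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    if isinstance(value, bool):   # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"        # integers carry a trailing "i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'         # strings are double-quoted


def to_line(measurement, fields, tags=None, timestamp_ns=None):
    """Format one point as a single line of InfluxDB line protocol."""
    if not fields:
        raise ValueError("at least one field is required")
    head = _escape(measurement, ", ")
    for key, val in sorted((tags or {}).items()):
        head += "," + _escape(key, ",= ") + "=" + _escape(str(val), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(val)
        for key, val in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


line = to_line(
    "cpu_load_short",
    fields={"value": 0.64},
    tags={"region": "us-west"},
    timestamp_ns=1434055562000000000,
)
print(line)
```

One point is one short line of text, which is much less to parse than an equivalent JSON document.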

Taking InfluxDB as an example, here is how to insert and query data:

HTTP API for InfluxDB

Create a database

$ curl -i -XPOST http://192.168.32.31:8086/query --data-urlencode "q=CREATE DATABASE mydb"
HTTP/1.1 200 OK
Connection: close
Content-Type: application/json
Request-Id: 42a1f30c-5900-11e6-8003-000000000000
X-Influxdb-Version: 0.13.0
Date: ...
Content-Length: ...

{"results":[{}]}

Write Data

Points are written by POSTing line protocol to the /write endpoint. The point here matches the one returned by the query in the next step:

$ curl -i -XPOST 'http://192.168.32.31:8086/write?db=mydb' --data-binary 'cpu_load_short,region=us-west value=0.64 1434055562000000000'

Querying the data that is written

$ curl -G 'http://192.168.32.31:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
{
    "results": [
        {
            "series": [
                {
                    "name": "cpu_load_short",
                    "columns": [
                        "time",
                        "value"
                    ],
                    "values": [
                        [
                            "2015-06-11T20:46:02Z",
                            0.64
                        ]
                    ]
                }
            ]
        }
    ]
}
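The same query can be prepared and its response unpacked from a program. The sketch below uses only the Python standard library; the function names `build_query_url` and `rows_from_response` are hypothetical, and no request is actually sent (passing the URL to `urllib.request.urlopen` would be one way to do that).

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, db, query, pretty=False):
    """Build the GET URL for the /query endpoint."""
    params = {"db": db, "q": query}
    if pretty:
        params["pretty"] = "true"
    return f"{base_url}/query?{urlencode(params)}"


def rows_from_response(body):
    """Flatten a /query JSON body into dicts, one per returned row."""
    rows = []
    for result in json.loads(body).get("results", []):
        for series in result.get("series", []):
            for values in series["values"]:
                # Pair each column name with the value in the same position.
                row = dict(zip(series["columns"], values))
                row["measurement"] = series["name"]
                rows.append(row)
    return rows


url = build_query_url(
    "http://192.168.32.31:8086",
    "mydb",
    "SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'",
)

# Response body with the same shape as the curl output above.
sample = (
    '{"results":[{"series":[{"name":"cpu_load_short",'
    '"columns":["time","value"],'
    '"values":[["2015-06-11T20:46:02Z",0.64]]}]}]}'
)
for row in rows_from_response(sample):
    print(row["time"], row["value"])
```

Each series in the response carries its column names once and its rows as plain arrays, so the columns have to be zipped back onto the values before use.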

Introduction to Telegraf and Logstash:

Both are data collection and forwarding tools with a plug-in-based, configurable architecture
Telegraf is lighter weight than Logstash
Both support a large number of sources, including relational databases, NoSQL stores, direct collection of operating system information (Linux, Windows), applications, and services (Docker)
There are two execution modes
Active: read the data once according to the configuration, then exit the process when collection is complete
Passive: stay resident in memory as a process, listen on a specific port, and wait for messages to be sent

Two architectures that use a time series database:

1. Log collection: logs are captured and stored in InfluxDB, then queried visually in Grafana.

2. Database monitoring: performance metrics are collected from relational databases and analyzed to show each database's operational status, which makes monitoring and management easier.

Visualization of data

There are many options for data visualization. Kibana from the ELK stack is the recommended choice and works conveniently with ES; with InfluxDB you can use Grafana.

Grafana currently supports the following data sources:

– Elasticsearch
– InfluxDB
– Prometheus
– Graphite
– OpenTSDB
– CloudWatch

Installing Grafana

Installing Grafana is simple. Taking Debian as an example:

Execute the commands:

$ wget https://grafanarel.s3.amazonaws.com/builds/grafana_2.6.0_amd64.deb
$ sudo dpkg -i grafana_2.6.0_amd64.deb

Start the server:

$ sudo service grafana-server start

After that, Grafana can be configured to visualize the data. The details are not covered here; separate articles will cover Grafana and Kibana.

Summary

This article briefly introduced time series databases: their characteristics, how they differ from traditional databases (using InfluxDB as the example), and how to use InfluxDB. It then described architectures that use a time series database for logging and monitoring, with Grafana for visual querying, analysis, and monitoring. Article address: https://www.cnblogs.com/wenBlog/p/8297100.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
