Summary of basic concepts of 180726-INFLUXDB

Last Update:2018-07-26 Source: Internet

Author: User

Tags influxdb

As a time series database, InfluxDB differs from a traditional relational database in several ways. This article introduces the related terms and concepts as simply and concisely as possible.

I. Basic Concepts

MySQL	InfluxDB	Description
Database	Database	A database
Table	Measurement	Similar to the concept of a table in MySQL
Record	Tag + field + timestamp	A row of data in a traditional table; in InfluxDB it is split into three parts

1. Database

A database here means the same thing as a database in MySQL, so there is little room for confusion.

2. Measurement

A measurement corresponds to a table in MySQL. In practice, the most obvious difference is that there is no separate statement for creating a measurement: you simply insert data, and if the measurement does not exist it is created along with the first point.

3. Point

A point corresponds to a record in MySQL. In InfluxDB it represents the field data of one measurement at one moment for one set of tag values (in short, timestamp + tags + fields).

Timestamp: the time of the point, in nanoseconds. Every point must have one; if it is not given explicitly, a default is assigned (the current time).
Tag: a key-value pair; tags are indexed together with the measurement
- Tags take part in index creation, so they are well suited as query filters
- Tags should not have too many distinct values; values that discriminate clearly between groups work best (similar to MySQL's indexing principles)
- Tag values are always strings
- Tags are optional; a measurement without tags is fine
Field: a key-value pair that stores the actual data
- Data types: integer (long), string, boolean, float
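
The three parts of a point can be pictured with a small data structure. The following is only an illustrative sketch in Python: the `Point` class and its attribute names are hypothetical and do not come from any InfluxDB client library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Union

FieldValue = Union[float, int, str, bool]


@dataclass
class Point:
    """Hypothetical model of one point; not a class from an InfluxDB library."""
    measurement: str
    timestamp_ns: int                                            # nanoseconds since the Unix epoch
    tags: Dict[str, str] = field(default_factory=dict)           # optional, values are strings
    fields: Dict[str, FieldValue] = field(default_factory=dict)  # the stored data

    def time_utc(self) -> datetime:
        # Whole seconds only; the sub-second part is dropped to keep the sketch simple.
        return datetime.fromtimestamp(self.timestamp_ns // 1_000_000_000, tz=timezone.utc)


p = Point(
    measurement="myapp",
    timestamp_ns=1532597158613778583,
    tags={"host": "127.0.0.1"},
    fields={"qps": 1340},
)
print(p.time_utc().isoformat())
```

Converting the nanosecond timestamp to a date is a quick way to check that a value is really in nanoseconds and not in seconds or milliseconds.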

4. Series

Series: a unique combination of tag keys and tag values

II. Example Analysis

The concepts above are hard to remember in isolation, so the following example illustrates them:

Create a measurement that stores the performance status of an application. It contains the following metrics, and data is written to InfluxDB every second:

Service machine: host=127.0.0.1
Service interface: service=app.service.index
qps: qps=1340
rt: rt=1313
cpu: cpu=45.23
mem: mem=4145m
load: load=1.21

1. Creating the measurement

There are 7 parameters. The first step is to decide which are tags and which are fields. As noted earlier, tags are indexed, so they are recommended for values that classify the data and come from a predictable, limited set; everything else becomes a field. That gives the following split:

Tag

host
service

Field

qps
rt
cpu
mem
load

An actual insert looks like this:

> insert myapp,host=127.0.0.1,service=app.service.index qps=1340,rt=1313,cpu=45.23,mem="4145m",load=1.21
> select * from myapp
name: myapp
time                cpu   host      load mem   qps  rt   service
----                ---   ----      ---- ---   ---  --   -------
1532597158613778583 45.23 127.0.0.1 1.21 4145m 1340 1313 app.service.index

A. Summary notes

In the insert statement, tags are separated from each other by commas, fields are separated from each other by commas, and the tag set is separated from the field set by a space.
Tag values are strings and do not need double quotes.
String field values must be enclosed in double quotes, otherwise an error is raised.
To specify the timestamp explicitly, add a space after the fields and then the timestamp.
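
The formatting rules in these notes can be sketched as a small Python helper that assembles the text that follows the `insert` keyword. This is a minimal sketch, not a replacement for a real client library: the function names are made up for illustration, and the escaping covers only the common separator characters.

```python
from typing import Dict, Optional, Union

FieldValue = Union[float, int, str, bool]


def _escape_tag(text: str) -> str:
    # Commas, equals signs and spaces act as separators, so they are
    # backslash-escaped inside tag keys, tag values and field keys.
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _escape_measurement(text: str) -> str:
    return text.replace(",", "\\,").replace(" ", "\\ ")


def _format_field(value: FieldValue) -> str:
    if isinstance(value, bool):      # checked first because bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, str):       # string fields must be double-quoted
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return repr(value)               # bare number; an integer would need an "i" suffix


def to_line(measurement: str, tags: Dict[str, str],
            fields: Dict[str, FieldValue],
            timestamp_ns: Optional[int] = None) -> str:
    if not fields:
        raise ValueError("a point needs at least one field")
    head = ",".join([_escape_measurement(measurement)] +
                    [f"{_escape_tag(k)}={_escape_tag(v)}" for k, v in tags.items()])
    body = ",".join(f"{_escape_tag(k)}={_format_field(v)}" for k, v in fields.items())
    line = f"{head} {body}"
    if timestamp_ns is not None:     # the optional timestamp follows a second space
        line += f" {timestamp_ns}"
    return line


line = to_line(
    "myapp",
    {"host": "127.0.0.1", "service": "app.service.index"},
    {"qps": 1340, "rt": 1313, "cpu": 45.23, "mem": "4145m", "load": 1.21},
)
print(line)
```

The printed line matches the insert statement used in this article, without the leading `insert` keyword.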

B. Is it possible to have no field?

Tested: no. The output is as follows:

> insert myabb,host=123,service=index
ERR: {"error":"unable to parse 'myabb,host=123,service=index ': invalid field format"}

C. Is it possible to have no tag?

As stated earlier, and as testing confirms, yes:

> insert myabb qps=123,rt=1231
> select * from myabb
name: myabb
time                qps rt
----                --- --
1532597385053030634 123 1231

2. Data analysis

After inserting a few more points, the current data is:

> select * from myapp
name: myapp
time                cpu   host      load mem   qps  rt   service
----                ---   ----      ---- ---   ---  --   -------
1532597158613778583 45.23 127.0.0.1 1.21 4145m 1340 1313 app.service.index
1532597501578551929 45.23 127.0.0.1 1.21 4145m 1341 1312 app.service.index
1532597510225918132 45.23 127.0.0.1 1.21 4145m 1341 1312 app.service.about
1532597552421996033 45.23 127.0.0.2 1.21 4145m 1341 1312 app.service.about

A. Series

How many series do the four points above correspond to?

As stated earlier, tag keys plus tag values determine a series (strictly speaking, measurement + retention policy + tags), so the table above contains three series in total:

127.0.0.1 | app.service.index
127.0.0.1 | app.service.about
127.0.0.2 | app.service.about
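
The count can be checked with a few lines of Python. This is a sketch only: the `series_key` helper and its key format are illustrative, and the retention policy is left out because all four points share the same one.

```python
def series_key(measurement, tags):
    """Build an illustrative key from the measurement and the sorted tag pairs."""
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    return f"{measurement},{tag_part}" if tag_part else measurement


# Tag sets of the four points in the table above.
tag_sets = [
    {"host": "127.0.0.1", "service": "app.service.index"},
    {"host": "127.0.0.1", "service": "app.service.index"},
    {"host": "127.0.0.1", "service": "app.service.about"},
    {"host": "127.0.0.2", "service": "app.service.about"},
]

distinct = {series_key("myapp", tags) for tags in tag_sets}
print(len(distinct))
```

The first two points share the same host and service, so they fall into one series and the set ends up with three keys.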

So what exactly is a series?

Suppose we want to display the data above as a graph. How would we do it?

First we pick the machine and the service name, and then look at how that service performs on that machine along a timeline.
In other words, host/service is the search condition, time is the horizontal axis, and each value (cpu, load, mem, qps, rt) is plotted as a point in two-dimensional coordinates; connecting all the points gives a continuous line.

So a series is the search condition above, and the concept of a point follows naturally.
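
To make the graph idea concrete, the sketch below groups the four rows by the (host, service) search condition and orders each group by time, which is the data one line of the chart would be drawn from. It uses only the qps field, and the function name is an assumption made for this example.

```python
from collections import defaultdict

# The four rows of myapp, reduced to time, the two tags, and one field (qps).
rows = [
    (1532597158613778583, "127.0.0.1", "app.service.index", 1340),
    (1532597501578551929, "127.0.0.1", "app.service.index", 1341),
    (1532597510225918132, "127.0.0.1", "app.service.about", 1341),
    (1532597552421996033, "127.0.0.2", "app.service.about", 1341),
]


def lines_by_series(rows):
    """Group (time, value) pairs by the (host, service) search condition."""
    lines = defaultdict(list)
    for time_ns, host, service, qps in rows:
        lines[(host, service)].append((time_ns, qps))
    for series_points in lines.values():
        series_points.sort()         # order each line along the time axis
    return dict(lines)


for series, series_points in lines_by_series(rows).items():
    print(series, series_points)
```

Each key of the result is one series, and each list is the sequence of points that would be joined into a line.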

III. Retention Policies

The sections above cover the basic concepts of table data. This section covers the retention policy, which determines how long data is stored (meaning data can be deleted), how many copies are kept, how the cluster handles it, and so on.

1. Basic instructions

InfluxDB is a time series database aimed at large data volumes. Storing everything forever would make disk costs considerable, and some data does not need to be kept permanently, which is why retention policies exist.

InfluxDB is not designed around deleting individual records (InfluxQL does offer DELETE and DROP SERIES, but they are meant for occasional cleanup), so the usual way to control the amount of data is to define a retention policy.

The purpose of defining a retention policy is therefore to tell InfluxDB which data can be discarded, so that it can process data more efficiently.

2. Basic operations

A. Querying policies

> show retention policies on hh_test
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        true

name: the policy name
duration: the retention time; 0 means data is kept permanently
shardGroupDuration: the time span covered by one shard group. The shard group is a basic storage structure in InfluxDB; queries over data spanning more than this duration are likely to be less efficient.
replicaN: short for replication, the number of copies
default: whether this is the default policy

B. Creating a policy

> create retention policy "2_hour" on hh_test duration 2h replication 1 default
> show retention policies on hh_test
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        false
2_hour  2h0m0s   1h0m0s             1        true
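
The create statement does not set a shard group duration, yet the output shows 1h0m0s for the new policy and 168h0m0s for autogen. InfluxDB derives the value from the retention duration. The sketch below follows the defaults described in the InfluxDB 1.x documentation as far as I understand them; the function name is hypothetical, "6 months" is approximated as 180 days, and the behavior exactly at that boundary is an assumption.

```python
from datetime import timedelta


def default_shard_group_duration(rp_duration: timedelta) -> timedelta:
    """Approximate the default shard group duration for a retention duration."""
    if rp_duration == timedelta(0):          # 0 means data is kept forever
        return timedelta(days=7)
    if rp_duration < timedelta(days=2):      # short retention: hourly groups
        return timedelta(hours=1)
    if rp_duration <= timedelta(days=180):   # up to roughly 6 months: daily groups
        return timedelta(days=1)
    return timedelta(days=7)                 # longer retention: weekly groups


print(default_shard_group_duration(timedelta(hours=2)))
print(default_shard_group_duration(timedelta(0)))
```

With these rules a 2h policy gets 1-hour shard groups and the permanent autogen policy gets 7-day (168h) shard groups, which agrees with the output above.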

C. Modifying policies

> alter retention policy "2_hour" on hh_test duration 4h default
> show retention policies on hh_test
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        false
2_hour  4h0m0s   1h0m0s             1        true

D. Deleting a policy

> drop retention policy "2_hour" on hh_test
> show retention policies on hh_test
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        false

After the default policy is deleted, no policy is marked as default. Does that cause a problem?

3. Understanding RP

Once a policy is set, expired data is deleted automatically. So how is the data actually stored?

Take the default policy, which keeps data permanently: its shardGroupDuration is 7 days, which means 7 days of data go into one shard group, and after that a new one is added.

A shard contains the actual encoded and compressed data and is represented by a TSM file on disk. Each shard belongs to one and only one shard group. Multiple shards may exist in a single shard group. Each shard contains a specific set of series. All points of a given series in a given shard group are stored in the same shard (TSM file) on disk.
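
The relationship between timestamps, shard groups and expiry can be sketched as follows. This is a simplified model under stated assumptions, not InfluxDB's actual implementation: windows are assumed to be aligned to multiples of the shard group duration counted from the Unix epoch, and a shard group is treated as expired once its end time is older than the current time minus the retention duration. Both function names are hypothetical.

```python
NS_PER_HOUR = 3600 * 10**9


def shard_group_window(ts_ns: int, shard_group_duration_ns: int):
    """Return the (start, end) of the window a timestamp falls into."""
    start = ts_ns - ts_ns % shard_group_duration_ns
    return start, start + shard_group_duration_ns


def is_expired(group_end_ns: int, now_ns: int, rp_duration_ns: int) -> bool:
    """A whole shard group is dropped at once, not individual points."""
    if rp_duration_ns == 0:          # duration 0 keeps data forever
        return False
    return group_end_ns < now_ns - rp_duration_ns


# Two points from the myapp example, with 1-hour shard groups as in "2_hour".
first = shard_group_window(1532597158613778583, NS_PER_HOUR)
second = shard_group_window(1532597501578551929, NS_PER_HOUR)
print(first == second)

# With a 2h retention, that group is expired three hours after it ended.
now_ns = first[1] + 3 * NS_PER_HOUR
print(is_expired(first[1], now_ns, 2 * NS_PER_HOUR))
```

Because data is removed one shard group at a time, a point can stay on disk somewhat longer than the retention duration, until the whole group it belongs to has aged out.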

IV. Other

1. A grey and grey blog: https://liuyueyi.github.io/hexblog

The author's personal blog, recording everything from study and work. Everyone is welcome to visit.

2. Disclaimer

As the saying goes, trusting a book completely is worse than having no book. The content above is purely my own opinion; given my limited ability, omissions and errors are inevitable. If you find bugs or have better suggestions, criticism and corrections are welcome and much appreciated.

Weibo address: small Gray Gray Blog
QQ: A grey/grey/3302797840

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
