180726 - Summary of basic concepts of InfluxDB

Source: Internet
Author: User
Tags influxdb

As a time series database, InfluxDB differs from a traditional relational database in several ways. The following introduces the relevant terms and concepts as simply and concisely as possible.

I. Basic concepts

MySQL     InfluxDB                 Description
-----     --------                 -----------
database  database                 A database
table     measurement              Similar to the concept of a table in MySQL
record    tag + field + timestamp  A row of data in a traditional table; in InfluxDB it can be divided into three parts

1. Database

The database concept is the same as in MySQL, so there is little room for confusion.

2. Measurement

A measurement corresponds to a table in MySQL. In practice, the most obvious difference is that there is no separate statement for creating a measurement: you simply insert data, and if the measurement does not exist, it is created and the data is inserted.
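The create-on-first-write behaviour can be pictured with a toy in-memory model. This is only an illustrative sketch: the `database` dict and the `write` helper are hypothetical names and have nothing to do with how InfluxDB stores data internally.

```python
from collections import defaultdict

# Toy stand-in for a database: measurement name -> list of rows.
# Illustration only, not InfluxDB's storage model.
database = defaultdict(list)


def write(measurement, row):
    """Append a row; an unknown measurement is created on its first write."""
    database[measurement].append(row)


# No "create measurement" step is needed before writing.
write("myapp", {"qps": 1340})
write("myapp", {"qps": 1341})

print(sorted(database))         # names of the measurements that now exist
print(len(database["myapp"]))   # number of rows written to myapp
```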

3. Point

A point corresponds to a record (row) in MySQL. In InfluxDB it represents one data point in a measurement: the field data at a given moment for a given set of tags (in short, timestamp + tags + fields).

    • Timestamp: in nanoseconds; every record must have one. If it is not given explicitly, it defaults to the current time
    • Tag: key-value structure; in the database, the tags and the measurement together form the index
      • Tags take part in index creation, so they are well suited as filter conditions for queries
      • Tags should not have too many distinct values; values that discriminate well are best (similar to MySQL's indexing principles)
      • Values are of type string
      • Tags are optional; a measurement without any tags is fine
    • Field: stores the data, key-value structure
      • Data types: integer, string, boolean, float
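The three parts of a point can be sketched as a small data structure. This is a minimal illustration, assuming a hypothetical `Point` class with made-up measurement and key names; it is not the type used by any InfluxDB client library.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Union

# Field values may be integer, float, string or boolean; tag values are strings.
FieldValue = Union[int, float, str, bool]


@dataclass
class Point:
    measurement: str
    fields: Dict[str, FieldValue]                       # the stored data
    tags: Dict[str, str] = field(default_factory=dict)  # optional, indexed
    # Nanosecond timestamp; falls back to the current time when omitted.
    timestamp_ns: int = field(default_factory=time.time_ns)


# Hypothetical example values.
p = Point("cpu_usage", {"value": 45.23}, tags={"host": "127.0.0.1"})
q = Point("cpu_usage", {"value": 46.0})  # no tags, timestamp filled in

print(p.tags, q.tags)
```

Tags default to an empty dict because they are optional, while `fields` has no default because a point without any field carries no data.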
4. Series

Series: a unique combination of tag keys and tag values

II. Example Analysis

The concepts above are abstract on their own and may not leave a strong impression, so the following example illustrates them.

Create a measurement that stores the performance status of an application. It includes the following metrics, and data is written to InfluxDB every second:

    • Service machine: host=127.0.0.1
    • Service interface: service=app.service.index
    • qps: qps=1340
    • rt: rt=1313
    • cpu: cpu=45.23
    • mem: mem=4145m
    • load: load=1.21
1. Creating the measurement

There are 7 metrics, and the first step is to decide which are tags and which are fields. As noted earlier, tags are indexed, so they are recommended for values that identify and categorize the data, while measured values should be fields. This gives the following split:

Tag

    • host
    • service

Field

    • qps
    • rt
    • cpu
    • mem
    • load

An actual insert looks like this:

> insert myapp,host=127.0.0.1,service=app.service.index qps=1340,rt=1313,cpu=45.23,mem="4145m",load=1.21
> select * from myapp
name: myapp
time                cpu   host      load mem   qps  rt   service
----                ---   ----      ---- ---   ---  --   -------
1532597158613778583 45.23 127.0.0.1 1.21 4145m 1340 1313 app.service.index
A. Summary notes
    • In an insert statement, tags are separated from each other by commas, as are fields; the tag set and the field set are separated by a single space
    • Tag values are strings and do not need double quotes
    • String field values must be enclosed in double quotes, otherwise an error is reported
    • To set the timestamp explicitly, add a space after the fields and then the timestamp
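The formatting rules above can be put together in a small helper that builds the text passed to insert. This is a sketch under stated assumptions: `to_line` is a hypothetical function, it does no escaping of spaces, commas or quotes, and it writes numbers exactly as the article's insert does (as far as I know, InfluxDB 1.x treats such unsuffixed numbers as floats).

```python
def to_line(measurement, tags, fields, timestamp_ns=None):
    """Build 'measurement,tag=v,... field=v,... [timestamp]' (no escaping)."""

    def fmt(value):
        # String field values need double quotes; other values are written as-is.
        if isinstance(value, str):
            return '"' + value + '"'
        return str(value)

    # Measurement and tags are joined by commas; tag values are never quoted.
    head = ",".join([measurement] + [f"{k}={v}" for k, v in tags.items()])
    # Fields are joined by commas and separated from the tags by one space.
    body = ",".join(f"{k}={fmt(v)}" for k, v in fields.items())
    line = f"{head} {body}"
    # An explicit timestamp goes after one more space.
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


line = to_line(
    "myapp",
    {"host": "127.0.0.1", "service": "app.service.index"},
    {"qps": 1340, "rt": 1313, "cpu": 45.23, "mem": "4145m", "load": 1.21},
)
print(line)
```

With the values from the example, the printed line matches the text after `insert` in the statement shown earlier.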
B. Can a point have no field?

Testing shows that it cannot; the output is as follows

> insert myabb,host=123,service=index
ERR: {"error":"unable to parse 'myabb,host=123,service=index ': invalid field format"}

C. Can a point have no tag?

As stated earlier, and as confirmed by testing, it can

> insert myabb qps=123,rt=1231
> select * from myabb
name: myabb
time                qps rt
----                --- --
1532597385053030634 123 1231
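The two results can be mirrored by a deliberately simplified splitter. This is only a sketch: `split_line` is a hypothetical helper, it ignores escaping and quoted strings containing spaces, it leaves all values as strings, and its error message merely imitates the one shown above.

```python
def split_line(line):
    """Split an insert line into (measurement, tags, fields).

    Simplified: no escaping, no spaces inside quoted strings.
    """
    parts = line.strip().split(" ")
    # The second space-separated part must be the field set.
    if len(parts) < 2 or "=" not in parts[1]:
        raise ValueError("invalid field format: at least one field is required")
    # The first part is the measurement, optionally followed by tags.
    measurement, *tag_pairs = parts[0].split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = dict(pair.split("=", 1) for pair in parts[1].split(","))
    return measurement, tags, fields


# Tags are optional, so this is accepted with an empty tag dict.
print(split_line("myabb qps=123,rt=1231"))

# A line with tags but no fields is rejected.
try:
    split_line("myabb,host=123,service=index")
except ValueError as err:
    print("rejected:", err)
```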
2. Data analysis

After inserting a few more rows, the current data is

> select * from myapp
name: myapp
time                cpu   host      load mem   qps  rt   service
----                ---   ----      ---- ---   ---  --   -------
1532597158613778583 45.23 127.0.0.1 1.21 4145m 1340 1313 app.service.index
1532597501578551929 45.23 127.0.0.1 1.21 4145m 1341 1312 app.service.index
1532597510225918132 45.23 127.0.0.1 1.21 4145m 1341 1312 app.service.about
1532597552421996033 45.23 127.0.0.2 1.21 4145m 1341 1312 app.service.about
A. Series

How many series do the four rows above correspond to?

As stated earlier, the tag keys and tag values determine a series (strictly speaking, measurement + retention policy + tag set), so the table above contains three series in total:

    • 127.0.0.1 | app.service.index
    • 127.0.0.1 | app.service.about
    • 127.0.0.2 | app.service.about
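The count can be checked by collecting the distinct combinations of measurement and tag values. This is a minimal sketch: the retention policy is left out because all four rows share the same one, and the variable names are only illustrative.

```python
# (host, service) tag values of the four rows in the myapp measurement.
rows = [
    ("127.0.0.1", "app.service.index"),
    ("127.0.0.1", "app.service.index"),
    ("127.0.0.1", "app.service.about"),
    ("127.0.0.2", "app.service.about"),
]

# A series key here is measurement + tag set; a set removes duplicates.
series = {("myapp", host, service) for host, service in rows}

print(len(series))  # 3
```

The first two rows share the same tag values, so they fall into the same series.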

So what exactly is a series?

Suppose we want to display the data above as a graph. How would we do it?

    • First we fix the machine and the service name, and then look at how that service performs on that machine along the timeline
    • In other words, host/service is the search condition, time is the horizontal axis, and each value (cpu, load, mem, qps, rt) is mapped to a point in two-dimensional coordinates; connecting all the points yields a continuous line

So a series is the search condition above, and with that the concept of a point is also easy to understand.
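The idea of one line per series can be sketched by grouping the four rows by their tag values and keeping (time, value) pairs for a single field. This is only an illustration using the qps column; the names `points` and `lines` are hypothetical.

```python
from collections import defaultdict

# (timestamp in ns, host, service, qps) for the four rows of myapp.
points = [
    (1532597158613778583, "127.0.0.1", "app.service.index", 1340),
    (1532597501578551929, "127.0.0.1", "app.service.index", 1341),
    (1532597510225918132, "127.0.0.1", "app.service.about", 1341),
    (1532597552421996033, "127.0.0.2", "app.service.about", 1341),
]

# One timeline per series: the key is the search condition (host, service),
# the value is the list of (time, qps) points to be connected into a line.
lines = defaultdict(list)
for ts, host, service, qps in points:
    lines[(host, service)].append((ts, qps))

for key in lines:
    lines[key].sort()  # order each timeline by time

for key, timeline in lines.items():
    print(key, timeline)
```

Each key is one series, and each list is what a charting tool would draw as one curve for that series.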

III. Retention policies

The sections above cover the basic concepts of table data. This section covers the retention policy, which determines how long data is kept (meaning data can be deleted), how many replicas are kept, how it is handled in a cluster, and so on.

1. Basic description

InfluxDB is a time series database aimed at large data volumes, so the amount of data can be very large. Storing all of it could make disk costs considerable, and some data may not need to be kept permanently, which is why the retention policy exists.

InfluxDB is not designed around routinely deleting individual pieces of data, so the usual way to control the amount of data is to define a data retention policy.

The purpose of defining a data retention policy is therefore to let InfluxDB know which data can be discarded, so that it can process the data more efficiently.

2. Basic operations
A. Querying policies
> show retention policies on hh_test
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        true
    • name: the policy name
    • duration: the retention period; 0 means data is kept permanently
    • shardGroupDuration: the time span covered by one shard group. The shard group is a basic storage structure of InfluxDB; queries over data spanning more than this duration are likely to be less efficient
    • replicaN: the replication factor, i.e. the number of copies
    • default: whether this is the default policy
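When no shard group duration is given, InfluxDB derives one from the retention duration, which is why autogen (duration 0s) shows 168h0m0s. The rule below is a sketch based on my understanding of the InfluxDB 1.x documentation, not something stated in this article; treating "6 months" as 180 days is an assumption, and `default_shard_group_duration` is a hypothetical name.

```python
from datetime import timedelta


def default_shard_group_duration(rp_duration):
    """Approximate default shard group duration for a retention duration.

    Assumed rule: infinite (0) or longer than about 6 months -> 7 days,
    at least 2 days -> 1 day, anything shorter -> 1 hour.
    """
    if rp_duration == timedelta(0) or rp_duration >= timedelta(days=180):
        return timedelta(days=7)
    if rp_duration >= timedelta(days=2):
        return timedelta(days=1)
    return timedelta(hours=1)


print(default_shard_group_duration(timedelta(0)))        # infinite retention
print(default_shard_group_duration(timedelta(hours=2)))  # a short-lived policy
```

An infinite policy maps to 7 days, which equals the 168h in the autogen row.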
B. New policy
> create retention policy "2_hour" on hh_test duration 2h replication 1 default
> show retention policies on hh_test
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        false
2_hour  2h0m0s   1h0m0s             1        true
C. Modifying policies
> alter retention policy "2_hour" on hh_test duration 4h default
> show retention policies on hh_test
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        false
2_hour  4h0m0s   1h0m0s             1        true
D. Deleting a policy
> drop retention policy "2_hour" on hh_test
> show retention policies on hh_test
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        false

After the default policy is deleted, no policy is marked as default. Does that cause a problem?

3. Understanding RP

After a policy is set, expired data is deleted automatically. So how is the data actually stored?

Take the default permanent policy as an example: its shardGroupDuration is 7 days, meaning that 7 days of data go into one shard group, after which a new one is added.

A shard contains the actual encoded and compressed data and is represented by a TSM file on disk. Each shard belongs to exactly one shard group. A single shard group may contain multiple shards. Each shard contains a specific set of series. All points of a given series in a given shard group are stored in the same shard (TSM file) on disk.
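How a point lands in a time bucket and when that bucket becomes removable can be sketched as follows. This is a simplified model under stated assumptions: buckets are aligned to the Unix epoch (InfluxDB's actual alignment may differ), data is assumed to be dropped a whole shard group at a time once the group's entire window is older than the retention duration, and all function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def shard_group_window(ts, shard_group_duration):
    """Return (start, end) of the bucket containing ts (epoch-aligned: assumption)."""
    n = (ts - EPOCH) // shard_group_duration  # whole buckets since the epoch
    start = EPOCH + n * shard_group_duration
    return start, start + shard_group_duration


def is_expired(window_end, rp_duration, now):
    """True once the whole window lies outside the retention period."""
    if rp_duration == timedelta(0):  # duration 0 means keep forever
        return False
    return window_end < now - rp_duration


# Hypothetical point time, with a 1h shard group and a 2h retention duration.
ts = datetime(2018, 7, 26, 9, 25, 58, tzinfo=timezone.utc)
start, end = shard_group_window(ts, timedelta(hours=1))
print(start, end)

now = datetime(2018, 7, 26, 12, 30, tzinfo=timezone.utc)
print(is_expired(end, timedelta(hours=2), now))
```

Under this model a point can outlive its nominal retention duration by up to one shard group duration, because expiry is decided for the whole bucket.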

IV. Other

1. A grey and grey blog: https://liuyueyi.github.io/hexblog

A grey and grey's personal blog, which records everything from study and work. Everyone is welcome to visit.

2. Disclaimer

Believing everything in books is worse than having no books at all. The content above is purely one person's opinion. Because my ability is limited, omissions and errors are inevitable; if you find a bug or have a better suggestion, criticism and corrections are welcome and greatly appreciated.

    • Weibo address: small Gray Gray Blog
    • QQ: A grey/grey/3302797840
