Objective
InfluxDB is an open-source time-series database written in Go, with no external dependencies, used for recording metrics and events and for data analysis. This article targets InfluxDB versions 1.2.0 and above, and only discusses InfluxDB as the data store in a monitoring system, not the full monitoring stack that InfluxDB offers. It covers five topics: time-series database selection, InfluxDB basic concepts, the storage engine, practice, and data aggregation.
Selection
Influxdb vs Prometheus
InfluxDB builds on familiar concepts: its query syntax is SQL-like and its engine is an optimized LSM variant, so the learning cost is relatively low.
InfluxDB supports floats, integers, strings, and booleans; Prometheus currently only supports floats.
InfluxDB timestamps have nanosecond precision; Prometheus timestamps have millisecond precision.
InfluxDB is just a database, while Prometheus provides a complete monitoring solution (although a complete monitoring stack is also available around InfluxDB).
InfluxDB supports fewer math functions than Prometheus, but what it has satisfies our current needs.
In 2015 Prometheus was still in the development phase, while InfluxDB was relatively more stable.
InfluxDB supports event logs; Prometheus does not.
For a more detailed comparison, please refer to: comparison.
All we needed was a database; the other components were developed in-house, and the data we store is not just numeric, so we chose InfluxDB. I hope the comparison above is helpful.
INFLUXDB Basic Concepts
Database
A database is a logical container holding measurements, retention policies, continuous queries, and time-series data, similar to a database in MySQL.
Measurement
A measurement describes the storage structure of related data, similar to a table in MySQL, except that it does not need to be created in advance: it is created automatically when data is first written. For schema design recommendations, see: design recommendations.
Line Protocol
Line protocol defines the data write format for Influxdb, as follows:
weather,location=us,server=host1 temperature=82 1465839830100400200
\------/ \---------------------/ \------------/ \-----------------/
measurement        tag_set         field_set         timestamp
Tag
In the example above, location and server are tag keys, and us and host1 are tag values. Tags are optional, but it is best to include tags when writing data, because tags are indexed. Tag values can only be strings.
Field
In the example above, temperature is the field key and 82 is the field value. Fields hold the actual values being recorded; the supported field types are floats, integers, strings, and booleans.
Timestamp
The timestamp format is RFC3339 UTC, with nanosecond precision by default. The timestamp is optional; if omitted, the server's time is used.
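To make the tag/field/timestamp rules concrete, here is a minimal, hypothetical helper (not part of any official client library) that assembles a line-protocol string from a measurement, a tag dict, a field dict, and an optional nanosecond timestamp:

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Build an InfluxDB line-protocol string (simplified: no escaping).

    tags:      dict of str -> str   (tag values are always strings)
    fields:    dict of str -> float/int/str/bool
    timestamp_ns: optional integer nanosecond timestamp
    """
    tag_part = "".join(",{}={}".format(k, v) for k, v in sorted(tags.items()))

    def fmt_field(v):
        if isinstance(v, bool):          # bool before int: bool is an int subclass
            return "true" if v else "false"
        if isinstance(v, int):
            return "{}i".format(v)       # integer fields carry an 'i' suffix
        if isinstance(v, str):
            return '"{}"'.format(v)      # string fields are double-quoted
        return repr(v)                   # floats

    field_part = ",".join("{}={}".format(k, fmt_field(v))
                          for k, v in sorted(fields.items()))

    line = "{}{} {}".format(measurement, tag_part, field_part)
    if timestamp_ns is not None:
        line += " {}".format(timestamp_ns)
    return line

print(to_line_protocol("weather", {"location": "us", "server": "host1"},
                       {"temperature": 82.0}, 1465839830100400200))
# -> weather,location=us,server=host1 temperature=82.0 1465839830100400200
```

Note this sketch skips the escaping rules for spaces, commas, and quotes that a real client must handle.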
Series
Data points that share the same measurement, tag set, and retention policy belong to one series. Understanding this concept is critical, because the series index is kept in memory: too many series can cause an OOM.
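To see why this matters, note that for one measurement and retention policy the series count is the product of the number of distinct values of each tag key. A small sketch (tag names and counts are illustrative):

```python
from itertools import product

# Distinct values per tag key for a hypothetical "weather" measurement.
tag_values = {
    "location": ["us", "eu", "cn"],                   # 3 values
    "server": ["host%d" % i for i in range(100)],     # 100 values
}

# Each unique tag-set combination is one series.
series = set(product(*tag_values.values()))
print(len(series))  # 3 * 100 = 300 series

# Adding a high-cardinality tag (e.g. a request id) multiplies this
# number, which is exactly what blows up memory.
```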
Retention Policy
A retention policy sets how long data is kept and how many replicas exist in a cluster. The default policy is named autogen, retains data forever, and has a replication factor of 1. These settings can be changed when the database is created.
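As an illustration, a policy that keeps data for 30 days with one replica can be created like this (database and policy names are made up for the example):

```sql
-- create the database with a custom default retention policy
CREATE DATABASE "telemetry" WITH DURATION 30d REPLICATION 1 NAME "one_month"

-- or add / change a policy on an existing database
CREATE RETENTION POLICY "one_month" ON "telemetry" DURATION 30d REPLICATION 1 DEFAULT
```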
Continuous Query
A CQ is a pre-configured query that InfluxDB runs automatically on a schedule, writing the results into a specified measurement; it is used mainly for data aggregation. For details, see: CQ.
Shard
A shard stores the data for a given time interval. Each shard is one directory, named after the shard ID, with its own cache, WAL, TSM files, and compactor. This design lets a query quickly locate the relevant resources by time, speeding up queries, and it makes bulk deletion of expired data simple and efficient.
Storage Engine
Overview
The TSM tree is a lightly modified, optimized variant of the LSM tree. It consists of four main parts: the cache, the WAL, TSM files, and the compactor.
Cache
When data is inserted, it is written to the cache and appended to the WAL; the cache can be thought of as the in-memory view of the data in the WAL files.
WAL
The write-ahead log, comparable to MySQL's binlog. Its purpose is durability: when the system crashes, the cache is rebuilt from the WAL files.
TSM File
Each TSM file is at most 2GB in size. When the cache reaches the cache-snapshot-memory-size or cache-max-memory-size limit, its contents are flushed into a TSM file.
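These thresholds live in the [data] section of influxdb.conf; a sketch of the relevant entries (the sizes shown are illustrative defaults, check your version's reference configuration):

```toml
[data]
  # snapshot the cache to a new TSM file once it reaches this size
  cache-snapshot-memory-size = "25m"
  # stop accepting writes once the cache grows past this size
  cache-max-memory-size = "1g"
```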
Compactor
The compactor performs two main operations. One is snapshotting: once the cache reaches its threshold, it is written out as a new TSM file. The other is compaction: merging several small TSM files into one to reduce the file count, and carrying out data deletions along the way. Both run automatically in the background.
Directory structure
InfluxDB stores data in three directories: meta, wal, and data. The meta directory holds metadata for the databases, in a single meta.db file. The wal directory holds the write-ahead log files, ending in .wal. The data directory holds the actual data files, ending in .tsm. The basic layout is as follows:
-- wal
   -- test
      -- autogen
         -- 1
            -- _00001.wal
         -- 2
            -- _00002.wal
-- data
   -- test
      -- autogen
         -- 1
            -- 000000001-000000001.tsm
         -- 2
            -- 000000001-000000010.tsm
-- meta
   -- meta.db
Here test is the database name and autogen is the retention policy name. The directories at the next level are named after shard IDs; for example, the autogen policy has two shards with IDs 1 and 2, each storing data for a period of time. Below that are the actual files, ending in .wal and .tsm respectively.
For more detail, see: InfluxDB TSM storage engine analysis.
Practice
Project Introduction
Gateway receives influxdb data over UDP, then validates and compresses it for transmission across data centers.
Influxdb-relay is the official high-availability solution, but it only provides simple write functionality.
Influxdb-proxy is our high-availability solution, developed to replace Influxdb-relay.
Architecture diagram (before)
Usage issues
Influxdb-relay is the official high-availability solution, but it only provides simple write functionality. In the beginning this was not much of a problem, but as InfluxDB was promoted inside the company, more and more services were onboarded, which meant more and more queries and brought the following issues:
Grafana needs to be configured with multiple data sources.
Users cannot subscribe to data by measurement.
When a database hangs, the Grafana data source has to be modified.
Maintenance is difficult: for example, adding a database means every user has to configure another data source; there is no unified access point.
Users query the database directly; a select * can OOM the database and force a restart.
Relay provides a retry function that buffers failed writes in memory; once an influxdb instance hangs, memory usage on the relay machine explodes.
Pitfalls we hit
If max-row-limit is not 0, influxdb can OOM. That bug has since been fixed, but a non-zero limit still causes display problems in Grafana, so set it to 0.
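The corresponding setting sits in the [http] section of influxdb.conf:

```toml
[http]
  # 0 = unlimited; non-zero values truncate result sets (and, in older
  # versions, could trigger the OOM described above)
  max-row-limit = 0
```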
Configuring the query-restriction parameters caused some strange problems; officially these are left unlimited, so keep the default configuration.
Without a schema specification, one client wrote field values as tags, which made memory usage explode and eventually caused an OOM. It is important to understand the concept of series.
The write timeout defaults to 10s; sometimes the data is written successfully but a 500 is returned anyway. This timeout can be increased.
Optimized architecture diagram
Influxdb-proxy was developed to solve these problems. It has the following features:
Supports both writes and queries through a unified access point, behaving like a cluster.
Supports replay: failed writes are spooled to a file and written again once the backend recovers.
Restricts some query commands and all delete operations.
Partitions data at measurement granularity and supports on-demand subscription.
Measurement lookup tries an exact match first, then a prefix match.
Provides statistics such as QPS and query latency.
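The measurement routing rule (exact match first, then prefix match) might look like the following hypothetical sketch; the function and the mapping are illustrative, not influxdb-proxy's real configuration:

```python
def route(measurement, backends, default=None):
    """Pick the backends for a measurement: an exact match wins,
    then the longest matching prefix, then an optional default.

    backends: dict mapping measurement name or prefix -> list of backend URLs
    """
    if measurement in backends:
        return backends[measurement]
    # try longest prefixes first so "cpu.load" beats "cpu"
    for prefix in sorted(backends, key=len, reverse=True):
        if measurement.startswith(prefix):
            return backends[prefix]
    return default

rules = {"cpu": ["http://influx-a:8086"],
         "cpu.load": ["http://influx-b:8086"]}
print(route("cpu.load", rules))       # exact match -> influx-b
print(route("cpu.idle", rules))       # prefix match on "cpu" -> influx-a
print(route("mem.used", rules, []))   # no match -> default
```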
Data aggregation
CQ
InfluxDB provides data aggregation through the continuous queries mentioned in the basic concepts above. By predefining CQs, you can periodically aggregate data along different tags. There is one design problem: CQs execute sequentially, so the more CQs there are, the higher the data latency, typically a few minutes. If more real-time aggregation is needed, CQs are not enough, and other tools such as Spark have to be introduced. For CQ syntax, please refer to: syntax.
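A typical CQ that downsamples raw points into 5-minute means might look like this (the database, measurement, and field names are illustrative):

```sql
CREATE CONTINUOUS QUERY "cq_cpu_5m" ON "telemetry"
BEGIN
  SELECT mean("usage") AS "usage"
  INTO "telemetry"."autogen"."cpu_5m"
  FROM "cpu"
  GROUP BY time(5m), *
END
```

GROUP BY time(5m), * keeps all tags, so the downsampled series can still be filtered the same way as the raw data.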
Spark
After an internal investigation, we found Spark + Kafka to be a better aggregation solution. Spark supports streaming as well as SQL, so we only need to translate our CQs into SQL. This is currently in a trial phase, with part of the functionality already in production. The current processing flow is as follows:
Summarize
The architecture described above already supports monitoring of Ele.me's 20,000 machines, with writes currently at 300k points per second. The backend runs on about 20 influxdb machines, and the maintenance cost is close to zero. Our focus has now shifted from influxdb itself to data aggregation and analysis.
Transcript of: "Liu Ping: InfluxDB in practice at Ele.me"