Prometheus Remote Storage
Objective
Prometheus is now dominant in the container-cloud world: more and more cloud-native components expose a Prometheus metrics endpoint directly, without needing an extra exporter, so using Prometheus as the cluster-wide monitoring solution is a natural fit. For storing those metrics, Prometheus ships with local storage, its own TSDB time series database. The advantage of local storage is operational simplicity: starting Prometheus takes a single command, and the following two startup flags specify the data path and retention time.
- --storage.tsdb.path: path of the TSDB data directory, default data/
- --storage.tsdb.retention: how long to keep data, default 15 days
The disadvantage is that it is not suited to long-term persistence of a large volume of metrics, although the data compression introduced with Prometheus 2.0 has improved things considerably.
To get around the limits of single-node storage, Prometheus does not implement clustered storage itself; instead it exposes remote read and write interfaces, so that users can plug in a suitable time series database and scale Prometheus out.
Prometheus integrates with remote storage systems in two ways (a minimal configuration sketch follows this list):
- Prometheus writes metrics to the remote storage in a standard format
- Prometheus reads metrics back from the remote URL in a standard format
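Both settings live side by side in prometheus.yml. This is only a minimal sketch; the adapter service name, port, and read path are assumptions (the write URL reappears in the filtering example later in this article):

```yaml
# Minimal sketch: point Prometheus at a remote storage adapter.
remote_write:
  - url: "http://prometheus-remote-storage-adapter-svc:9201/write"

remote_read:
  - url: "http://prometheus-remote-storage-adapter-svc:9201/read"
```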
Below I will focus on the remote storage setup in more detail.
Remote Storage Scenarios
Configuration file
Remote Write
```yaml
# The URL of the endpoint to send samples to.
url: <string>

# Timeout for requests to the remote write endpoint.
[ remote_timeout: <duration> | default = 30s ]

# List of remote write relabel configurations.
write_relabel_configs:
  [ - <relabel_config> ... ]

# Sets the `Authorization` header on every remote write request with the
# configured username and password.
# password and password_file are mutually exclusive.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]
  [ password_file: <string> ]

# Sets the `Authorization` header on every remote write request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]

# Sets the `Authorization` header on every remote write request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the remote write request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# Configures the queue used to write to remote storage.
queue_config:
  # Number of samples to buffer per shard before we start dropping them.
  [ capacity: <int> | default = 100000 ]
  # Maximum number of shards, i.e. amount of concurrency.
  [ max_shards: <int> | default = 1000 ]
  # Maximum number of samples per send.
  [ max_samples_per_send: <int> | default = 100 ]
  # Maximum time a sample will wait in buffer.
  [ batch_send_deadline: <duration> | default = 5s ]
  # Maximum number of times to retry a batch on recoverable errors.
  [ max_retries: <int> | default = 10 ]
  # Initial retry delay. Gets doubled for every retry.
  [ min_backoff: <duration> | default = 30ms ]
  # Maximum retry delay.
  [ max_backoff: <duration> | default = 100ms ]
```
Remote Read
```yaml
# The URL of the endpoint to query from.
url: <string>

# An optional list of equality matchers which have to be
# present in a selector to query the remote read endpoint.
required_matchers:
  [ <labelname>: <labelvalue> ... ]

# Timeout for requests to the remote read endpoint.
[ remote_timeout: <duration> | default = 1m ]

# Whether reads should be made for queries for time ranges that
# the local storage should have complete data for.
[ read_recent: <boolean> | default = false ]

# Sets the `Authorization` header on every remote read request with the
# configured username and password.
# password and password_file are mutually exclusive.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]
  [ password_file: <string> ]

# Sets the `Authorization` header on every remote read request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]

# Sets the `Authorization` header on every remote read request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the remote read request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]
```
Ps
- The write_relabel_configs item in the remote write configuration takes full advantage of Prometheus's powerful relabeling capabilities, so you can filter exactly which metrics get written to the remote storage.
For example, keep only the specified metrics:
```yaml
remote_write:
  - url: "http://prometheus-remote-storage-adapter-svc:9201/write"
    write_relabel_configs:
      - action: keep
        source_labels: [__name__]
        regex: container_network_receive_bytes_total|container_network_receive_packets_dropped_total
```
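The same mechanism works in the opposite direction. As a sketch (the metric name pattern here is only an illustration), a drop rule excludes matching series from remote write instead:

```yaml
remote_write:
  - url: "http://prometheus-remote-storage-adapter-svc:9201/write"
    write_relabel_configs:
      # Drop matching series instead of keeping them; the regex is illustrative.
      - action: drop
        source_labels: [__name__]
        regex: go_gc_duration_seconds.*
```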
- The external_labels item in the global configuration is worth setting whenever Prometheus federation or remote read/write is in use, so that series coming from different clusters can be told apart.
```yaml
global:
  scrape_interval: 20s
  # The labels to add to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    cid: '9'
```
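On the read side, the same cluster label can be combined with required_matchers from the reference above, so the remote endpoint is only consulted for selectors that explicitly carry it. This is only a sketch, and the read URL is an assumption:

```yaml
remote_read:
  - url: "http://prometheus-remote-storage-adapter-svc:9201/read"
    # Only query the remote endpoint when a selector carries this cluster's label.
    required_matchers:
      cid: '9'
```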
Existing Remote Storage Solutions
The community has implemented the following remote storage integrations so far:
- AppOptics: write
- Chronix: write
- Cortex: read and write
- CrateDB: read and write
- Elasticsearch: write
- Gnocchi: write
- Graphite: write
- InfluxDB: read and write
- OpenTSDB: write
- PostgreSQL/TimescaleDB: read and write
- SignalFx: write
Some of the storage systems above only support writes. If you dig into the adapter source code, however, remote reading can usually be supported too; whether query matching works well depends on whether the store supports regular expressions. The next section will walk through prometheus-postgresql-adapter and how to implement an adapter.
Solutions that support both remote read and write
- Cortex, from Weaveworks, wraps an extra layer around the whole Prometheus architecture and pulls in quite a few components, so it is somewhat complex to operate.
- The open source version of InfluxDB does not support clustering. With a large volume of metrics the write pressure is heavy, and the influxdb-relay scheme is not truly highly available. Eleme has open-sourced an influxdb-proxy; those interested can try it.
- CrateDB is based on Elasticsearch. I do not know much about it.
- TimescaleDB is my personal preference. Traditional ops teams are very familiar with PostgreSQL, which makes it dependable to operate, and it currently supports high availability via streaming replication.
Postscript
In fact, if the collected metrics are also used for data analysis, the ClickHouse database is worth considering: it has a clustering scheme, good write performance, and remote read and write can be supported. I am still looking into it and will write a dedicated article once there are concrete results. For now, our persistence plan is to use TimescaleDB.