As an emerging leader in the field of application performance, OneAPM recently released a heavyweight new product, the Cloud Insight data management platform, which monitors all underlying components and manages data via tags.
Recently, Cloud Insight (Ci) launched its probe dashboard feature: after the probe is installed and a platform service is configured, a corresponding dashboard containing all of the data is generated automatically. This article focuses on several Redis monitoring metrics and some noteworthy details, in the hope of being helpful to readers who use Redis.
Dashboard
Querying data over any time period
Previously only the most recent hour of data was shown. Now you can select a fixed period on the dashboard, such as the last 7 or 15 days, or define a custom time range; by default, the last 30 minutes of data are displayed.
Data filtering
Given the complexity of today's businesses, an application is often deployed across multiple servers, which requires monitoring several servers at once; if you only want to see the metrics of one particular server, the dashboard comes in handy. Typically, dashboard data aggregates data from multiple servers; to see an individual server's data, you can filter by host name. There are several other filter conditions as well, such as device and URL tags; Docker, for example, can filter by image. Later on, we will launch custom dashboards, making it easy to aggregate, filter, and retrieve data via tags.
Monitoring Redis Metrics
Cloud Insight monitors the following Redis metrics:
1 aof.last_rewrite_time Duration (in seconds) of the last AOF rewrite operation
2 aof.rewrite Number of AOF rewrite operations
3 clients.biggest_input_buf Largest input buffer among current client connections
4 clients.blocked Number of blocked clients
5 clients.longest_output_list Longest output list among current client connections
6 cpu.sys System CPU used by the Redis server
7 cpu.sys_children System CPU used by background processes
8 cpu.user User CPU used by the Redis server
9 cpu.user_children User CPU used by background processes
10 info.latency_ms Average latency of the Redis server's responses, in milliseconds
11 keys.evicted Number of keys evicted due to the memory limit
12 keys.expired Total number of keys expired since startup
13 mem.fragmentation_ratio Ratio of used_memory_rss to used_memory; normally used_memory_rss is slightly higher than used_memory. A larger ratio reflects more memory fragmentation
14 mem.lua Memory used by the Lua engine
15 mem.peak Peak memory usage
16 mem.rss Memory allocated to Redis by the system (i.e., resident memory)
17 mem.used Memory in use, in bytes
18 net.clients Number of connected clients
19 net.commands Commands processed per second
20 net.rejected Number of connections rejected because the client connection limit was reached
21 net.slaves Number of connected slaves
22 perf.latest_fork_usec Duration (in microseconds) of the last fork operation
23 pubsub.channels Number of publish/subscribe channels currently in use
24 pubsub.patterns Number of publish/subscribe patterns currently in use
25 rdb.bgsave Whether a background save (persisting data via a child process) is in progress
26 rdb.changes_since_last Number of changes since the last RDB dump
27 rdb.last_bgsave_time Timestamp of the last save
28 replication.master_repl_offset Global replication offset
29 stats.keyspace_hits Number of successful key lookups in the main dictionary
30 stats.keyspace_misses Number of failed key lookups in the main dictionary
Of the Redis metrics above, this article focuses on several, grouped as follows:
Performance metrics, memory metrics, basic activity metrics, persistence metrics, and error metrics
Performance metrics
A low error rate and good performance are among the top indicators of system health.
Indicator: Latency
Latency measures the time between a client sending a request and the server actually responding. Tracking latency is the most direct way to detect changes in Redis performance. Because Redis is single-threaded, outliers in the latency distribution can cause serious bottlenecks: one slow request increases the latency of every request queued behind it. Once you have identified a latency problem, you need to take steps to diagnose and resolve it.
Indicator: instantaneous_ops_per_sec
Tracking how many commands a Redis instance processes is critical for diagnosing high latency, which can be caused by a backlog in the command queue, slow commands, network connection timeouts, and so on. Measure the number of commands processed per second: if it stays nearly constant, the cause is not a computationally intensive command; if one or more slow commands are causing the latency problem, you will see the commands-per-second figure drop or stall.
Comparing the number of commands processed per second against historical data can signal either low command volume or a slow command blocking the system. Low command volume may be normal, or it may indicate an upstream problem.
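As a small sketch of that comparison (the window size and drop threshold are illustrative assumptions, not Cloud Insight defaults), a sudden drop in instantaneous_ops_per_sec relative to its recent history can be flagged like this:

```python
from collections import deque

def make_throughput_monitor(window=60, drop_ratio=0.5):
    """Return a checker that flags when ops/sec falls far below its recent average."""
    history = deque(maxlen=window)

    def check(ops_per_sec):
        baseline = sum(history) / len(history) if history else None
        history.append(ops_per_sec)
        # Flag only once we have a baseline and the new sample is far below it.
        return baseline is not None and ops_per_sec < baseline * drop_ratio

    return check

check = make_throughput_monitor(window=5)
for sample in [1000, 1020, 990, 1010]:
    check(sample)          # builds the baseline, no alert
print(check(200))          # well below the baseline -> True
```

Each sample would come from the instantaneous_ops_per_sec field of INFO; the rolling average stands in for "historical data" in the simplest possible way.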
Indicator: Hit rate
When using Redis as a cache, monitoring the cache hit rate tells you whether the cache is being used effectively. A low hit rate means clients are looking for keys that no longer exist. Redis does not expose a hit-rate metric directly, but we can calculate it as keyspace_hits / (keyspace_hits + keyspace_misses).
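A minimal sketch of that calculation, using the two counters from the stats section of INFO (the sample values are made up):

```python
def hit_rate(keyspace_hits, keyspace_misses):
    """Cache hit rate = hits / (hits + misses); None if there were no lookups yet."""
    total = keyspace_hits + keyspace_misses
    return keyspace_hits / total if total else None

# Sample counters as they might appear in the "stats" section of INFO:
print(hit_rate(9500, 500))   # 0.95
```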
The keyspace_misses metric is discussed in the Error metrics section.
A low hit rate can be caused by a number of factors, including data expiration and insufficient memory allocated to Redis (which causes key eviction). A low hit rate increases your application's latency, because it has to fetch data from a slower alternative source.
Memory metrics
Indicator: used_memory
Memory usage is a key component of Redis performance. If used_memory exceeds the total available system memory, the operating system will begin swapping out old or unused regions of memory. Every swapped page written to disk severely affects performance: reading from and writing to disk is up to 5 orders of magnitude (100,000x!) slower than accessing memory (roughly 0.1 µs for memory versus 10 ms for disk).
You can configure Redis to stay within a given amount of memory. Setting the maxmemory directive in redis.conf gives you direct control over how much memory Redis uses. Along with maxmemory, configure an eviction policy to determine how Redis should free memory when the limit is reached.
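For example, a redis.conf fragment that caps memory and picks an eviction policy might look like this (the 1gb limit and the choice of allkeys-lru are arbitrary illustrations, not recommendations):

```conf
# Cap Redis at 1 GB of memory...
maxmemory 1gb
# ...and evict the least recently used key (from all keys) when the cap is hit.
maxmemory-policy allkeys-lru
```

The available policy names are listed later in this article.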
Indicator: mem_fragmentation_ratio
The mem_fragmentation_ratio metric shows the ratio between the memory the operating system has allocated to Redis and the memory Redis has requested.
Understanding mem_fragmentation_ratio is an important step in understanding the performance of your Redis instance. A fragmentation ratio greater than 1 indicates that fragmentation is occurring; above 1.5 indicates excessive fragmentation, with your Redis instance consuming 150% of the physical memory it requested. A fragmentation ratio below 1 means Redis needs more memory than your system has available, which results in swapping, and swapping to disk causes a significant increase in latency. Ideally, the operating system allocates a contiguous segment of physical memory, and the fragmentation ratio equals 1 or is slightly above it.
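The thresholds above can be sketched as a small classifier (the wording of the status strings is my own; the 1.0 and 1.5 cutoffs come from the text):

```python
def fragmentation_status(used_memory_rss, used_memory):
    """Classify mem_fragmentation_ratio = used_memory_rss / used_memory."""
    ratio = used_memory_rss / used_memory
    if ratio < 1.0:
        return ratio, "swapping likely: Redis wants more memory than the OS can give"
    if ratio > 1.5:
        return ratio, "heavy fragmentation: consider restarting the instance"
    return ratio, "healthy"

# Sample values in bytes, as they might appear in the memory section of INFO:
print(fragmentation_status(1_600_000, 1_000_000))  # ratio 1.6 -> heavy fragmentation
```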
If your server's fragmentation ratio is above 1.5, restarting the Redis instance lets the operating system reclaim memory that was previously unusable because of fragmentation.
Of course, if your Redis server fragmentation ratio is below 1, you may need to quickly increase available memory or reduce memory usage.
Indicator: evicted_keys (cache usage only)
If you are using Redis as a cache, you can configure it to evict keys automatically once it hits the maxmemory limit. If you are using Redis as a database or queue, you may prefer swapping to eviction, in which case you can skip this metric.
Tracking key eviction is important because Redis processes each operation sequentially: evicting a large number of keys lowers the hit rate and thus lengthens latency. If you are using TTLs, you may not expect keys to be evicted at all; in that case, if this metric is consistently above zero, you are likely to see latency increase on the instance. Most other configurations that do not use TTLs will eventually run out of memory and start evicting keys; as long as the resulting response times are acceptable, a stable eviction rate is acceptable.
You can configure the key expiration policy from the command line:
redis-cli CONFIG SET maxmemory-policy <policy>
In place of <policy>, you can enter any of the following:
volatile-lru removes the least recently used key among those with an expiration set
volatile-ttl removes the key with the shortest remaining time to live among those with an expiration set
volatile-random removes a random key among those with an expiration set
allkeys-lru removes the least recently used key from the set of all keys
allkeys-random removes a random key from the set of all keys
Indicator: blocked_clients
Redis provides a number of blocking commands for manipulating lists: BLPOP, BRPOP, and BRPOPLPUSH are blocking variants of LPOP, RPOP, and RPOPLPUSH respectively. When the source list is non-empty, the command executes immediately. When the source list is empty, the blocking command waits until the source is populated or a timeout is reached.
An increase in the number of blocked clients can be a sign of trouble: latency or other problems may be preventing the source list from being populated. Although a blocked client on its own does not warrant an alert, if you see a consistently nonzero value you should pay attention to this metric.
Basic activity metrics
Indicator: connected_clients
Typically, access to Redis is mediated by an application (users generally do not access the database directly), and most applications have a reasonable upper and lower bound on the number of connected clients. If the value leaves that normal range, it indicates a problem: too low, and upstream connections may have been lost; too high, and a large number of concurrent client connections may overwhelm your server's ability to handle requests.
In any case, the maximum number of client connections is always determined by the operating system, Redis configuration, and network limitations. Monitoring client connections helps ensure that you have sufficient resources available for new client connections or administrative sessions.
Indicator: connected_slaves
If your database is read-heavy, you are most likely using Redis's master-slave replication feature. In that case, monitoring the number of connected slaves is critical: if the number of connected slaves changes unexpectedly, the master may be down or a slave instance may be having problems.
Indicator: master_last_io_seconds_ago
When using Redis's replication features, slave instances regularly check in with their master. A long interval without communication can indicate a problem on the master, on the slave, or somewhere in between. Because of the way Redis performs synchronization, there is also a risk that slaves serve stale data, so minimizing interruptions to master-slave communication is critical. Whenever a slave connects to a master, whether for the first time or on reconnection, it sends a SYNC command. The SYNC command causes the master to immediately start a background save of the database to disk, while buffering all new commands that modify the dataset; when the background save completes, the buffered data is sent to the slave along with those commands. Each such synchronization can cause a significant latency spike on the master instance.
Indicator: Keyspace
It's also a good idea to keep track of the number of keys in the database. As an in-memory data store, the larger the keyspace, the more physical memory Redis needs to ensure optimal performance. Redis keeps adding keys until it reaches the maxmemory limit, at which point it begins evicting keys at the same rate new ones arrive, producing a graph that flatlines.
If you are using Redis as a cache and see a flat keyspace together with a low hit rate, clients may be requesting old or deleted data. Tracking keyspace_misses over time will help you pinpoint the cause.
If, on the other hand, you are using Redis as a database or queue, evicting keys may not be an option. As your keyspace grows, you may want to add memory to your machine or split the dataset between hosts. Adding more memory is a simple and effective solution; when more resources are needed than one server can supply, partitioning or sharding lets you combine the storage of several machines without mass relocation or swapping of keys.
Indicator: rdb_last_save_time and rdb_changes_since_last_save
In general, it is good to know how volatile your dataset is. Too long an interval between writes to disk can cause data loss in the event of a server failure: any changes made to the dataset between the last save time and the time of the failure are lost. Monitoring rdb_changes_since_last_save gives you deeper insight into your data's volatility; if the dataset changes little over a given period, a long write interval is not a problem. Tracking both metrics tells you how much data would be lost at a given point in time.
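A sketch of estimating that worst-case loss window from the two metrics (the field names follow the persistence section of INFO; the sample numbers are invented):

```python
import time

def rdb_loss_window(rdb_last_save_time, rdb_changes_since_last_save, now=None):
    """Return (seconds since the last RDB save, writes that a crash right now would lose)."""
    now = time.time() if now is None else now
    return now - rdb_last_save_time, rdb_changes_since_last_save

# 10 minutes since the last save, 4200 unsaved writes:
elapsed, at_risk = rdb_loss_window(1_700_000_000, 4200, now=1_700_000_600)
print(elapsed, at_risk)  # 600 4200
```

Alerting on either number exceeding a threshold you can tolerate is one way to act on these metrics.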
Error metrics
Indicator: rejected_connections
Redis can handle many active connections, with 10,000 client connections available by default; you can set a different maximum with the maxclients directive in redis.conf. If your Redis instance is already at its maximum number of connections, any new connection attempt is rejected.
Note that your system may not be able to support the number of connections you request with the maxclients directive. Redis checks with the kernel to determine the number of available file descriptors. If the number of available file descriptors is less than maxclients + 32 (Redis reserves 32 file descriptors for its own use), the maxclients directive is ignored and the available file descriptor count is used instead.
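The rule described above can be sketched as a small helper (a simplification of what Redis actually does at startup):

```python
def effective_maxclients(requested_maxclients, available_fds, reserved=32):
    """Redis reserves 32 file descriptors for internal use; if the kernel
    cannot supply requested_maxclients + 32 descriptors, the client limit
    is lowered to whatever fits."""
    if available_fds < requested_maxclients + reserved:
        return available_fds - reserved
    return requested_maxclients

print(effective_maxclients(10000, 4096))   # 4064: the kernel limit wins
print(effective_maxclients(10000, 20000))  # 10000: the requested limit stands
```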
See the documentation on Redis.io for more information on how Redis handles client connections.
Indicator: keyspace_misses
Every time Redis looks up a key, there are only two possible outcomes: the key exists, or it does not. Looking up a nonexistent key increments keyspace_misses. If this metric is consistently nonzero, clients are trying to look up keys that do not exist in the database. If you are not using Redis as a cache, keyspace_misses should be at or near 0. Note that an empty response to any blocking operation (BLPOP, BRPOP, and BRPOPLPUSH) also increments keyspace_misses.
Installing Redis monitoring
Installing the probe, configuring Redis
After all this dry theory, it is time to install Cloud Insight and see what it can actually show. Installation takes one command: simply copy it into the server's command line.
The default app name is the hostname; you can configure it yourself in /etc/oneapm-ci-agent/oneapm-ci-agent.conf.
The data for this host's application then appears on the web side.
With platform monitoring installed, the next step is Redis monitoring. Only simple configuration is needed: copy the redis.yaml.example template, modify the password, tags, and so on, then restart the probe, and you can see detailed Redis performance data.
After modifying the configuration file and restarting the probe, Redis monitoring is complete; now you can look at the specific metrics to understand the health of your Redis instance.
The metrics shown in the figure are those introduced at the beginning of this article; for several of them, the article has given a corresponding explanation.
More features are on the way; stay tuned!
Cloud Insight currently supports host monitoring for Ubuntu, Mac OS X, Fedora, CentOS, and RedHat.
On the platform service side, Cloud Insight already supports 17 services: ActiveMQ, Apache, Apache Tomcat, Apache Kafka, Cassandra, Couchbase, CouchDB, Docker, ElasticSearch, Memcached, MongoDB, MySQL, Nginx, PostgreSQL, PHP-FPM, Redis, and RabbitMQ. Support for Docker and PHP-FPM was added ahead of schedule at users' request, so we welcome you to work with us to build a better data management platform, and we look forward to your participation!
This article was compiled by OneAPM engineers. OneAPM is an emerging leader in application performance management, and Cloud Insight helps enterprise users and developers easily monitor infrastructure components and aggregate, filter, and search data to build a more powerful data management platform. To read more technical articles, please visit the OneAPM official blog.
Source: http://news.oneapm.com/cloud-insight-redis/ More: http://www.oneapm.com/ci/feature.html