Recently a spam-control project ran into a problem: users reported that they could not like posts. Investigation showed that a Redis key had no expiration time, so its state was never cleared. The code involved needs transaction-like behavior.
Business logic
For each user we keep counts in three dimensions: per minute, per hour, and per day. When a count reaches a certain threshold, the user is considered to be operating too frequently and the operation is treated as spam. The Redis code involved works as follows:
Original business logic
The logic, step by step:
1. Run INCR on the given key and get the value after the increment
2. Use that value to determine whether this is the first time the key has been set
3. If the value is 1, set an expiration time
INCR and setting the TTL should really be one transaction. In MySQL terms, the two operations belong between START TRANSACTION and COMMIT. Here they are not a transaction, so something can go wrong between the two operations and leave the key without a TTL.
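The original code is not reproduced here, so the following is only a minimal sketch of the three steps, not the project's actual implementation. `FakeRedis` is a hypothetical in-memory stand-in that mimics the `incr`, `expire`, and `ttl` methods of a redis-py style client, and `count_operation` and the key name are made up for illustration.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in; TTLs are stored but never count down."""

    def __init__(self):
        self.values = {}
        self.ttls = {}

    def incr(self, key):
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        if key not in self.values:
            return False  # like Redis EXPIRE, a no-op on a missing key
        self.ttls[key] = seconds
        return True

    def ttl(self, key):
        if key not in self.values:
            return -2  # key does not exist
        return self.ttls.get(key, -1)  # -1 means "exists, no expiration"


def count_operation(client, key, window_seconds):
    """INCR the counter, then set the TTL only on the first increment."""
    value = client.incr(key)
    if value == 1:
        # Separate command: if it is lost or misrouted, the key never expires,
        # because every later INCR returns a value greater than 1.
        client.expire(key, window_seconds)
    return value


client = FakeRedis()
count_operation(client, "spam:like:42:minute", 60)
```

The gap between `incr` and `expire` is the window in which the TTL can be lost.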
Cache cluster Issues
The back-end Redis cache cluster is Twemproxy + Redis with auto_eject_hosts turned on. The logs show that Twemproxy occasionally ejects a back-end Redis and later adds it back. When that happens, the following sequence is likely:
1. The count key has just been INCRed when its Redis instance is ejected
2. The return value is 1, so the TTL is set, but by this point the key distribution has changed
3. The back-end Redis is added back to the proxy and the key distribution returns to what it was in step 1, so the current key has lost its TTL
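The sequence above can be replayed with a toy model. This is a sketch under loud assumptions: `Node` is an in-memory stand-in for one Redis instance, and `ToyProxy` is a hypothetical router that sends every key to the first live node. It does not reproduce Twemproxy's real hashing or ejection logic; it only shows why the TTL ends up missing.

```python
class Node:
    """In-memory stand-in for one Redis backend (TTL stored, no real clock)."""

    def __init__(self):
        self.values = {}
        self.ttls = {}

    def incr(self, key):
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        if key not in self.values:
            return False  # EXPIRE on a missing key does nothing
        self.ttls[key] = seconds
        return True

    def ttl(self, key):
        if key not in self.values:
            return -2
        return self.ttls.get(key, -1)


class ToyProxy:
    """Hypothetical router: every key goes to the first node that is live."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.live = set(range(len(nodes)))

    def eject(self, index):
        self.live.discard(index)

    def restore(self, index):
        self.live.add(index)

    def route(self, key):
        for index, node in enumerate(self.nodes):
            if index in self.live:
                return node
        raise RuntimeError("no live backend")


def replay_lost_ttl():
    proxy = ToyProxy([Node(), Node()])
    key = "spam:like:42:minute"  # made-up key name
    value = proxy.route(key).incr(key)          # step 1: INCR lands on node 0
    proxy.eject(0)                              # node 0 is ejected
    applied = proxy.route(key).expire(key, 60)  # step 2: EXPIRE lands on node 1
    proxy.restore(0)                            # step 3: node 0 comes back
    ttl_after = proxy.route(key).ttl(key)       # key on node 0 has no TTL
    next_value = proxy.route(key).incr(key)     # 2, so EXPIRE is never retried
    return value, applied, ttl_after, next_value
```

In this model `replay_lost_ttl()` ends with the counter at 2 and a TTL of -1, so the "value is 1" branch never runs again for that key.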
Improvements
With the cause understood, making improvements is much easier. The first step is to use redis-port to clear out all keys that have no TTL; for how to use redis-port, refer to my other posts. The second step is to add multiple layers of checks. Two implementations were written at the time, the first of which is:
Improvement scheme one
This scheme sets the TTL whenever the value is 1 to 4. Assuming the probability of one set failing is 1%, the probability of all four failing is 1% * 1% * 1% * 1% = 10^-8. The idea is to drive the failure rate down by setting the TTL several times.
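A minimal sketch of scheme one, assuming a client with redis-py style `incr`, `expire`, and `ttl` methods. `FlakyRedis` is a hypothetical in-memory stand-in that drops a configurable number of `expire` calls, and the function names are made up for illustration. The probability helper assumes the failures are independent, which is itself an assumption.

```python
class FlakyRedis:
    """Hypothetical stand-in that silently drops the first N expire calls."""

    def __init__(self, dropped_expires=0):
        self.values = {}
        self.ttls = {}
        self.dropped_expires = dropped_expires

    def incr(self, key):
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        if self.dropped_expires > 0:
            self.dropped_expires -= 1  # simulate a lost or misrouted EXPIRE
            return False
        self.ttls[key] = seconds
        return True

    def ttl(self, key):
        if key not in self.values:
            return -2
        return self.ttls.get(key, -1)


def count_with_repeated_expire(client, key, window_seconds, attempts=4):
    """Set the TTL on each of the first `attempts` increments, not only the first."""
    value = client.incr(key)
    if value <= attempts:
        client.expire(key, window_seconds)
    return value


def all_fail_probability(single_failure_rate, attempts):
    """Chance that every attempt fails, assuming independent failures."""
    return single_failure_rate ** attempts


client = FlakyRedis(dropped_expires=3)
for _ in range(4):
    count_with_repeated_expire(client, "spam:like:42:minute", 60)
print(client.ttl("spam:like:42:minute"))  # the fourth attempt sets the TTL
```

One side effect to be aware of: each successful `expire` on increments 2 to 4 restarts the expiration, so the counting window can stretch slightly.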
However, this scheme rests on a fuzzy assumption, and across many millions of operations the number of keys left without a TTL is still considerable. In the end we adopted scheme two:
Improvement scheme two
After every INCR, fetch the TTL and use the return value to decide whether the expiration needs to be set. This costs one extra Redis operation per business request, which is entirely acceptable in terms of performance.
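A minimal sketch of scheme two, again assuming a client with redis-py style `incr`, `expire`, and `ttl` methods, where `ttl` returns -1 for a key that exists without an expiration. `FlakyRedis` and `count_with_ttl_check` are hypothetical names for illustration, not the project's code.

```python
class FlakyRedis:
    """Hypothetical stand-in that silently drops the first N expire calls."""

    def __init__(self, dropped_expires=0):
        self.values = {}
        self.ttls = {}
        self.dropped_expires = dropped_expires

    def incr(self, key):
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        if self.dropped_expires > 0:
            self.dropped_expires -= 1  # simulate a lost or misrouted EXPIRE
            return False
        self.ttls[key] = seconds
        return True

    def ttl(self, key):
        if key not in self.values:
            return -2  # key does not exist
        return self.ttls.get(key, -1)  # -1: exists but has no expiration


def count_with_ttl_check(client, key, window_seconds):
    """INCR, then repair the expiration whenever the key has none."""
    value = client.incr(key)
    if client.ttl(key) == -1:
        # Any request can heal a key that lost its TTL, whatever its value.
        client.expire(key, window_seconds)
    return value


client = FlakyRedis(dropped_expires=5)
for _ in range(6):
    count_with_ttl_check(client, "spam:like:42:minute", 60)
print(client.ttl("spam:like:42:minute"))  # set by the sixth request
```

Unlike scheme one, the repair is not limited to the first few increments: as long as requests keep arriving, a missing TTL is retried on every one of them.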
Redis transactions
Some readers will surely ask whether Redis's own transactions could be used. Because we go through a proxy, we cannot use them. Redis transactions are also not full transactions and are of limited value.
Granularity/frequency control service
This kind of counting service is currently implemented with fixed time periods. Ideally it should be a granularity/frequency control service based on a sliding time window, which is the next thing we will build.
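To illustrate the difference, here is a sketch of a sliding-window counter. It is not the planned service, only a single-process, in-memory illustration of the idea; the class name, the threshold semantics, and the explicit `now` argument are all assumptions made for the example.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts operations in the last `window_seconds`, measured from each call."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.events = deque()  # timestamps of recent operations, oldest first

    def record(self, now):
        """Record one operation at time `now`; return True if over the threshold."""
        self.events.append(now)
        cutoff = now - self.window_seconds
        # Drop operations that have slid out of the window.
        while self.events and self.events[0] <= cutoff:
            self.events.popleft()
        return len(self.events) > self.threshold


counter = SlidingWindowCounter(window_seconds=60, threshold=3)
for second in (0, 50, 55, 59):
    flagged = counter.record(second)
print(flagged)  # four operations within 60 seconds exceed the threshold of 3
```

With fixed periods, a burst that straddles a period boundary is split across two counters and may stay under the threshold in both; a sliding window counts the burst as a whole.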
Afterword
Services in production run into all kinds of unexpected problems. Perhaps that is where the fun comes from.