Scaling SQL with Redis



I love Redis. It is the kind of technology so obviously useful that it makes you wonder why it took so long for something like it to exist. Predictable, performant, and adaptable, it is something I have come to rely on more and more over the past few years. It is also no secret that Sentry runs primarily on PostgreSQL (though it still relies on a handful of other technologies).

 

A week or so ago I gave a keynote at Python Nordeste. To some extent I could only cover quick hits, so I decided to pick a few hacks and talk about Sentry's heavy use of Redis in particular. This article is an extension of that five-minute topic.

Easing Row Contention

 

One of the things we developed early on in Sentry became known as sentry.buffers. It is a simple system that lets us implement very efficient buffered counters with a simple last-write-wins strategy. Importantly, we shed almost all forms of durability with it (which is a very acceptable trade-off for the way Sentry works).

This operation is quite simple. Every time an update comes in, we do the following:

1. Create a hash key bound to the given object

2. Increment the counter with HINCRBY

3. HSET the various LWW data (for example, "last seen")

4. ZADD the hash key to a 'pending' set, using the current timestamp as the score
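The four steps above can be sketched in Python. The calls mirror redis-py's command names, but to keep the sketch runnable without a live server it uses a tiny in-memory stand-in; the key names ('counters:<id>', 'pending') and the field names are illustrative, not Sentry's actual schema.

```python
import time

class MiniRedis:
    """In-memory stand-in for the three Redis commands used below (illustrative only)."""
    def __init__(self):
        self.hashes = {}   # hash key -> {field: value}
        self.zsets = {}    # zset key -> {member: score}

    def hincrby(self, key, field, amount=1):
        bucket = self.hashes.setdefault(key, {})
        bucket[field] = bucket.get(field, 0) + amount
        return bucket[field]

    def hset(self, key, field, value):
        self.hashes.setdefault(key, {})[field] = value

    def zadd(self, key, score, member):
        self.zsets.setdefault(key, {})[member] = score

def record_update(r, obj_id, last_seen):
    # 1. hash key bound to the given object
    hash_key = 'counters:%s' % obj_id
    # 2. increment the counter
    r.hincrby(hash_key, 'times_seen', 1)
    # 3. last-write-wins attributes simply overwrite whatever was there
    r.hset(hash_key, 'last_seen', last_seen)
    # 4. mark the hash key as pending, scored by the current time
    r.zadd('pending', time.time(), hash_key)
    return hash_key
```

A burst of N updates to the same object costs N cheap Redis operations but only one eventual SQL write, which is the whole point of the buffer.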

Now, on every tick (10 seconds in Sentry's case), we dump these buffers and fan out the writes. That looks like the following:

1. Get all of the keys with ZRANGE

2. Fire off a job into RabbitMQ for every pending key

3. ZREM the given keys

The RabbitMQ job can then fetch and clear the hash, consuming the set of pending updates. A few things to note:

  • In the case where we only want to pop off a bounded amount, we use the sorted set's range (for example, asking for the oldest 100 keys).

  • If we end up with multiple jobs processing the same key, the losers simply no-op, since another in-flight process has already fetched and cleared the hash.

  • The system scales out across many Redis nodes simply by placing a 'pending' key on each node.
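The flush-and-fan-out cycle plus the worker's read-and-clear step might look like the following sketch. It again runs against a minimal in-memory stand-in so it needs no live server; `enqueue` and `write_to_sql` are hypothetical callbacks standing in for RabbitMQ and the database.

```python
class MiniRedis:
    """In-memory stand-in for the Redis commands used below (illustrative only)."""
    def __init__(self):
        self.hashes = {}
        self.zsets = {}

    def zrange(self, key, start, stop):
        members = sorted(self.zsets.get(key, {}).items(), key=lambda kv: kv[1])
        stop = len(members) if stop == -1 else stop + 1
        return [m for m, _ in members[start:stop]]

    def zrem(self, key, *members):
        for m in members:
            self.zsets.get(key, {}).pop(m, None)

    def hgetall(self, key):
        return self.hashes.get(key, {})

    def delete(self, key):
        self.hashes.pop(key, None)

def flush_pending(r, enqueue):
    # 1. grab every pending hash key (oldest first)
    keys = r.zrange('pending', 0, -1)
    # 2. fire off a job for each one
    for key in keys:
        enqueue(key)
    # 3. remove them from the pending set
    if keys:
        r.zrem('pending', *keys)

def process_key(r, key, write_to_sql):
    # the worker fetches and clears the hash; a duplicate job racing on
    # the same key finds an empty hash and simply no-ops
    data = r.hgetall(key)
    r.delete(key)
    if not data:
        return False
    write_to_sql(key, data)
    return True
```

The no-op path in process_key is what makes duplicate jobs harmless, as described in the notes above.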

With this model, we can mostly guarantee that only a single row in SQL is being updated at once, which alleviates the lock contention we would otherwise foresee. This strategy works well for Sentry because it deals with bursts of data that eventually fold into the same counters.

Rate Limiting

Out of necessity within Sentry, we have to put an end to continual denial-of-service attacks. We combat this by rate limiting connections, one part of which is backed by Redis. This is undoubtedly one of the more straightforward implementations, and it lives within sentry.quotas.

The logic is fairly straightforward, and looks something like this:

    def incr_and_check_limit(user_id, limit):
        key = '{user_id}:{epoch}'.format(user_id=user_id, epoch=int(time() / 60))

        pipe = redis.pipeline()
        pipe.incr(key)
        pipe.expire(key, 60)
        current_rate, _ = pipe.execute()

        return int(current_rate) > limit

The rate limiting method we have illustrated also relies on one of the most fundamental features Redis shares with caching servers: incrementing empty keys. The same behavior could eventually be achieved against a caching server like this:

    def incr_and_check_limit_memcache(user_id, limit):
        key = '{user_id}:{epoch}'.format(user_id=user_id, epoch=int(time() / 60))

        if cache.add(key, 0, 60):
            return False

        current_rate = cache.incr(key)

        return current_rate > limit

In fact, we ended up adopting this strategy within Sentry to track short-term rates for different events. In cases like this, we usually sort the data to find the most active users over a short window.

Basic Locks

Although Redis is not highly available, our use case for locks makes it a good tool for the job. We no longer use these at Sentry's core, but one example use case is where we want to minimize concurrency and simply no-op an operation if something already appears to be running. This is useful for cron-like tasks that run at intervals but do not need strong coordination.
Using SETNX in Redis is quite simple:

    from contextlib import contextmanager
    from time import sleep

    from redis import Redis

    r = Redis()

    class Locked(Exception):
        pass

    @contextmanager
    def lock(key, nowait=True):
        while not r.setnx(key, '1'):
            if nowait:
                raise Locked('try again soon!')
            sleep(0.01)

        # limit lock time to 10 seconds
        r.expire(key, 10)

        # do something crazy
        yield

        # explicitly unlock
        r.delete(key)

While lock() inside Sentry uses memcached today, there is absolutely no reason we could not switch it to Redis.
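For illustration, here is how that lock behaves, exercised against a minimal stand-in that implements just setnx/expire/delete so no Redis server is needed (real code would pass a redis.Redis() client instead; the release is wrapped in try/finally here as a small safety tweak):

```python
from contextlib import contextmanager
from time import sleep

class Locked(Exception):
    pass

class MiniRedis:
    """In-memory stand-in for setnx/expire/delete (illustrative only)."""
    def __init__(self):
        self.store = {}
    def setnx(self, key, value):
        if key in self.store:
            return False
        self.store[key] = value
        return True
    def expire(self, key, seconds):
        pass  # TTLs are not modeled in this stand-in
    def delete(self, key):
        self.store.pop(key, None)

r = MiniRedis()

@contextmanager
def lock(key, nowait=True):
    while not r.setnx(key, '1'):
        if nowait:
            raise Locked('try again soon!')
        sleep(0.01)
    r.expire(key, 10)   # limit lock time to 10 seconds
    try:
        yield
    finally:
        r.delete(key)   # explicitly unlock

# a second attempt while the lock is held fails fast with nowait=True
contended = False
with lock('cron:cleanup'):
    try:
        with lock('cron:cleanup'):
            pass
    except Locked:
        contended = True
```

The fast failure is the desired behavior for interval tasks: if a previous run is still going, the new one simply bows out.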

Time Series Data

Recently, we created a new mechanism for storing time-series data in Sentry (contained in sentry.tsdb). It is heavily inspired by the RRD model, and Graphite in particular. We wanted a fast and simple way to store short-term (for example, one month) time series that would support very high-throughput writes and, in the extreme case, let us compute short-term rates. Although this is the first such model, we again chose to store the data in Redis. It is also another simple example of using counters.

In the current model, we use a single hash map to store all time-series data. For example, this means that all counts for a given data type and one-second bucket live within the same hash key. It looks like this:

    {
        "<type enum>:<epoch>:<shard number>": {
            "<id>": <count>
        }
    }

So in this scenario, let's say we are tracking the number of events. The event type maps to the enum "1", and the resolution is one second, so our epoch is normalized to the current second. The hash ends up looking like this:

    {
        "1:1399958363:0": {
            "1": 53,
            "2": 72
        }
    }
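A sketch of how a single event increment lands in such a hash. The key layout follows the "<type>:<epoch>:<shard>" scheme above, but the rollup, shard count, and shard function are illustrative assumptions, not Sentry's actual implementation:

```python
EVENT_TYPE = 1     # the "events" enum from the example above
NUM_SHARDS = 64    # illustrative virtual shard count

def make_key(model, timestamp, item_id, rollup=1):
    # normalize the timestamp down to the bucket resolution
    epoch = int(timestamp // rollup * rollup)
    shard = int(item_id) % NUM_SHARDS   # hypothetical shard function
    return '%d:%d:%d' % (model, epoch, shard)

def incr(store, model, item_id, timestamp):
    # one HINCRBY-style bump against the shared hash for this type/epoch/shard
    hash_key = make_key(model, timestamp, item_id)
    bucket = store.setdefault(hash_key, {})
    bucket[str(item_id)] = bucket.get(str(item_id), 0) + 1

store = {}
for _ in range(53):
    incr(store, EVENT_TYPE, 64, 1399958363)   # id 64 lands in shard 0
```

Every id that falls into the same type, epoch, and shard shares one hash key, which is what makes per-key expiry and compression effective.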

A variant model might instead use simple keys and just increment a counter directly in the bucket:

   "1:1399958363:0:1": 53

We chose the hash model for the following two reasons:

  • We can expire the entire key at once (this also has negative side effects, but so far it has been stable).

  • Compressing the keys is hugely important.

Additionally, the shard-number keys allow us to map a virtual shard space onto a fixed number of physical nodes (for example, mapping 64 virtual shards onto 32 physical nodes).
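That virtual-to-physical mapping can be sketched as follows. The 64/32 figures come from the text; the choice of crc32 as the stable hash is an assumption made just for the sketch:

```python
import zlib

NUM_VIRTUAL_SHARDS = 64    # virtual shard space from the text
NUM_PHYSICAL_NODES = 32    # physical Redis nodes

def virtual_shard(hash_key):
    # stable hash of the key -> virtual shard (crc32 is one stable choice)
    return zlib.crc32(hash_key.encode('utf-8')) % NUM_VIRTUAL_SHARDS

def physical_node(hash_key):
    # fixed virtual -> physical assignment; with 64 shards over 32 nodes,
    # each node owns exactly two virtual shards
    return virtual_shard(hash_key) % NUM_PHYSICAL_NODES
```

The benefit of the indirection is that growing the cluster later means relocating whole virtual shards, never re-hashing individual keys.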

Querying the data is now done using Nydus and its map() (which relies on a workspace). The code for this operation is a bit hairy, but thankfully it is not very large:

    def get_range(self, model, keys, start, end, rollup=None):
        """
        To get a range of data for group ID=[1, 2, 3]:

        Start and end are both inclusive.

        >>> now = timezone.now()
        >>> get_keys(tsdb.models.group, [1, 2, 3],
        >>>          start=now - timedelta(days=1),
        >>>          end=now)
        """
        normalize_to_epoch = self.normalize_to_epoch
        normalize_to_rollup = self.normalize_to_rollup
        make_key = self.make_key

        if rollup is None:
            rollup = self.get_optimal_rollup(start, end)

        results = []
        timestamp = end
        with self.conn.map() as conn:
            while timestamp >= start:
                real_epoch = normalize_to_epoch(timestamp, rollup)
                norm_epoch = normalize_to_rollup(timestamp, rollup)

                for key in keys:
                    model_key = self.get_model_key(key)
                    hash_key = make_key(model, norm_epoch, model_key)
                    results.append((real_epoch, key, conn.hget(hash_key, model_key)))

                timestamp = timestamp - timedelta(seconds=rollup)

        results_by_key = defaultdict(dict)
        for epoch, key, count in results:
            results_by_key[key][epoch] = int(count or 0)

        for key, points in results_by_key.iteritems():
            results_by_key[key] = sorted(points.items())
        return dict(results_by_key)

The logic breaks down as follows:

  • Generate all of the required keys.

  • Using the workspace, pull the minimal result set from all connections (Nydus handles this).

  • Given the results, map them back into buckets based on the specified interval and the given keys.

Simple Choices

I am a fan of simple solutions to problems, and using Redis for this category of work definitely fits. Its documentation is amazing, and the barrier to entry is extremely low. It does come with trade-offs (primarily if you use it for persistence), but they work very well and are fairly intuitive.

What problems does Redis solve for you?
