Scaling SQL with Redis

Source: Internet
Author: User
Tags: timedelta

I love Redis. It's one of those technologies that makes you wonder why it took so long for someone to build it: predictable, high-performance, and adaptable, which is why I've been using it more and more over the past few years. It's no secret that Sentry runs primarily on PostgreSQL (though it still relies on a number of other technologies).

A little over a week ago I gave a keynote at Python Nordeste. The format only allowed for a fairly quick summary, so I decided to get a bit hacky and talk about some of the bigger uses of Redis inside Sentry. This article is an extension of that five-minute discussion.

Alleviating Row Contention

Early in Sentry's development we adopted what is now known as sentry.buffers: a simple system that lets us implement very efficient buffered counters with a simple last-write-wins strategy. Importantly, it eliminates nearly any form of durability (which is a very acceptable trade-off for how Sentry works).

The operation is quite simple. Every time an update comes in, we do the following:

1. Create a hash key bound to the given entity

2. Increment a 'counter' field using HINCRBY

3. HSET the various last-write-wins data (for example, "last seen")

4. ZADD the hash key into a 'pending' set with the current timestamp
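The steps above can be sketched as runnable code. A minimal in-memory stand-in covers the three Redis commands involved, so the example runs without a server; with redis-py you would call the same methods on a real client. The key names and fields (`buffer:counter:...`, `buffer:pending`, `count`) are illustrative, not Sentry's actual scheme.

```python
from time import time

# Minimal in-memory stand-in for the Redis commands used in this sketch.
class FakeRedis:
    def __init__(self):
        self.hashes = {}  # hash key -> {field: value}
        self.zsets = {}   # zset key -> {member: score}

    def hincrby(self, key, field, amount=1):
        h = self.hashes.setdefault(key, {})
        h[field] = int(h.get(field, 0)) + amount
        return h[field]

    def hset(self, key, field, value):
        self.hashes.setdefault(key, {})[field] = value

    def zadd(self, key, member, score):
        self.zsets.setdefault(key, {})[member] = score

r = FakeRedis()

def record_update(entity_id, last_seen=None):
    # step 1: a hash key bound to the given entity
    hash_key = 'buffer:counter:{}'.format(entity_id)
    # step 2: increment the counter with HINCRBY
    r.hincrby(hash_key, 'count', 1)
    # step 3: HSET last-write-wins data such as "last seen"
    r.hset(hash_key, 'last_seen', last_seen or time())
    # step 4: ZADD the hash key into the 'pending' set with the current time
    r.zadd('buffer:pending', hash_key, time())
    return hash_key
```

Note that buffering two updates for the same entity leaves a single pending entry (ZADD just refreshes the member's score), which is exactly what makes the later fan-out cheap.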

Now, on every tick (in Sentry's case, every 10 seconds), we dump these buffers and fan out the writes. That looks like the following:

1. Get all of the keys using ZRANGE

2. Fire off a job into RabbitMQ for every pending key

3. ZREM the given keys
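The flush tick can be sketched the same way: an in-memory stand-in covers the ZADD/ZRANGE/ZREM commands, and the RabbitMQ publish is replaced with a plain list. Key names remain illustrative.

```python
# In-memory stand-in for the sorted-set commands used by the flush tick.
class FakeRedis:
    def __init__(self):
        self.zsets = {}

    def zadd(self, key, member, score):
        self.zsets.setdefault(key, {})[member] = score

    def zrange(self, key, start, stop):
        zset = self.zsets.get(key, {})
        members = sorted(zset, key=zset.get)  # ordered by score
        stop = len(members) if stop == -1 else stop + 1
        return members[start:stop]

    def zrem(self, key, *members):
        for member in members:
            self.zsets.get(key, {}).pop(member, None)

queued_jobs = []  # stand-in for publishing a job to RabbitMQ

def flush_pending(r):
    # step 1: fetch every pending hash key
    pending = r.zrange('buffer:pending', 0, -1)
    # step 2: fire off one job per pending key
    for hash_key in pending:
        queued_jobs.append(hash_key)
    # step 3: remove the keys we just dispatched
    if pending:
        r.zrem('buffer:pending', *pending)
    return pending
```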

The RabbitMQ job can then read and clear the hash, popping off the set of pending updates. A few things to note:

  • If we ever want to pop off only a fixed amount, the sorted set makes that easy (for example, taking only the 100 oldest keys).

  • If we end up with multiple jobs processing the same key, one of them will simply no-op, because another process has already handled and cleared the hash.

  • With this model we can guarantee that "in most cases" only one row in SQL is being updated at once, which alleviates the lock contention we could otherwise foresee. This strategy works well for Sentry because it deals with sudden bursts of data that all end up merged into the same counters.
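The worker-side behavior described in these notes can be sketched as follows: take everything buffered in the hash, clear it, and no-op if another worker got there first. This again uses an in-memory stand-in, and the function names are illustrative; with redis-py you would typically wrap the HGETALL and DEL in a pipeline.

```python
# In-memory stand-in for the hash commands used by the worker job.
class FakeRedis:
    def __init__(self):
        self.hashes = {}

    def hgetall(self, key):
        return dict(self.hashes.get(key, {}))

    def delete(self, key):
        self.hashes.pop(key, None)

def process_pending_key(r, hash_key, apply_update):
    values = r.hgetall(hash_key)
    r.delete(hash_key)
    if not values:
        # another job already read and cleared this hash: no-op
        return False
    # one merged SQL UPDATE instead of many small contended ones
    apply_update(hash_key, values)
    return True
```

Running the job twice for the same key shows the no-op behavior: the first call applies the merged update, the second finds an empty hash and returns without touching SQL.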

Rate Limiting

The logic is quite straightforward, as shown below:

def incr_and_check_limit(user_id, limit):
    key = '{user_id}:{epoch}'.format(user_id=user_id, epoch=int(time() / 60))

    pipe = redis.pipeline()
    pipe.incr(key)
    pipe.expire(key, 60)
    current_rate, _ = pipe.execute()

    return int(current_rate) > limit

The rate limiter illustrated here relies on one of the most fundamental building blocks a cache service provides: the ability to add an empty key. The same behavior could just as easily be achieved in a cache service like Memcached:

def incr_and_check_limit_memcache(user_id, limit):
    key = '{user_id}:{epoch}'.format(user_id=user_id, epoch=int(time() / 60))

    if cache.add(key, 0, 60):
        return False

    current_rate = cache.incr(key)

    return current_rate > limit

We ended up adopting this strategy to let Sentry track short-term data across different events. In these cases we typically sort the data to surface the most active users within a short window.

Basic Locks

While Redis isn't highly available, our use case for locks makes it a good tool for the job. We no longer use these in Sentry's core, but one example use case is where we want to minimize concurrency and simply no-op an operation if something appears to already be running. This is useful for cron-like tasks that need to fire at intervals but don't have strong coordination.

Using SETNX in Redis, this is fairly straightforward:

from contextlib import contextmanager

r = Redis()

@contextmanager
def lock(key, nowait=True):
    while not r.setnx(key, '1'):
        if nowait:
            raise Locked('try again soon!')
        sleep(0.01)

    # limit lock time to 10 seconds
    r.expire(key, 10)

    # do something crazy
    yield

    # explicitly unlock
    r.delete(key)

While the lock() inside Sentry uses Memcached, there's absolutely no reason we couldn't switch it to Redis.
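To make the pattern concrete, here is a runnable sketch of the same SETNX-based lock with an in-memory stand-in for the client, plus a usage example showing a second caller no-oping. The Locked exception and the key name are illustrative, and the try/finally is a small hardening over the minimal version (it unlocks even if the body raises).

```python
from contextlib import contextmanager

class Locked(Exception):
    pass

# In-memory stand-in for the three Redis commands the lock uses.
class FakeRedis:
    def __init__(self):
        self.store = {}

    def setnx(self, key, value):
        if key in self.store:
            return False
        self.store[key] = value
        return True

    def expire(self, key, seconds):
        pass  # TTL bookkeeping elided in this sketch

    def delete(self, key):
        self.store.pop(key, None)

r = FakeRedis()

@contextmanager
def lock(key):
    if not r.setnx(key, '1'):
        # refuse immediately instead of spinning (the nowait behavior)
        raise Locked('try again soon!')
    r.expire(key, 10)  # limit lock time to 10 seconds
    try:
        yield
    finally:
        r.delete(key)  # unlock even if the body raised

# A cron-like task that no-ops when another run already holds the lock:
contended = False
with lock('cron:cleanup'):
    try:
        with lock('cron:cleanup'):
            pass
    except Locked:
        contended = True
```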

    Time Series Data

Recently we created a new mechanism for storing time-series data in Sentry (housed in sentry.tsdb). It was inspired by the RRD model, specifically Graphite. We wanted a fast, simple way to store short-term (for example, one month of) time-series counts, to support both high-speed writes and, in the extreme case, computing short-term rates. Although this is the first model, we again chose to store the data in Redis, and it's also a simple example of using counters.

The schema looks as follows:

{
    "<type enum>:<epoch>:<shard number>": {
        "<id>": <count>
    }
}

So in this case, suppose we are tracking the number of events. The event type maps to the enum "1", and the rollup interval is 1 second, so our timestamps are normalized to the second. The hash ends up looking like this:

{
    "1:1399958363:0": {
        "1": 53,
        "2": 72,
    }
}

An alternative model might use simple keys and just increment counters directly in the key itself:

"1:1399958363:0:1": 53
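The write path implied by this schema can be sketched as follows: normalize the timestamp to the rollup, derive a shard from the id, and HINCRBY the counter field inside the bucketed hash. The shard function and constants here are assumptions for illustration, not Sentry's actual implementation.

```python
NUM_SHARDS = 64  # illustrative virtual shard count

def make_key(model, epoch, obj_id, rollup=1):
    norm_epoch = int(epoch) - (int(epoch) % rollup)  # bucket the timestamp
    shard = int(obj_id) % NUM_SHARDS                 # assumed shard derivation
    return '{}:{}:{}'.format(model, norm_epoch, shard)

# In-memory stand-in for HINCRBY so the sketch runs without a server.
class FakeRedis:
    def __init__(self):
        self.hashes = {}

    def hincrby(self, key, field, amount=1):
        h = self.hashes.setdefault(key, {})
        h[field] = int(h.get(field, 0)) + amount
        return h[field]

r = FakeRedis()

def incr_event(model, obj_id, epoch, rollup=1):
    key = make_key(model, epoch, obj_id, rollup)
    return r.hincrby(key, str(obj_id), 1)
```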

We chose the hash model for the following reasons:

  • We can expire all of the keys at once (this may have negative side effects, but so far it has been stable)

  • Heavy compression of the keys, which matters a great deal

  • Additionally, the discrete shard keys allow us to map the virtual shard space onto a fixed number of physical nodes (for example, 64 virtual shards mapping onto 32 physical nodes)
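A hedged sketch of that virtual-to-physical mapping, assuming the 64-to-32 figures above: a fixed virtual shard space means rebalancing only requires reassigning shards to nodes, never rehashing the keys themselves. The crc32-based hash and the modulo assignment are illustrative choices, not Sentry's actual routing.

```python
import zlib

VIRTUAL_SHARDS = 64   # fixed virtual shard space
PHYSICAL_NODES = 32   # current physical node count

def virtual_shard(key):
    # deterministic hash of the key into the virtual shard space
    return zlib.crc32(key.encode('utf-8')) % VIRTUAL_SHARDS

def physical_node(shard):
    # simple fixed assignment: two virtual shards per physical node
    return shard % PHYSICAL_NODES

def node_for_key(key):
    return physical_node(virtual_shard(key))
```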

Querying the data is now done using Nydus and its map() (which batches the underlying commands). The code for this operation is quite robust, and fortunately it is not huge:

def get_range(self, model, keys, start, end, rollup=None):
    """
    To get a range of data for group ID=[1, 2, 3]:

    Both ``start`` and ``end`` are inclusive.

    >>> now = timezone.now()
    >>> get_keys(tsdb.models.group, [1, 2, 3],
    >>>          start=now - timedelta(days=1),
    >>>          end=now)
    """
    normalize_to_epoch = self.normalize_to_epoch
    normalize_to_rollup = self.normalize_to_rollup
    make_key = self.make_key

    if rollup is None:
        rollup = self.get_optimal_rollup(start, end)

    results = []
    timestamp = end
    with self.conn.map() as conn:
        while timestamp >= start:
            real_epoch = normalize_to_epoch(timestamp, rollup)
            norm_epoch = normalize_to_rollup(timestamp, rollup)

            for key in keys:
                model_key = self.get_model_key(key)
                hash_key = make_key(model, norm_epoch, model_key)
                results.append((real_epoch, key, conn.hget(hash_key, model_key)))

            timestamp = timestamp - timedelta(seconds=rollup)

    results_by_key = defaultdict(dict)
    for epoch, key, count in results:
        results_by_key[key][epoch] = int(count or 0)

    for key, points in results_by_key.iteritems():
        results_by_key[key] = sorted(points.items())

    return dict(results_by_key)

This breaks down into:

  • Generate all of the required keys.

  • Using the map()'d connection, fetch all of the results, bucketing them by the given interval and key.

It is a deliberately simple approach.
