Using Redis bitmaps for fast, simple, real-time statistics

Original article: Fast, easy, realtime metrics using Redis bitmaps

http://blog.getspool.com/2011/11/29/fast-easy-realtime-metrics-using-redis-bitmaps

At getspool.com, important metrics are calculated in real time. Redis bitmaps let us compute such statistics in real time while using very little space. In a simulation with 128 million users, a typical metric such as daily unique users takes less than 50 ms to compute on a MacBook Pro and occupies only 16 MB of memory (128 million bits at one bit per user). Spool does not have 128 million users yet, but our approach can cope with that scale. We want to share how it works, in the hope that it helps other startups.

Crash course on bitmaps and Redis bitmaps

Bitmap (bitset)

A bitmap (also called a bitset) is a contiguous array of bits, each either 0 or 1. The position of a bit in the array is called its offset. Bitmaps support AND, OR, XOR, and other bitwise operations.
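As a small illustration of these operations, here is a sketch using the JDK's java.util.BitSet. The class name BitmapOpsDemo and its helper methods are made up for this example and are not part of the original article.

```java
import java.util.BitSet;

class BitmapOpsDemo {
    // Builds a bitmap with the given offsets set to 1.
    static BitSet bitmapOf(int... offsets) {
        BitSet bits = new BitSet();
        for (int offset : offsets) {
            bits.set(offset);
        }
        return bits;
    }

    // BitSet operations mutate the receiver, so each helper works on a copy.
    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    static BitSet or(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result;
    }

    static BitSet xor(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.xor(b);
        return result;
    }

    public static void main(String[] args) {
        BitSet a = bitmapOf(0, 2, 3);
        BitSet b = bitmapOf(2, 3, 5);
        System.out.println(and(a, b)); // {2, 3}
        System.out.println(or(a, b));  // {0, 2, 3, 5}
        System.out.println(xor(a, b)); // {0, 5}
    }
}
```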

Population count

The population count of a bitmap is the number of bits set to 1. Population counts are very efficient to compute. For example, counting a bitmap that contains 1 billion bits, 90% of them set to 1, takes 21.1 ms on a MacBook Pro. SSE4 even provides a hardware instruction for the population count of an integer.
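A rough sketch of a population count over 64-bit words follows. Long.bitCount is a standard JDK method; whether a given JVM compiles it down to a hardware instruction depends on the JVM and the CPU, so treat that as an assumption. The class PopCountDemo and the naive reference counter are illustrative only.

```java
class PopCountDemo {
    // Sums the number of 1 bits in each 64-bit word.
    static long populationCount(long[] words) {
        long total = 0;
        for (long word : words) {
            total += Long.bitCount(word);
        }
        return total;
    }

    // Bit-by-bit reference implementation, useful for checking the fast path.
    static long naiveCount(long[] words) {
        long total = 0;
        for (long word : words) {
            long w = word;
            while (w != 0) {
                total += w & 1L;
                w >>>= 1; // unsigned shift so negative words terminate
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long[] words = {0b1011L, -1L, 0L}; // 3 bits, 64 bits, 0 bits
        System.out.println(populationCount(words)); // 67
        System.out.println(naiveCount(words));      // 67
    }
}
```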

Redis bitmaps

Redis allows binary keys and binary values. A bitmap is simply a binary value. The Redis operation setbit(key, offset, value) sets the bit at the given offset of the value stored at the given key to 1 or 0, in O(1) time.
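To make the semantics concrete, here is an in-memory sketch of what SETBIT does to the stored bytes. It is not Redis code: the class RedisStyleBitmap is hypothetical, and it only mirrors two documented behaviours, namely that offset 0 is the most significant bit of the first byte and that the command returns the previous bit value. The array copy on growth is an artifact of this sketch.

```java
import java.util.Arrays;

class RedisStyleBitmap {
    private byte[] bytes = new byte[0];

    // Sets the bit at `offset` and returns the bit that was there before.
    int setBit(int offset, boolean value) {
        int byteIndex = offset >> 3;
        if (byteIndex >= bytes.length) {
            bytes = Arrays.copyOf(bytes, byteIndex + 1); // grow, padding with zero bytes
        }
        int mask = 1 << (7 - (offset & 7)); // offset 0 maps to the most significant bit
        int previous = (bytes[byteIndex] & mask) != 0 ? 1 : 0;
        if (value) {
            bytes[byteIndex] |= mask;
        } else {
            bytes[byteIndex] &= ~mask;
        }
        return previous;
    }

    // Reads the bit at `offset`; offsets past the end read as 0.
    int getBit(int offset) {
        int byteIndex = offset >> 3;
        if (byteIndex >= bytes.length) {
            return 0;
        }
        return (bytes[byteIndex] >> (7 - (offset & 7))) & 1;
    }

    byte[] toBytes() {
        return bytes.clone();
    }

    public static void main(String[] args) {
        RedisStyleBitmap bitmap = new RedisStyleBitmap();
        System.out.println(bitmap.setBit(7, true));             // 0
        System.out.println(Arrays.toString(bitmap.toBytes()));  // [1]
        System.out.println(bitmap.setBit(7, true));             // 1
    }
}
```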

A simple example: daily active users

To count the unique users who logged in today, we create a bitmap in which each bit represents one user, with the user ID as the offset. When a user visits a page or performs an action, that user's bit is set to 1. The Redis key for the bitmap is derived from the name of the action and a timestamp.

In this simple example, every time a user logs in we execute redis.setbit(daily_active_users, user_id, 1). This sets the corresponding bit of the bitmap to 1 in O(1) time. A population count over the bitmap shows that nine users logged in today. The bitmap's key is daily_active_users and its value is 1011110100100101.
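The figure of nine users can be checked with a short in-memory sketch. It assumes the leftmost character of the example value is offset 0 (user ID 0), which the text does not state explicitly, and it uses java.util.BitSet as a stand-in for the Redis bitmap. The class and method names are hypothetical.

```java
import java.util.BitSet;

class DailyActiveUsersDemo {
    // In-memory stand-in for redis.setbit(daily_active_users, userId, 1).
    static void markActive(BitSet dailyActiveUsers, int userId) {
        dailyActiveUsers.set(userId);
    }

    // Reads a string of '0'/'1' characters, treating index i as user ID i.
    static BitSet fromBitString(String bits) {
        BitSet users = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                markActive(users, i);
            }
        }
        return users;
    }

    public static void main(String[] args) {
        BitSet dau = fromBitString("1011110100100101");
        System.out.println(dau.cardinality()); // 9
        System.out.println(dau);               // {0, 2, 3, 4, 5, 7, 10, 13, 15}
    }
}
```

Marking the same user twice leaves the count unchanged, which is what makes the bitmap a count of unique users rather than of logins.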

Because the set of daily active users changes every day, a new bitmap is needed each day. We do this by simply appending the date to the bitmap key. For example, to count how many unique users listened to at least one song in a music app on a given day, the Redis key of the bitmap can be play:yyyy-mm-dd. When a user listens to a song, we set that user's bit to 1 in the bitmap, which takes O(1) time.

redis.setbit(play:yyyy-mm-dd, user_id, 1)
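One possible way to build such date-suffixed keys is sketched below using java.time. The class MetricKeys and both helper methods are assumptions made for illustration; only the action:yyyy-mm-dd key layout comes from the article.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

class MetricKeys {
    // Builds a key such as "play:2011-11-01" for one action and one day.
    static String dailyKey(String action, LocalDate day) {
        return action + ":" + day.format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    // Builds the keys for `days` consecutive days starting at `firstDay`.
    static List<String> dailyKeys(String action, LocalDate firstDay, int days) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < days; i++) {
            keys.add(dailyKey(action, firstDay.plusDays(i)));
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(dailyKey("play", LocalDate.of(2011, 11, 1)));
        // play:2011-11-01
        System.out.println(dailyKeys("play", LocalDate.of(2011, 11, 29), 3));
        // [play:2011-11-29, play:2011-11-30, play:2011-12-01]
    }
}
```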

The number of unique users who listened to a song today is the population count of the bitmap stored under the key play:yyyy-mm-dd. To compute weekly or monthly figures, take the union of all the daily bitmaps for that week or month to obtain a new bitmap, then compute its population count.

Computing more complex statistics from these bitmaps is also easy. For example, the premium users who listened to a song in November are:

(play:2011-11-01 ∪ play:2011-11-02 ∪ ... ∪ play:2011-11-30) ∩ premium:2011-11
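The expression above can be sketched in Java with java.util.BitSet, assuming the daily bitmaps and the premium bitmap have already been loaded into memory. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

class PremiumListeners {
    // Unions the daily play bitmaps, then intersects the result with the premium bitmap.
    static int countPremiumListeners(List<BitSet> dailyPlays, BitSet premium) {
        BitSet listeners = new BitSet();
        for (BitSet day : dailyPlays) {
            listeners.or(day);   // union: played on at least one day
        }
        listeners.and(premium);  // intersection: keep premium users only
        return listeners.cardinality();
    }

    public static void main(String[] args) {
        BitSet day1 = new BitSet();
        day1.set(1);
        day1.set(2);
        BitSet day2 = new BitSet();
        day2.set(2);
        day2.set(3);
        BitSet premium = new BitSet();
        premium.set(2);
        premium.set(3);
        premium.set(9);
        // Union is {1, 2, 3}; intersected with {2, 3, 9} leaves {2, 3}.
        System.out.println(countPremiumListeners(Arrays.asList(day1, day2), premium)); // 2
    }
}
```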

Performance comparison using 128 million users

The following table compares the time taken to compute unique users over one day, one week, and one month for 128 million users.

Period     Time (ms)
Daily      50.2
Weekly     392.0
Monthly    1624.8

Optimizations

In the example above, the weekly and monthly computations can be sped up by caching the computed daily, weekly, and monthly statistics in Redis.

This is a very flexible approach. An added bonus of caching is that it enables further analyses, such as weekly active mobile users, computed as the intersection of the mobile-users bitmap with the weekly-active-users bitmap. Or, to count the unique users active over the past n days, the cached daily bitmaps make it easy: fetch the bitmaps for the previous n-1 days from the cache and union them with today's real-time bitmap, which takes only 50 ms.
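A minimal sketch of the rolling n-day computation is shown below. It assumes the cache is a plain in-memory map from day to bitmap; in the setup described here the cached bitmaps would live in Redis instead. All names are hypothetical.

```java
import java.time.LocalDate;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

class RollingUniques {
    // Unions today's live bitmap with the cached bitmaps of the previous n-1 days.
    static int rollingUniqueCount(Map<LocalDate, BitSet> cachedDays,
                                  BitSet todayLive, LocalDate today, int n) {
        BitSet all = (BitSet) todayLive.clone(); // copy so the live bitmap is not modified
        for (int back = 1; back < n; back++) {
            BitSet day = cachedDays.get(today.minusDays(back));
            if (day != null) {                   // a missing day contributes no users
                all.or(day);
            }
        }
        return all.cardinality();
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2011, 11, 30);
        Map<LocalDate, BitSet> cache = new HashMap<>();
        BitSet yesterday = new BitSet();
        yesterday.set(1);
        yesterday.set(2);
        cache.put(today.minusDays(1), yesterday);
        BitSet live = new BitSet();
        live.set(2);
        live.set(4);
        System.out.println(rollingUniqueCount(cache, live, today, 2)); // 3
    }
}
```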

Sample code

The following Java code counts the unique users who performed a given action on a given date.

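import redis.clients.jedis.Jedis;
import java.util.BitSet;
...
    Jedis redis = new Jedis("localhost");
    ...
    // Counts the unique users who performed `action` on `date`.
    public int uniqueCount(String action, String date) {
        String key = action + ":" + date;
        byte[] bytes = redis.get(key.getBytes());
        // A missing key means nobody performed the action that day.
        if (bytes == null) {
            return 0;
        }
        BitSet users = BitSet.valueOf(bytes);
        return users.cardinality();
    }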

The following Java code counts the unique users who performed a given action on any of a list of dates.

import redis.clients.jedis.Jedis;
import java.util.BitSet;
...
    Jedis redis = new Jedis("localhost");
    ...
    // Counts the unique users who performed `action` on at least one of `dates`.
    public int uniqueCount(String action, String... dates) {
        BitSet all = new BitSet();
        for (String date : dates) {
            String key = action + ":" + date;
            byte[] bytes = redis.get(key.getBytes());
            if (bytes == null) {
                continue; // no bitmap stored for this date
            }
            all.or(BitSet.valueOf(bytes));
        }
        return all.cardinality();
    }
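One detail worth keeping in mind when reading Redis bitmaps into java.util.BitSet: BitSet.valueOf(byte[]) treats the least significant bit of each byte as the lowest index, whereas Redis numbers offsets from the most significant bit. Counts and unions are unaffected, but the indices in the resulting BitSet are not user IDs. If the IDs themselves are needed, a decoder along the following lines could be used. This is a sketch; the class RedisBitmapDecoder is hypothetical.

```java
import java.util.BitSet;

class RedisBitmapDecoder {
    // Converts raw Redis bitmap bytes into a BitSet whose indices equal the Redis offsets.
    static BitSet fromRedisBytes(byte[] bytes) {
        BitSet users = new BitSet(bytes.length * 8);
        for (int offset = 0; offset < bytes.length * 8; offset++) {
            int mask = 0x80 >> (offset & 7); // offset 0 is the most significant bit
            if ((bytes[offset >> 3] & mask) != 0) {
                users.set(offset);
            }
        }
        return users;
    }

    public static void main(String[] args) {
        byte[] raw = {0x01}; // what Redis stores after SETBIT key 7 1
        System.out.println(fromRedisBytes(raw));  // {7}
        System.out.println(BitSet.valueOf(raw));  // {0}
    }
}
```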

References:

[1] Redis SETBIT command: http://redis.io/commands/setbit

When reposting this article, please credit the author and the source, [Gary's influence] http://garyelephant.me, and do not use it for any commercial purpose.

Author: Gary Gao (garygaowork [at] gmail.com), who focuses on the internet, distributed systems, high performance, NoSQL, automation, and software teams.
