Using Redis setbit and bitmap to count the number of users

Source: Internet
Author: User
Tags: bitset


Our company's statistics system received a requirement: count the total number of users who performed a given action within a time period, where the length of the period is variable. The user base is huge and the statistics are computed in real time, so both data storage and computation need an efficient solution. Below is an article from the Internet on using Redis bitmaps for this.

At getspool.com, important statistics are calculated in real time. Redis bitmaps let us compute this kind of statistic in real time while being extremely space-efficient. In a simulation with 128 million users, a typical statistic such as "daily unique users" takes less than 50 ms on a MacBook Pro and uses only 16 MB of memory. Spool does not yet have 128 million users, but our solution can handle that scale. We want to share how this is done, in the hope that it helps other startups.

Crash Course on Bitmaps and Redis Bitmaps

Bitmap (i.e. Bitset)

A bitmap is a contiguous sequence of binary digits (0 or 1), each addressed by its position, called the offset. Bitwise operations such as AND, OR, and XOR can be performed on bitmaps.
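As an illustration of offsets and bitwise operations, here is a minimal sketch using the JDK's java.util.BitSet. The class name BitmapBasics and its helper methods are hypothetical names chosen for this example; they are not part of the original article.

```java
import java.util.BitSet;

final class BitmapBasics {
    /** Builds a bitmap with the bits at the given offsets set to 1. */
    static BitSet of(int... offsets) {
        BitSet bits = new BitSet();
        for (int offset : offsets) {
            bits.set(offset);
        }
        return bits;
    }

    /** Returns a AND b without modifying either input. */
    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    /** Returns a OR b without modifying either input. */
    static BitSet or(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result;
    }

    /** Returns a XOR b without modifying either input. */
    static BitSet xor(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.xor(b);
        return result;
    }

    public static void main(String[] args) {
        BitSet a = of(0, 2, 3);
        BitSet b = of(2, 3, 5);
        System.out.println("AND = " + and(a, b)); // {2, 3}
        System.out.println("OR  = " + or(a, b));  // {0, 2, 3, 5}
        System.out.println("XOR = " + xor(a, b)); // {0, 5}
    }
}
```

BitSet prints the offsets of the bits that are set, which makes the result of each operation easy to read.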


Bitmap count (Population count)

The population count of a bitmap is the number of bits set to 1. Population counts are very efficient to compute: for example, counting a bitmap of 1 billion bits with 90% of the bits set to 1 takes 21.1 ms on a MacBook Pro. SSE4 even includes a hardware instruction for computing the population count of an integer.
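A population count can be sketched in Java by summing Long.bitCount over the 64-bit words of a bitmap. The class name PopulationCount and the sample words are assumptions made for this example. Whether a given JVM maps Long.bitCount to a hardware instruction depends on the JVM and CPU.

```java
import java.util.BitSet;

final class PopulationCount {
    /** Counts the 1-bits in a bitmap stored as an array of 64-bit words. */
    static long count(long[] words) {
        long total = 0;
        for (long word : words) {
            total += Long.bitCount(word); // number of 1-bits in one word
        }
        return total;
    }

    public static void main(String[] args) {
        // 0b1011 has three 1-bits, -1L has all 64 bits set, 0L has none.
        long[] words = {0b1011L, -1L, 0L};
        System.out.println(count(words));                        // 67
        // BitSet.cardinality() computes the same quantity.
        System.out.println(BitSet.valueOf(words).cardinality()); // 67
    }
}
```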


Redis Bitmaps

Redis allows binary keys and binary values. A bitmap is simply a binary value. The Redis command SETBIT(key, offset, value) sets the bit at the given offset of the value stored at key to 1 or 0, with O(1) time complexity.
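To make the offset semantics concrete, the following is a small in-memory model of a Redis-style bitmap. It is only a sketch and not how Redis itself is implemented. It follows the documented behavior that offset 0 is the most significant bit of the first byte, that the value grows with zero padding when needed, and that SETBIT returns the previous bit. The class RedisStyleBitmap and its method names are hypothetical.

```java
import java.util.Arrays;

final class RedisStyleBitmap {
    private byte[] bytes = new byte[0];

    /** Sets the bit at offset and returns its previous value (0 or 1). */
    int setBit(int offset, boolean value) {
        int index = offset >>> 3;                    // which byte holds the bit
        if (index >= bytes.length) {
            bytes = Arrays.copyOf(bytes, index + 1); // grow, padded with zeros
        }
        int mask = 0x80 >>> (offset & 7);            // offset 0 is the top bit
        int previous = (bytes[index] & mask) != 0 ? 1 : 0;
        if (value) {
            bytes[index] |= mask;
        } else {
            bytes[index] &= ~mask;
        }
        return previous;
    }

    /** Returns the bit at offset; offsets past the end read as 0. */
    int getBit(int offset) {
        int index = offset >>> 3;
        if (index >= bytes.length) {
            return 0;
        }
        return (bytes[index] & (0x80 >>> (offset & 7))) != 0 ? 1 : 0;
    }

    /** Number of bits set to 1 in the whole bitmap. */
    int bitCount() {
        int total = 0;
        for (byte b : bytes) {
            total += Integer.bitCount(b & 0xFF);
        }
        return total;
    }

    /** Returns a copy of the underlying bytes. */
    byte[] toBytes() {
        return bytes.clone();
    }

    public static void main(String[] args) {
        RedisStyleBitmap bitmap = new RedisStyleBitmap();
        System.out.println(bitmap.setBit(7, true)); // 0, the bit was clear
        System.out.println(bitmap.setBit(7, true)); // 1, the bit was already set
        System.out.println(bitmap.toBytes()[0]);    // 1, i.e. the byte 0x01
        System.out.println(bitmap.bitCount());      // 1
    }
}
```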

A simple example: daily active users

To count the number of unique users who logged in today, we set up a bitmap in which each user is identified by an offset, namely the user ID. When a user visits a page or performs an action, the bit at that user's offset is set to 1. The Redis key of the bitmap is derived from the type of action performed and a timestamp.
In this simple example, redis.setbit(daily_active_users, user_id, 1) is executed each time a user logs in. This sets the corresponding bit in the bitmap to 1 in O(1) time. A population count on the bitmap shows that 9 users logged in today: the bitmap's key is daily_active_users and its value is 1011110100100101.
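The count of 9 can be checked with a short sketch. It assumes that the leftmost character of the value corresponds to user ID 0, which matches how Redis numbers bit offsets. The class name DailyActiveUsers is hypothetical.

```java
import java.util.BitSet;

final class DailyActiveUsers {
    /** Parses a string of '0'/'1' characters; character i is the bit for user ID i. */
    static BitSet parse(String bits) {
        BitSet set = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                set.set(i);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        BitSet dailyActiveUsers = parse("1011110100100101");
        System.out.println(dailyActiveUsers.cardinality()); // 9
        System.out.println(dailyActiveUsers); // {0, 2, 3, 4, 5, 7, 10, 13, 15}
    }
}
```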

Because the set of daily active users changes every day, a new bitmap is needed each day. We do this by simply appending the date to the bitmap key. For example, to count how many users played at least one song in a music app on a given day, the Redis key can be play:yyyy-mm-dd (a key of the form play:yyyy-mm-dd-hh would give hourly granularity instead). When a user plays a song, we set that user's bit in the bitmap to 1, which is O(1).

    redis.setbit(play:yyyy-mm-dd, user_id, 1)
The number of unique users who played a song today is the population count of the bitmap stored at key play:yyyy-mm-dd. To count by week or month, compute the union of all the daily bitmaps in that week or month and take the population count of the resulting bitmap.
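The per-day keys and the weekly union might be sketched as follows. A plain Map stands in for Redis here so that the example runs on its own. The class WeeklyUniques, its methods, and the sample data are all hypothetical.

```java
import java.time.LocalDate;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

final class WeeklyUniques {
    /** Builds a daily key such as "play:2011-11-01". */
    static String dailyKey(String action, LocalDate date) {
        return action + ":" + date; // LocalDate prints as yyyy-MM-dd
    }

    /** Counts unique users over `days` consecutive days starting at `start`. */
    static int uniqueUsers(Map<String, BitSet> store, String action,
                           LocalDate start, int days) {
        BitSet union = new BitSet();
        for (int i = 0; i < days; i++) {
            BitSet daily = store.get(dailyKey(action, start.plusDays(i)));
            if (daily != null) { // a day with no activity has no bitmap
                union.or(daily);
            }
        }
        return union.cardinality();
    }

    public static void main(String[] args) {
        Map<String, BitSet> store = new HashMap<>(); // stand-in for Redis
        BitSet day1 = new BitSet();
        day1.set(1);
        day1.set(2);
        BitSet day2 = new BitSet();
        day2.set(2);
        day2.set(3);
        store.put("play:2011-11-01", day1);
        store.put("play:2011-11-02", day2);

        LocalDate start = LocalDate.of(2011, 11, 1);
        System.out.println(uniqueUsers(store, "play", start, 1)); // 2
        System.out.println(uniqueUsers(store, "play", start, 7)); // 3
    }
}
```

User 2 is active on both days but is counted once, which is the point of taking the union before counting.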



More complex statistics are just as easy to build from these bitmaps. For example, the premium users who played a song in November are:
(play:2011-11-01 ∪ play:2011-11-02 ∪ ... ∪ play:2011-11-30) ∩ premium:2011-11
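The set expression above translates directly into a union followed by an intersection. The sketch below works on in-memory BitSet values; the class PremiumListeners and the sample user IDs are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

final class PremiumListeners {
    /** Counts users who appear in any daily bitmap and also in the premium bitmap. */
    static int count(List<BitSet> dailyPlays, BitSet premium) {
        BitSet union = new BitSet();
        for (BitSet day : dailyPlays) {
            union.or(day);  // play:day1 ∪ play:day2 ∪ ...
        }
        union.and(premium); // ... ∩ premium
        return union.cardinality();
    }

    static BitSet bits(int... offsets) {
        BitSet set = new BitSet();
        for (int offset : offsets) {
            set.set(offset);
        }
        return set;
    }

    public static void main(String[] args) {
        List<BitSet> days = Arrays.asList(bits(1, 2), bits(2, 3), bits(7));
        BitSet premium = bits(2, 7, 8);
        // Users 2 and 7 both played a song and are premium; user 8 never played.
        System.out.println(count(days, premium)); // 2
    }
}
```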


Performance comparison using 128 million users

The following table compares the time taken to compute unique user statistics over 128 million users for one day, one week, and one month.

Period     Time (ms)
Daily      50.2
Weekly     392.0
Monthly    1624.8

Optimizations

In the previous example, the weekly and monthly computations can be sped up by caching the computed daily, weekly, and monthly results in Redis.
This is a very flexible approach. An added bonus of caching is that it enables further statistics, such as weekly active mobile users: the intersection of the mobile users' bitmap with the weekly active users' bitmap. Or, to count the users active over the past n days, the cached daily results make it simple: take the cached daily active user bitmaps for the previous n-1 days and union them with today's bitmap, which takes only 50 ms.
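The caching idea can be sketched with a compute-once cache keyed by name. In this sketch an in-memory Map plays the role of the Redis cache, and the class BitmapCache, the key "active:week", and the sample data are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class BitmapCache {
    private final Map<String, BitSet> cache = new HashMap<>();
    private int computations = 0;

    /** Returns the cached bitmap for key, computing and storing it on first use. */
    BitSet getOrCompute(String key, Supplier<BitSet> compute) {
        return cache.computeIfAbsent(key, k -> {
            computations++;
            return compute.get();
        });
    }

    /** How many times a bitmap actually had to be computed. */
    int computations() {
        return computations;
    }

    /** Union of the given bitmaps as a new BitSet. */
    static BitSet union(BitSet... bitmaps) {
        BitSet result = new BitSet();
        for (BitSet bitmap : bitmaps) {
            result.or(bitmap);
        }
        return result;
    }

    /** Intersection of two bitmaps, leaving both inputs untouched. */
    static BitSet intersect(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    public static void main(String[] args) {
        BitSet monday = new BitSet();
        monday.set(1);
        monday.set(2);
        BitSet tuesday = new BitSet();
        tuesday.set(2);
        tuesday.set(3);
        BitSet mobile = new BitSet();
        mobile.set(2);
        mobile.set(3);
        mobile.set(4);

        BitmapCache cache = new BitmapCache();
        BitSet weekly = cache.getOrCompute("active:week", () -> union(monday, tuesday));
        cache.getOrCompute("active:week", () -> union(monday, tuesday)); // served from cache
        System.out.println(cache.computations());                    // 1
        System.out.println(intersect(weekly, mobile).cardinality()); // 2
    }
}
```

The weekly bitmap is computed once and then reused, for example to derive weekly active mobile users by intersecting it with the mobile users' bitmap.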

The following Java code counts the unique users who performed a given action on the specified dates.

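import java.util.BitSet;
import redis.clients.jedis.Jedis;

public class UniqueCounter {
    private final Jedis redis = new Jedis("localhost");

    // Counts the unique users who performed `action` on any of the given dates.
    public int uniqueCount(String action, String... dates) {
        BitSet all = new BitSet();
        for (String date : dates) {
            String key = action + ":" + date;
            byte[] bytes = redis.get(key.getBytes());
            if (bytes == null) continue; // no bitmap stored for this date
            // BitSet orders bits within a byte differently from Redis; the count is unaffected.
            all.or(BitSet.valueOf(bytes));
        }
        return all.cardinality();
    }
}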
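One detail worth noting when reading Redis bitmaps into Java: BitSet.valueOf(byte[]) treats bit 0 as the least significant bit of the first byte, while Redis offset 0 is the most significant bit of the first byte. Counts over unions and intersections are unaffected, because every bitmap is permuted in the same way, but the set offsets no longer equal the user IDs. If the IDs themselves are needed, a conversion along these lines could be used. The class RedisBits and its method name are hypothetical.

```java
import java.util.BitSet;

final class RedisBits {
    /** Converts Redis bitmap bytes into a BitSet indexed by the same offsets Redis uses. */
    static BitSet fromRedisBytes(byte[] bytes) {
        BitSet bits = new BitSet(bytes.length * 8);
        for (int offset = 0; offset < bytes.length * 8; offset++) {
            int mask = 0x80 >>> (offset & 7); // Redis offset 0 is the top bit of byte 0
            if ((bytes[offset >>> 3] & mask) != 0) {
                bits.set(offset);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        byte[] raw = {0x01}; // what Redis stores after SETBIT key 7 1
        System.out.println(BitSet.valueOf(raw));  // {0}, not the Redis offset
        System.out.println(fromRedisBytes(raw));  // {7}, the Redis offset
    }
}
```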





