The Cornerstone of Redis Memory Management: zmalloc.c Source Code Interpretation (II)

Source: Internet
Author: User

In the previous post I introduced several commonly used functions in zmalloc.c. This post covers the remaining functions in that file. Many of them are more interesting than the ones covered last time, and they draw on a fair amount of operating system knowledge. The first few functions are relatively simple and are covered briefly; the later ones are the focus.

Appetizer
zmalloc_enable_thread_safeness
void zmalloc_enable_thread_safeness(void) {
    zmalloc_thread_safe = 1;
}
        zmalloc_thread_safe is a file-scope static variable (static int) that indicates whether the memory statistics must be updated in a thread-safe way: 1 means thread-safe mode, 0 means non-thread-safe mode.
zmalloc_used_memory
size_t zmalloc_used_memory(void) {
    size_t um;

    if (zmalloc_thread_safe) {
#if defined(__ATOMIC_RELAXED) || defined(HAVE_ATOMIC)
        um = update_zmalloc_stat_add(0);
#else
        pthread_mutex_lock(&used_memory_mutex);
        um = used_memory;
        pthread_mutex_unlock(&used_memory_mutex);
#endif
    }
    else {
        um = used_memory;
    }

    return um;
}
        This function returns the value of the variable used_memory, that is, the number of bytes Redis has currently allocated through zmalloc. The code is short, but it shows how the value is read in thread-safe mode. When the compiler provides atomic builtins, update_zmalloc_stat_add(0) adds zero to the counter and yields its current value; otherwise a mutex provides the synchronization. Mutexes were introduced briefly in the previous post. In short, remember to lock (pthread_mutex_lock) before touching the shared variable and to unlock (pthread_mutex_unlock) afterwards. While the mutex is held, the code between lock and unlock can be executed by only one thread at a time.
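The mutex branch described above can be sketched in isolation. This is a minimal illustration, not the Redis code: demo_used_memory, demo_mutex, demo_stat_add and demo_used_memory_read are hypothetical names standing in for the statics in zmalloc.c.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the counter and mutex in zmalloc.c. */
static size_t demo_used_memory = 0;
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Add n bytes to the counter while holding the lock. */
void demo_stat_add(size_t n) {
    pthread_mutex_lock(&demo_mutex);
    demo_used_memory += n;
    pthread_mutex_unlock(&demo_mutex);
}

/* Read the counter under the same lock, so a reader never races
 * with an update made by another thread. */
size_t demo_used_memory_read(void) {
    size_t um;
    pthread_mutex_lock(&demo_mutex);
    um = demo_used_memory;
    pthread_mutex_unlock(&demo_mutex);
    return um;
}
```

The value is copied into a local variable before the unlock, so the function returns a snapshot taken while the lock was held.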
zmalloc_set_oom_handler
void zmalloc_set_oom_handler(void (*oom_handler)(size_t)) {
    zmalloc_oom_handler = oom_handler;
}
        This function assigns a value to zmalloc_oom_handler, a function pointer that names the action to take when memory runs out (out of memory, abbreviated OOM). Its type is void (*)(size_t), so the parameter of zmalloc_set_oom_handler() has the same type. To call it, simply pass the name of a function with that signature. Note that zmalloc_oom_handler is already initialized at its declaration to a default handler, zmalloc_default_oom(), which was also introduced in the previous post.
zmalloc_size
#ifndef HAVE_MALLOC_SIZE
size_t zmalloc_size(void *ptr) {
    void *realptr = (char*)ptr-PREFIX_SIZE;
    size_t size = *((size_t*)realptr);
    /* Assume at least that all the allocations are padded at sizeof(long) by
     * the underlying allocator. */
    if (size&(sizeof(long)-1)) size += sizeof(long)-(size&(sizeof(long)-1));
    return size+PREFIX_SIZE;
}
#endif
        This code closely mirrors the zfree() function covered in the previous post, which is worth rereading. To recap: when zmalloc(size) allocates memory it requests an extra sizeof(size_t) bytes (8 bytes on a 64-bit system), that is, it calls malloc(size+8), and it stores the value of size in the first 8 bytes of the block. Because of memory alignment, malloc(size+8) may actually reserve somewhat more than size+8 bytes in order to round the total up to a multiple of 8, so the real footprint is size+8+X, where (size+8+X) % 8 == 0 and 0 <= X <= 7. zmalloc() then advances the pointer by 8 bytes, past the header, and returns it to the caller. zfree() is the inverse of zmalloc(), and the purpose of zmalloc_size() is to compute the total size+8+X.
----------------------------------------
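The size-prefix scheme and the rounding step can be tried out with a small stand-alone sketch. The names demo_malloc, demo_malloc_size, demo_free and DEMO_PREFIX_SIZE are hypothetical; they imitate the idea described above and are not the Redis functions themselves.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header size: one size_t stored in front of the user data. */
#define DEMO_PREFIX_SIZE (sizeof(size_t))

/* Allocate size bytes plus a header that records the requested size. */
void *demo_malloc(size_t size) {
    void *realptr = malloc(size + DEMO_PREFIX_SIZE);
    if (!realptr) return NULL;
    *((size_t *)realptr) = size;                 /* remember the request */
    return (char *)realptr + DEMO_PREFIX_SIZE;   /* skip past the header */
}

/* Estimate the footprint: the stored size rounded up to a multiple of
 * sizeof(long), plus the header. */
size_t demo_malloc_size(void *ptr) {
    void *realptr = (char *)ptr - DEMO_PREFIX_SIZE;
    size_t size = *((size_t *)realptr);
    size_t rem = size & (sizeof(long) - 1);      /* size modulo sizeof(long) */
    if (rem) size += sizeof(long) - rem;
    return size + DEMO_PREFIX_SIZE;
}

/* Step back to the real start of the block before releasing it. */
void demo_free(void *ptr) {
    if (ptr) free((char *)ptr - DEMO_PREFIX_SIZE);
}
```

On a platform where sizeof(long) and sizeof(size_t) are both 8, a 13-byte request would be reported as 16 + 8 = 24 bytes. The bit mask works because sizeof(long) is a power of two.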
        This function is compiled conditionally. Reading zmalloc.h shows that zmalloc_size() is defined differently depending on the platform, because some allocators and platforms already provide a function that reports the actual size of an allocated block. In those cases zmalloc_size(p) is simply #defined as one of:
tc_malloc_size(p) [tcmalloc]
je_malloc_usable_size(p) [jemalloc]
malloc_size(p) [Mac system]
When none of these three is available, Redis provides its own implementation, which is the source code above.
----------------------------------------
Main course
zmalloc_get_rss
        This function returns the RSS. This is not the RSS feed format we often see on the web; it stands for Resident Set Size, the amount of the current process's memory that actually resides in physical memory, excluding whatever has been swapped out. Anyone with some operating system background knows that the memory a process requests does not all stay resident: the system may move temporarily unused pages out of memory into the swap area (Linux systems typically have swap space for this). The function works by reading the file /proc/<pid>/stat, where <pid> is the ID of the current process. The 24th field of that file holds the RSS, measured in memory pages.
size_t zmalloc_get_rss(void) {
    int page = sysconf(_SC_PAGESIZE);
    size_t rss;
    char buf[4096];
    char filename[256];
    int fd, count;
    char *p, *x;

    snprintf(filename, 256, "/proc/%d/stat", getpid());
    if ((fd = open(filename, O_RDONLY)) == -1) return 0;
    if (read(fd, buf, 4096) <= 0) {
        close(fd);
        return 0;
    }
    close(fd);

    p = buf;
    count = 23; /* RSS is the 24th field in /proc/<pid>/stat */
    while (p && count--) {
        p = strchr(p, ' ');
        if (p) p++;
    }
    if (!p) return 0;
    x = strchr(p, ' ');
    if (!x) return 0;
    *x = '\0';

    rss = strtoll(p, NULL, 10);
    rss *= page;
    return rss;
}
        The beginning of the function:
    int page = sysconf(_SC_PAGESIZE);
        This queries the size of a memory page by calling the library function sysconf() (see man sysconf for details).
Next:
    snprintf(filename, 256, "/proc/%d/stat", getpid());
        getpid() returns the ID of the current process, so this snprintf() call stores the absolute path of the current process's stat file in the character array filename. (I have to praise the "everything is a file" idea of Unix-like systems.)
    if ((fd = open(filename, O_RDONLY)) == -1) return 0;
    if (read(fd, buf, 4096) <= 0) {
        close(fd);
        return 0;
    }
        This opens /proc/<pid>/stat read-only and then reads up to 4096 bytes from it into the character array buf. If the read fails, the file descriptor fd is closed and the function returns 0. (Personally, I feel that since this is an error, returning -1 would be clearer, although with the unsigned return type size_t that value would wrap around to SIZE_MAX.)
    p = buf;
    count = 23; /* RSS is the 24th field in /proc/<pid>/stat */
    while (p && count--) {
        p = strchr(p, ' ');
        if (p) p++;
    }
        RSS is the 24th field in the stat file, so it comes right after the 23rd space. Look at the while loop. Its body uses the string function strchr(), which searches the string p for a space character. If one is found, strchr() returns a pointer to that space, which is assigned to p; otherwise it returns NULL. The p++ is needed because p now points at the space itself; after the increment it points to the first character of the next field. After 23 iterations, p points to the first character of the 24th field.
    if (!p) return 0;
    x = strchr(p, ' ');
    if (!x) return 0;
    *x = '\0';
        Because p may have become a null pointer by the end of the loop, the function first checks whether p is NULL. The next few operations are easy to follow: the space after the 24th field is overwritten with '\0', so that p points to an ordinary null-terminated C string containing only that field.
    rss = strtoll(p, NULL, 10);
    rss *= page;
    return rss;
        This code uses the string function strtoll(); as the name suggests, it converts a string to a long long. It takes three parameters. The first is a pointer to the start of the string to convert. The second is of type char ** and, when it is not NULL, receives the address of the first character that was not converted; passing NULL, as here, means the caller does not need that information. The third is the numeric base, here 10 for decimal. Finally rss is multiplied by page and returned: the value parsed from the file is a number of memory pages, and page holds the size of each page in bytes, so the product is the RSS in bytes.
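The field-skipping loop and the strtoll() conversion can be exercised on an ordinary string instead of /proc/<pid>/stat. The following is a minimal sketch under that assumption; demo_nth_field is a hypothetical helper, and unlike the Redis code it also accepts a field at the very end of the line.

```c
#include <stdlib.h>
#include <string.h>

/* Return the n-th (1-based) space-separated field of buf as a number,
 * or 0 if the line has fewer than n fields. buf is modified in place. */
long long demo_nth_field(char *buf, int n) {
    char *p = buf, *x;
    int count;

    if (n < 1) return 0;
    count = n - 1;                 /* number of spaces to skip */
    while (p && count--) {
        p = strchr(p, ' ');        /* find the next space ...        */
        if (p) p++;                /* ... and step onto the next field */
    }
    if (!p) return 0;
    x = strchr(p, ' ');
    if (x) *x = '\0';              /* terminate the field, if not last */
    return strtoll(p, NULL, 10);
}
```

With a buffer holding the contents of the stat file, demo_nth_field(buf, 24) multiplied by the page size would give the RSS in bytes. One caveat of counting spaces this way is that the second field of the stat file, the command name in parentheses, may itself contain spaces for some processes, which would shift the count.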
zmalloc_get_fragmentation_ratio
/* Fragmentation = RSS / allocated-bytes */
float zmalloc_get_fragmentation_ratio(size_t rss) {
    return (float)rss/zmalloc_used_memory();
}
        This function computes the memory fragmentation ratio, which is the ratio of RSS to the total number of bytes allocated. The caller first obtains the RSS value with zmalloc_get_rss() and then passes it in as the argument.
----------------------------------------
Memory fragmentation is divided into internal fragmentation and external fragmentation.
Internal fragmentation: memory that has already been allocated (it clearly belongs to a particular process) but is not being used, and that the system cannot reuse until the process releases it.
External fragmentation: free memory that has not been allocated (it belongs to no process) but is split into areas too small to satisfy a new request for memory.
----------------------------------------
By these definitions, what zmalloc_get_fragmentation_ratio() measures is the internal fragmentation: the memory held by the process (RSS) relative to the memory it actually requested.
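A quick numeric check makes the ratio concrete. This is a sketch only: demo_fragmentation_ratio is a hypothetical two-argument variant that takes the allocated byte count as a parameter instead of calling zmalloc_used_memory(), and the zero check is an addition for the example.

```c
#include <stddef.h>

/* Ratio of resident memory to bytes requested through the allocator. */
float demo_fragmentation_ratio(size_t rss, size_t used) {
    if (used == 0) return 0.0f;    /* guard added for this sketch only */
    return (float)rss / used;
}
```

For example, an RSS of 150 MB against 100 MB of allocated memory gives a ratio of 1.5, meaning the process holds about 50% more resident memory than it asked for. A ratio below 1 would suggest that part of the allocated memory is not resident, for instance because it has been swapped out.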
zmalloc_get_smap_bytes_by_field
#if defined(HAVE_PROC_SMAPS)
size_t zmalloc_get_smap_bytes_by_field(char *field) {
    char line[1024];
    size_t bytes = 0;
    FILE *fp = fopen("/proc/self/smaps", "r");
    int flen = strlen(field);

    if (!fp) return 0;
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, field, flen) == 0) {
            char *p = strchr(line, 'k');
            if (p) {
                *p = '\0';
                bytes += strtol(line+flen, NULL, 10) * 1024;
            }
        }
    }
    fclose(fp);
    return bytes;
}
#else
size_t zmalloc_get_smap_bytes_by_field(char *field) {
    ((void) field);
    return 0;
}
#endif
This is another conditionally compiled function; naturally we focus on the #if defined part.
    FILE *fp = fopen("/proc/self/smaps", "r");
        Standard C's fopen() opens the file /proc/self/smaps read-only. A brief word about this file. As mentioned earlier, the /proc directory contains many directories named after process IDs, each storing the state information of one process. The /proc/self directory has the same layout; self stands for the status directory of the current process. The smaps file records the detailed memory-mapping information of the process. It consists of many blocks with the same structure. Here is one of them:
00400000-004ef000 r-xp 00000000 08:08 1305603 /bin/bash
Size: 956 kB
Rss: 728 kB
Pss: 364 kB
Shared_Clean: 728 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 728 kB
Anonymous: 0 kB
AnonHugePages: 0 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd ex mr mw me dw sd
Except for the first line (the mapping header) and the last line (VmFlags), each line consists of a field name and the value of that field in kB. (Look up the specific meaning of each field yourself.) Note that this is only a small part of the smaps file.
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, field, flen) == 0) {
            char *p = strchr(line, 'k');
            if (p) {
                *p = '\0';
                bytes += strtol(line+flen, NULL, 10) * 1024;
            }
        }
    }
fgets() reads the contents of /proc/self/smaps line by line.
For each line that begins with the requested field name, strchr() sets the pointer p to the position of the character k.
*p is then set to '\0', which truncates the line into an ordinary C string that ends just before "kB".
line points to the first character of the line, so line+flen (flen being the length of the field name being searched for) points to the spaces after the field name. These spaces do not need to be stripped: strtol() skips leading whitespace and converts the digits to a long.
The result of strtol() is then multiplied by 1024, because sizes in smaps are expressed in kB and the function returns bytes.
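The per-line parsing can be tested without reading /proc at all. Below is a minimal sketch that applies the same steps to lines held in memory; demo_smaps_line_bytes and demo_smaps_sum are hypothetical helpers, not part of zmalloc.c, and the 64-byte line buffers are an assumption of the example.

```c
#include <stdlib.h>
#include <string.h>

/* If line starts with field (e.g. "Private_Dirty:"), return its value
 * converted from kB to bytes; otherwise return 0. The line is modified. */
size_t demo_smaps_line_bytes(char *line, const char *field) {
    size_t flen = strlen(field);
    char *p;

    if (strncmp(line, field, flen) != 0) return 0;
    p = strchr(line, 'k');         /* the 'k' of the "kB" suffix */
    if (!p) return 0;
    *p = '\0';                     /* cut the string just before "kB" */
    /* strtol() skips the spaces between the field name and the number. */
    return (size_t)strtol(line + flen, NULL, 10) * 1024;
}

/* Sum one field over several lines, as the fgets() loop does for
 * every line it reads from the file. */
size_t demo_smaps_sum(char lines[][64], size_t n, const char *field) {
    size_t bytes = 0;
    size_t i;
    for (i = 0; i < n; i++) bytes += demo_smaps_line_bytes(lines[i], field);
    return bytes;
}
```

Given two Private_Dirty lines of 4 kB and 12 kB among other fields, the sum comes out as 16 * 1024 bytes. Lines for other fields contribute nothing because the strncmp() prefix check rejects them.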
----------------------------------------
In fact, /proc/self is a symbolic link to the directory under /proc that is named after the current process's ID. We can enter the directory and run a few commands to check.
[email protected]:/proc/self# pwd -P
/proc/4152
[email protected]:/proc/self# ps aux | grep [4]152
root 4152 0.0 0.0 25444 2176 pts/0 S 09:06 0:00 bash
----------------------------------------
zmalloc_get_private_dirty
size_t zmalloc_get_private_dirty(void) {
    return zmalloc_get_smap_bytes_by_field("Private_Dirty:");
}
        The source code is very simple. The function just calls zmalloc_get_smap_bytes_by_field("Private_Dirty:"), which scans /proc/self/smaps and sums the values of all Private_Dirty fields. So what does Private_Dirty mean? Look again at the structure of the /proc/self/smaps excerpt shown above. The file consists of many blocks with the same structure, and several of the fields in each block are related as follows:
Rss = Shared_Clean + Shared_Dirty + Private_Clean + Private_Dirty
where:
Shared_Clean: memory shared by multiple processes whose contents have not been modified by any process
Shared_Dirty: memory shared by multiple processes whose contents have been modified by some process
Private_Clean: memory used exclusively by one process whose contents have not been modified
Private_Dirty: memory used exclusively by one process whose contents have been modified by that process
    In fact, the shared memory referred to here generally comes from the use of shared libraries (.so files) on Unix systems. Shared libraries are also called dynamic libraries (the same idea as .dll files on Windows) and are loaded into memory when a program that needs them runs. The code and data in a shared library may then be used by several processes at once, which is where the distinctions between Shared and Private, and between Clean and Dirty, come from. Besides shared libraries, shared memory here also includes shared memory segments, one of the System V IPC mechanisms.
----------------------------------------
For a detailed discussion of the meaning of the Shared_Clean, Shared_Dirty, Private_Clean and Private_Dirty fields in the smaps file, another blogger has explored the topic in depth and written it up. Recommended reading:
"The Meaning of Linux /proc/$pid/smaps"

"Calculation test of each field value of /proc/$pid/smaps"
