While reading the Redis source code, I came across two files: crc16.c and crc64.c. Here is a brief look at the principle behind them.
CRC, the cyclic redundancy check, is a common error-detecting code in information systems. University courses such as "Computer Networks" and "Computer Organization" cover it. Most of us understand its mathematical principle, and computing a CRC check code by hand on an exam paper is not difficult. But a computer is not a person: the mathematical principle has to be turned into an algorithm before a computer can carry it out. In practice, people with a computer science background rarely deal with how CRC is implemented on a computer; people with an electronics background tend to encounter it more. A computer can of course simulate the original CRC algorithm directly (the one we use when calculating by hand), but that is inefficient. So let's look at how a computer actually implements the CRC check code algorithm.
CRC concept
If you are not familiar with CRC fundamentals, see the Wikipedia article: Cyclic redundancy check.
Different CRC algorithms, such as CRC-1, CRC-8, and CRC-16, are usually distinguished by the number of bits in the CRC check code (which equals the highest power of the generator polynomial g(x)). For the same highest power, different standards define different CRC algorithms. For example, when the highest power of g(x) is 16, there are CRC-16-CCITT, CRC-16-IBM, and others. Redis uses the CRC-16-CCITT standard, whose generator polynomial is g(x) = x^16 + x^12 + x^5 + 1.
The usual way to represent g(x) is to convert the polynomial to binary: 1 0001 0000 0010 0001, which is 0x11021 in hexadecimal. Storing it takes 17 bits (2 bytes + 1 bit; in C that actually means 3 bytes). However, in modulo-2 division the highest bit 1 of the divisor is always aligned with the highest bit 1 of the dividend, and their XOR is always 0, so that bit can be omitted. Then g(x) = 0x1021, which fits in 2 bytes and saves one byte of storage.
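To make the omitted top bit concrete, here is a minimal sketch of the direct, bit-by-bit simulation of modulo-2 division, using only the 16-bit constant 0x1021. It assumes an initial value of 0 and no bit reflection, which matches the loop in Redis's crc16.c; the name crc16_bitwise is hypothetical and does not appear in Redis.

```c
#include <stdint.h>
#include <stddef.h>

/* x^16 + x^12 + x^5 + 1 with the leading x^16 term left implicit. */
#define CRC16_POLY 0x1021u

/* Bit-by-bit CRC-16 (initial value 0, no reflection).
 * Hypothetical helper for illustration, not Redis code. */
uint16_t crc16_bitwise(const unsigned char *buf, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        /* Bring the next 8 message bits into the top of the register. */
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u) {
                /* The top bit is 1: it lines up with the implicit x^16
                 * bit of the divisor and cancels, so it is shifted out
                 * and only the low 16 bits of g(x) are XORed in. */
                crc = (uint16_t)((crc << 1) ^ CRC16_POLY);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}
```

Calling crc16_bitwise on the 9 ASCII bytes "123456789" should give 0x31C3, the usual check value for this CRC variant. The inner loop runs 8 times per byte, which is the inefficiency the table method removes.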
The crc16.c file under the src directory of the Redis source:
static const uint16_t crc16tab[256]= {
0x0000,0x1021,0x2042,0x3063,0x4084,0x50a5,0x60c6,0x70e7,
0x8108,0x9129,0xa14a,0xb16b,0xc18c,0xd1ad,0xe1ce,0xf1ef,
0x1231,0x0210,0x3273,0x2252,0x52b5,0x4294,0x72f7,0x62d6,
0x9339,0x8318,0xb37b,0xa35a,0xd3bd,0xc39c,0xf3ff,0xe3de,
0x2462,0x3443,0x0420,0x1401,0x64e6,0x74c7,0x44a4,0x5485,
0xa56a,0xb54b,0x8528,0x9509,0xe5ee,0xf5cf,0xc5ac,0xd58d,
0x3653,0x2672,0x1611,0x0630,0x76d7,0x66f6,0x5695,0x46b4,
0xb75b,0xa77a,0x9719,0x8738,0xf7df,0xe7fe,0xd79d,0xc7bc,
0x48c4,0x58e5,0x6886,0x78a7,0x0840,0x1861,0x2802,0x3823,
0xc9cc,0xd9ed,0xe98e,0xf9af,0x8948,0x9969,0xa90a,0xb92b,
0x5af5,0x4ad4,0x7ab7,0x6a96,0x1a71,0x0a50,0x3a33,0x2a12,
0xdbfd,0xcbdc,0xfbbf,0xeb9e,0x9b79,0x8b58,0xbb3b,0xab1a,
0x6ca6,0x7c87,0x4ce4,0x5cc5,0x2c22,0x3c03,0x0c60,0x1c41,
0xedae,0xfd8f,0xcdec,0xddcd,0xad2a,0xbd0b,0x8d68,0x9d49,
0x7e97,0x6eb6,0x5ed5,0x4ef4,0x3e13,0x2e32,0x1e51,0x0e70,
0xff9f,0xefbe,0xdfdd,0xcffc,0xbf1b,0xaf3a,0x9f59,0x8f78,
0x9188,0x81a9,0xb1ca,0xa1eb,0xd10c,0xc12d,0xf14e,0xe16f,
0x1080,0x00a1,0x30c2,0x20e3,0x5004,0x4025,0x7046,0x6067,
0x83b9,0x9398,0xa3fb,0xb3da,0xc33d,0xd31c,0xe37f,0xf35e,
0x02b1,0x1290,0x22f3,0x32d2,0x4235,0x5214,0x6277,0x7256,
0xb5ea,0xa5cb,0x95a8,0x8589,0xf56e,0xe54f,0xd52c,0xc50d,
0x34e2,0x24c3,0x14a0,0x0481,0x7466,0x6447,0x5424,0x4405,
0xa7db,0xb7fa,0x8799,0x97b8,0xe75f,0xf77e,0xc71d,0xd73c,
0x26d3,0x36f2,0x0691,0x16b0,0x6657,0x7676,0x4615,0x5634,
0xd94c,0xc96d,0xf90e,0xe92f,0x99c8,0x89e9,0xb98a,0xa9ab,
0x5844,0x4865,0x7806,0x6827,0x18c0,0x08e1,0x3882,0x28a3,
0xcb7d,0xdb5c,0xeb3f,0xfb1e,0x8bf9,0x9bd8,0xabbb,0xbb9a,
0x4a75,0x5a54,0x6a37,0x7a16,0x0af1,0x1ad0,0x2ab3,0x3a92,
0xfd2e,0xed0f,0xdd6c,0xcd4d,0xbdaa,0xad8b,0x9de8,0x8dc9,
0x7c26,0x6c07,0x5c64,0x4c45,0x3ca2,0x2c83,0x1ce0,0x0cc1,
0xef1f,0xff3e,0xcf5d,0xdf7c,0xaf9b,0xbfba,0x8fd9,0x9ff8,
0x6e17,0x7e36,0x4e55,0x5e74,0x2e93,0x3eb2,0x0ed1,0x1ef0
};
uint16_t crc16(const char *buf, int len) {
    int counter;
    uint16_t crc = 0;
    for (counter = 0; counter < len; counter++)
        crc = (crc<<8) ^ crc16tab[((crc>>8) ^ *buf++)&0x00FF];
    return crc;
}
As mentioned above, different organizations define different CRC standards. Redis follows the CRC-16-CCITT standard, which is also the CRC used by the XMODEM protocol, so it is often referred to as XMODEM CRC.
The algorithm behind this code is not original to this code; it is the classic "byte-wise table lookup CRC code generation algorithm".
The following is an excerpt from a paper (see "References" at the end).
The original text actually contains two further simplification steps, but I feel they are not needed for understanding. Note that all the operations above are modulo 2: the fraction bar denotes modulo-2 division, and the plus sign "+" denotes modulo-2 addition, that is, the XOR operation.
Here are a few points to establish first:
- The CRC16 check code is two bytes, so Redis's source code uses the uint16_t type (typically unsigned short int).
- CRC16 here processes the data 8 bits (one byte) at a time.
- Computing the CRC check code uses modulo-2 division. We do not care about the quotient Q(x), only the remainder R(x), which is also two bytes in size.
- The remainder R(x) is split into a high byte RH(x) and a low byte RL(x): R(x) = RH(x) * x^8 + RL(x) (this + can be read as XOR or as ordinary addition).
- XORing any number with 0 gives that number back.
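The split of the remainder into a high and a low byte can be sketched in C as follows. The helper names rem_high, rem_low and rem_join are hypothetical and serve only to illustrate R(x) = RH(x) * x^8 + RL(x).

```c
#include <stdint.h>

/* Hypothetical helpers illustrating R(x) = RH(x) * x^8 + RL(x). */

/* High byte RH(x) of the 16-bit remainder. */
uint8_t rem_high(uint16_t r) { return (uint8_t)(r >> 8); }

/* Low byte RL(x) of the 16-bit remainder. */
uint8_t rem_low(uint16_t r) { return (uint8_t)(r & 0xFF); }

/* Multiplying by x^8 is a left shift by 8 bits. The two parts occupy
 * disjoint bits, so XOR and ordinary addition give the same result. */
uint16_t rem_join(uint8_t rh, uint8_t rl) {
    return (uint16_t)(((uint16_t)rh << 8) ^ rl);
}
```

For example, the remainder 0x31C3 splits into RH = 0x31 and RL = 0xC3, and joining them gives 0x31C3 back.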
Looking at the second part of the last polynomial, we can see that it is itself a CRC check code computation. The data it operates on is the content of the square brackets: the high byte of the previous check code XORed with the current data byte. Call this result Dnew. Then compute the CRC check code of Dnew, call it CRC(Dnew), and XOR CRC(Dnew) with the low byte of the previous check code shifted left by 8 bits (that is, multiplied by x^8).
I summarize the equation above briefly as follows (the quotient can be ignored):
CRC(Mn+1(x)) = CRC(RnH(x) + M0(x)) + (RnL(x) * x^8) / g(x)
As this equation shows, the CRC algorithm appears on both sides of the equals sign with different arguments, so it is clearly a recurrence. Simulating the formula directly on a computer is slow, which is why the "table lookup method" was invented.
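The recurrence can be sketched directly in C, before any table is introduced. This is an illustration under the same assumptions as Redis's crc16 loop (initial value 0, polynomial 0x1021, no reflection); the names crc16_of_byte, crc16_step and crc16_recurrence are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* CRC of a single byte d: the remainder of d * x^16 divided by g(x),
 * computed by bit-by-bit modulo-2 division. Hypothetical helper. */
uint16_t crc16_of_byte(uint8_t d) {
    uint16_t r = (uint16_t)(d << 8);
    for (int bit = 0; bit < 8; bit++)
        r = (r & 0x8000u) ? (uint16_t)((r << 1) ^ 0x1021u)
                          : (uint16_t)(r << 1);
    return r;
}

/* One step of the recurrence: new R = CRC(RH xor byte) xor (RL * x^8). */
uint16_t crc16_step(uint16_t r, uint8_t byte) {
    uint8_t rh = (uint8_t)(r >> 8);      /* high byte of previous remainder */
    uint8_t rl = (uint8_t)(r & 0xFF);    /* low byte of previous remainder */
    uint8_t dnew = (uint8_t)(rh ^ byte); /* Dnew */
    return (uint16_t)(crc16_of_byte(dnew) ^ (uint16_t)(rl << 8));
}

/* Apply the step to every byte of the buffer, starting from R = 0. */
uint16_t crc16_recurrence(const unsigned char *buf, size_t len) {
    uint16_t r = 0;
    for (size_t i = 0; i < len; i++)
        r = crc16_step(r, buf[i]);
    return r;
}
```

Here crc16_of_byte still performs 8 shift-and-XOR rounds for every input byte. Since its argument is a single byte, its results can be computed once and stored, which is what the table method does.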
Because the CRC algorithm processes 8 bits of data at a time, its argument has only 256 possible values. The CRC check codes of these 256 values (data bytes) are therefore computed in advance and stored in an array. When the CRC check code is actually calculated, the value is simply looked up in the table, so each lookup takes O(1) time.
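As a rough sketch of how such a 256-entry table could be generated rather than typed in, the code below fills the table at startup and then uses the same update expression as the Redis crc16 function. Redis itself ships the table as a constant array; crc16_table, crc16_table_init and crc16_lookup are hypothetical names used only for this illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Precomputed CRC of every possible byte value (filled by the init call). */
static uint16_t crc16_table[256];

/* Fill the table: entry d is the CRC-16-CCITT remainder of the byte d. */
void crc16_table_init(void) {
    for (int d = 0; d < 256; d++) {
        uint16_t r = (uint16_t)(d << 8);
        for (int bit = 0; bit < 8; bit++)
            r = (r & 0x8000u) ? (uint16_t)((r << 1) ^ 0x1021u)
                              : (uint16_t)(r << 1);
        crc16_table[d] = r;
    }
}

/* Table-driven CRC: one lookup per input byte instead of 8 bit steps.
 * crc16_table_init() must have been called first. */
uint16_t crc16_lookup(const unsigned char *buf, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++)
        crc = (uint16_t)((crc << 8) ^
                         crc16_table[((crc >> 8) ^ buf[i]) & 0xFF]);
    return crc;
}
```

After crc16_table_init(), the generated entries should agree with the array in crc16.c, for example 0x1021 at index 1 and 0x1ef0 at index 255.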
Generalizing CRC16
The Redis source directory also contains a crc64 file, which implements a 64-bit CRC check code. Its table lookup principle is the same as for CRC16: it also processes 8 bits of data at a time, so its precomputed CRC table (array) also has 256 elements, but each element is a uint64_t (an unsigned 64-bit integer). The CRC16 table lookup method can likewise be generalized to CRC32. Note also that the algorithm does not have to process 8 bits at a time. It could process 16 bits at a time, but the CRC table would then need 65536 (2^16) elements, which wastes storage. It could also process half a byte (4 bits) at a time, in which case the table needs only 16 (2^4) elements. This saves memory, but since only 4 bits are handled per step, the same data needs twice as many lookups and more computation time. Processing 8 bits per step is therefore a compromise between time and space efficiency. Many algorithms trade time against space, and the two often cannot both be optimized at once.
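To illustrate the half-byte variant mentioned above, here is a minimal sketch of a 4-bit table lookup for the same CRC-16-CCITT polynomial. It is not taken from Redis; crc16_nibble_table, crc16_nibble_init and crc16_nibble are hypothetical names, and the initial value 0 is assumed as before.

```c
#include <stdint.h>
#include <stddef.h>

/* 16-entry table: the CRC of every possible 4-bit value. */
static uint16_t crc16_nibble_table[16];

/* Fill the table: entry d is the remainder of the 4-bit value d. */
void crc16_nibble_init(void) {
    for (int d = 0; d < 16; d++) {
        uint16_t r = (uint16_t)(d << 12);
        for (int bit = 0; bit < 4; bit++)
            r = (r & 0x8000u) ? (uint16_t)((r << 1) ^ 0x1021u)
                              : (uint16_t)(r << 1);
        crc16_nibble_table[d] = r;
    }
}

/* Half-byte CRC: two table lookups per input byte, high nibble first.
 * crc16_nibble_init() must have been called first. */
uint16_t crc16_nibble(const unsigned char *buf, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc = (uint16_t)((crc << 4) ^
              crc16_nibble_table[((crc >> 12) ^ (buf[i] >> 4)) & 0x0F]);
        crc = (uint16_t)((crc << 4) ^
              crc16_nibble_table[((crc >> 12) ^ (buf[i] & 0x0F)) & 0x0F]);
    }
    return crc;
}
```

The table shrinks from 512 bytes (256 two-byte entries) to 32 bytes (16 two-byte entries), at the cost of two lookups per byte instead of one.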
--------------------------------------------
References
Minuting Jiang Wei. A software generation algorithm for cyclic redundancy check code based on byte-wise table lookup. Shandong: Journal of Shandong Mining Institute (Natural Science Edition), 1999, Vol. 18, No. 2.
The principle of CRC check Code (CRC16, CRC64) in Redis source code