QR code generation: details and principles
The two-dimensional code is also known as the QR Code, where QR is short for Quick Response. It has become a very popular encoding method on mobile devices in recent years. Compared with a traditional bar code, it can store more information and represent more data types, such as digits, letters, Japanese, and Chinese. I have spent the past two days learning how QR code images are generated. I found it interesting, a bit like a cryptographic algorithm, so I am writing this article to explain it for anyone who wants to learn along.
For QR Code Specification, see this PDF: http://raidenii.net/files/datasheets/misc/qr_code.pdf
Basic knowledge
First, QR codes come in 40 sizes, which the spec calls Versions. Version 1 is a 21x21 matrix, Version 2 is 25x25, and Version 3 is 29x29. Each version adds 4 modules per side, so the side length is (V-1) * 4 + 21, where V is the Version number. The largest is Version 40: (40-1) * 4 + 21 = 177, so the maximum is a 177x177 square.
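The size formula is easy to check with a few lines of Python. This is only a small sketch; the function name qr_size is made up for illustration.

```python
def qr_size(version: int) -> int:
    """Side length in modules for a given Version (1 to 40)."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    return (version - 1) * 4 + 21


for v in (1, 2, 3, 40):
    print(f"Version {v}: {qr_size(v)}x{qr_size(v)}")
```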
The following is an example of a QR code:
Positioning Pattern
- Position Detection Patterns are the positioning patterns that mark the extent of the QR code's rectangle. Each of the three has a white border called the Separators for Position Detection Patterns. There are three rather than four because three corners are enough to determine a rectangle.
- Timing Patterns are also used for positioning. Because there are 40 sizes of QR code, the larger ones need reference lines; otherwise the scan may drift.
- Alignment Patterns are needed only in Version 2 and above, and are also used for positioning.
Functional data
- Format Information is present in every size and stores format data.
- Version Information is present in Version 7 and above, where two 3x6 areas must be reserved to store the version information.
Data and Error Correction Codes
- Everything outside the areas above holds the Data Codewords and the Error Correction Codewords.
Data Encoding
Let's talk about data encoding first. QR codes support the following encoding:
Numeric mode: digit encoding, for 0 to 9. Every three digits are encoded into 10 bits. If the number of digits is not a multiple of 3, the remaining 1 or 2 digits are encoded into 4 or 7 bits. The digit count itself is written in 10, 12, or 14 bits, depending on the size of the QR code (see Table 3 below).
Alphanumeric mode: character encoding, covering 0-9, uppercase A to Z (no lowercase), the symbols $ % * + - . / : and the space. These characters are mapped through a character index table (SP is the space, Char is the character, and Value is its index). To encode, group the characters in pairs, treat each pair as a base-45 number (first index * 45 + second index), and write it as 11 bits. A leftover single character is written as 6 bits. The character count is written in 9, 11, or 13 bits, depending on the Version (see Table 3 below).
Byte mode: byte encoding, for ISO-8859-1 characters with values 0-255. Some QR code scanners can automatically detect whether the bytes are UTF-8.
Kanji mode: a double-byte encoding for Japanese, which can also be used for Chinese. Each character code has a value subtracted from it: for codes in the range 0x8140 to 0x9FFC subtract 0x8140, and for codes in the range 0xE040 to 0xEBBF subtract 0xC140. Then multiply the high byte of the result by 0xC0, add the low byte, and write the sum as 13 bits. Example:
Extended Channel Interpretation (ECI) mode: mainly used for special character sets. Not all scanners support this encoding.
Structured Append mode: used to split one message across several QR code symbols that are then read together.
FNC1 mode: mainly used by particular industries, for example for GS1 barcodes.
For simplicity, the last three modes will not be discussed in this article.
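The Kanji mode arithmetic described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name kanji_to_13_bits is made up, the input is assumed to be a two-byte Shift JIS code passed as an integer, and the two sample codes are chosen only for illustration.

```python
def kanji_to_13_bits(sjis_code: int) -> str:
    """Compress one two-byte Shift JIS code into a 13-bit string."""
    if 0x8140 <= sjis_code <= 0x9FFC:
        offset = sjis_code - 0x8140
    elif 0xE040 <= sjis_code <= 0xEBBF:
        offset = sjis_code - 0xC140
    else:
        raise ValueError("code is outside the two Kanji mode ranges")
    high, low = offset >> 8, offset & 0xFF  # high byte and low byte
    return format(high * 0xC0 + low, "013b")


print(kanji_to_13_bits(0x935F))  # 0x935F - 0x8140 = 0x121F -> 0x12 * 0xC0 + 0x1F
print(kanji_to_13_bits(0xE4AA))  # 0xE4AA - 0xC140 = 0x236A -> 0x23 * 0xC0 + 0x6A
```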
In the following two tables,
- Table 2 gives the mode indicator (the "number") of each encoding, which is written in front of the encoded data. Note: the indicator for Chinese is 1101.
- Table 3 gives the number of bits used to write the character count in each mode (numeric, alphanumeric, byte, and Kanji) for each range of Versions. (The QR code spec defines various other encoding rules, which will be mentioned later.)
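Since the two tables are easy to lose track of, here is a hedged Python sketch of them as lookup data. The numeric and alphanumeric values match the ones used in this article; the byte and Kanji rows and the Version ranges (1-9, 10-26, 27-40) are my reading of the spec and should be checked against it. The names MODE_INDICATOR, COUNT_BITS, and count_bits are made up for illustration.

```python
# Mode indicators (Table 2), written in front of the encoded data.
MODE_INDICATOR = {
    "numeric": "0001",
    "alphanumeric": "0010",
    "byte": "0100",
    "kanji": "1000",
}

# Bits used for the character count (Table 3), for Versions 1-9, 10-26, 27-40.
# Assumption: byte and kanji rows are taken from the spec, not from this article.
COUNT_BITS = {
    "numeric": (10, 12, 14),
    "alphanumeric": (9, 11, 13),
    "byte": (8, 16, 16),
    "kanji": (8, 10, 12),
}


def count_bits(mode: str, version: int) -> int:
    """Width in bits of the character count field for a mode and Version."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    column = 0 if version <= 9 else 1 if version <= 26 else 2
    return COUNT_BITS[mode][column]


print(count_bits("numeric", 1), count_bits("alphanumeric", 1))
```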
Below are some examples,
Example 1: numeric encoding
In Version 1 with error correction level H, encode: 01234567
1. Split the digits into groups of three: 012 345 67
2. Convert each group to binary: 012 to 0000001100, 345 to 0101011001, and 67 to 1000011.
3. Concatenate the three binary strings: 0000001100 0101011001 1000011
4. Convert the digit count to binary (10 bits for Version 1-H): there are 8 digits, so 8 becomes 0000001000.
5. Prepend the numeric mode indicator 0001 and the count from step 4: 0001 0000001000 0000001100 0101011001 1000011
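The steps of Example 1 can be sketched in Python like this. It is only an illustration: the function name encode_numeric is made up, and the count width is passed in as a parameter (10 bits for Version 1) instead of being looked up from Table 3.

```python
def encode_numeric(digits: str, count_width: int = 10) -> str:
    """Numeric mode: mode indicator, digit count, then groups of 3 digits."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric mode accepts only the digits 0-9")
    parts = ["0001", format(len(digits), f"0{count_width}b")]
    for start in range(0, len(digits), 3):
        group = digits[start:start + 3]
        width = {3: 10, 2: 7, 1: 4}[len(group)]  # bits per group size
        parts.append(format(int(group), f"0{width}b"))
    return " ".join(parts)


print(encode_numeric("01234567"))
```

The spaces in the returned string are only there to make the fields visible; a real encoder would concatenate the bits.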
Example 2: alphanumeric encoding
In Version 1 with error correction level H, encode: AC-42
1. Look up the five characters of AC-42 in the character index table: (10, 12, 41, 4, 2)
2. Group them in pairs: (10, 12) (41, 4) (2)
3. Convert each group to binary, 11 bits for a pair and 6 bits for a single character:
(10, 12): 10*45 + 12 = 462, which is 00111001110
(41, 4): 41*45 + 4 = 1849, which is 11100111001
(2): 2, which is 000010
4. Concatenate these binary strings: 00111001110 11100111001 000010
5. Convert the character count to binary (9 bits for Version 1-H): there are 5 characters, so 5 becomes 000000101
6. Prepend the alphanumeric mode indicator 0010 and the count from step 5: 0010 000000101 00111001110 11100111001 000010
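Example 2 can be reproduced with a short Python sketch. The function name encode_alphanumeric is made up, the character table is written out as a string whose positions are the index values, and the count width is a parameter (9 bits for Version 1).

```python
# Position in this string = index value in the character table (space is 36).
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str, count_width: int = 9) -> str:
    """Alphanumeric mode: pairs become 11 bits, a leftover character 6 bits."""
    values = []
    for ch in text:
        index = ALPHANUMERIC.find(ch)
        if index < 0:
            raise ValueError(f"{ch!r} is not in the alphanumeric table")
        values.append(index)
    parts = ["0010", format(len(values), f"0{count_width}b")]
    for start in range(0, len(values), 2):
        pair = values[start:start + 2]
        if len(pair) == 2:
            parts.append(format(pair[0] * 45 + pair[1], "011b"))
        else:
            parts.append(format(pair[0], "06b"))
    return " ".join(parts)


print(encode_alphanumeric("AC-42"))
```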
Terminator and padding
Assume that we have the string HELLO WORLD to encode. Following Example 2 above, we get the following encoding,
Note: the error correction codes of a QR code are produced with Reed-Solomon error correction. This algorithm is quite complex for me. It involves a lot of mathematics, such as polynomial division, mapping the numbers 1 to 255 to powers of 2 (2^n with 0 <= n <= 255), the Galois Field, and other formidable things, plus the formulas for correcting errors that are built on top of them. My math foundation is weak, so I cannot work it all out in a short time and I am still studying it; I will not cover it here. Sorry. (Of course, if a friend understands it well, I would welcome some advice.)
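For readers who want something concrete, here is a hedged Python sketch of how the error correction codewords are commonly computed: arithmetic in GF(256) with the primitive polynomial 0x11D, and polynomial division by a generator whose roots are 2^0 ... 2^(n-1). These parameters are my understanding of the QR spec, not something this article derives, and all function names are made up. The sample data is Block 1 of the data codewords table that appears later in this article.

```python
# Exponent and logarithm tables for GF(256), primitive polynomial 0x11D.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):
    EXP[power] = EXP[power - 255]  # duplicate so gf_mul needs no modulo


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements using the log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n: int) -> list:
    """(x + 2^0)(x + 2^1)...(x + 2^(n-1)), highest degree first."""
    gen = [1]
    for i in range(n):
        nxt = [0] * (len(gen) + 1)
        for j, coef in enumerate(gen):
            nxt[j] ^= coef                      # coef * x
            nxt[j + 1] ^= gf_mul(coef, EXP[i])  # coef * 2^i
        gen = nxt
    return gen


def rs_ec_codewords(data: list, n: int) -> list:
    """Remainder of data * x^n divided by the generator polynomial."""
    gen = generator_poly(n)
    rem = list(data) + [0] * n
    for i in range(len(data)):
        factor = rem[i]
        if factor:
            for j, g in enumerate(gen):
                rem[i + j] ^= gf_mul(g, factor)
    return rem[len(data):]


block1 = [67, 85, 70, 134, 87, 38, 85, 194, 119, 50, 6, 18, 6, 103, 38]
print(rs_ec_codewords(block1, 18))
```

If the assumptions hold, the printed list should match the Block 1 row of the error correction table later in this article. Decoding (actually correcting errors) is a separate and harder problem that this sketch does not touch.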
Final Encoding
Interleaved placement
If you think we can start drawing now, you are wrong. The QR code's scrambling is not finished yet: the codewords of the data and of the error correction code still have to be interleaved. The rules are as follows:
For the data codewords: take the first codeword of each block in block order, then the second codeword of each block, and so on. For example, the Data Codewords in the example above are as follows:
Block 1 | 67 | 85 | 70 | 134 | 87 | 38 | 85 | 194 | 119 | 50 | 6 | 18 | 6 | 103 | 38 | |
Block 2 | 246 | 246 | 66 | 7 | 118 | 134 | 242 | 7 | 38 | 86 | 22 | 198 | 199 | 146 | 6 | |
Block 3 | 182 | 230 | 247 | 119 | 50 | 7 | 118 | 134 | 87 | 38 | 82 | 6 | 134 | 151 | 50 | 7 |
Block 4 | 70 | 247 | 118 | 86 | 194 | 6 | 151 | 50 | 16 | 236 | 17 | 236 | 17 | 236 | 17 | 236 |
We first take the first column: 67, 246, 182, 70
Then append the second column: 67, 246, 182, 70, 85, 246, 230, 247
And so on: 67, 246, 182, 70, 85, 246, 230, 247, ......, 7, 236
The same is done for the error correction codewords:
Block 1 | 213 | 199 | 11 | 45 | 115 | 247 | 241 | 223 | 229 | 248 | 154 | 117 | 154 | 111 | 86 | 161 | 111 | 39 |
Block 2 | 87 | 204 | 96 | 60 | 202 | 182 | 124 | 157 | 200 | 134 | 27 | 129 | 209 | 17 | 163 | 163 | 120 | 133 |
Block 3 | 148 | 116 | 177 | 212 | 76 | 133 | 75 | 242 | 238 | 76 | 195 | 230 | 189 | 10 | 108 | 240 | 192 | 141 |
Block 4 | 235 | 159 | 5 | 173 | 24 | 147 | 59 | 33 | 106 | 40 | 255 | 172 | 82 | 2 | 131 | 32 | 178 | 236 |
In the same way as for the data codewords, we get: 213, 87, 148, 235, 199, 204, 116, 159, ......, 39, 133, 141, 236
Then, put the two groups together (the error correction code is placed after the data code) to get:
67, 246, 182, 70, 85, 246, 230, 247, 70, 66, 247, 118, 134, 7, 119, 86, 87, 118, 50, 194, 38, 134, 7, 6, 85, 242, 118, 151, 194, 7, 134, 50, 119, 38, 87, 16, 50, 86, 38, 236, 6, 22, 82, 17, 18, 198, 6, 236, 6, 199, 134, 17, 103, 146, 151, 236, 38, 6, 50, 17, 7, 236, 213, 87, 148, 235, 199, 204, 116, 159, 11, 96, 177, 5, 45, 60, 212, 173, 115, 202, 76, 24, 247, 182, 133, 147, 241, 124, 75, 59, 223, 157, 242, 33, 229, 200, 238, 106, 248, 134, 76, 40, 154, 27, 195, 255, 117, 129, 230, 172, 154, 209, 189, 82, 111, 17, 10, 2, 86, 163, 108, 131, 161, 163, 240, 32, 111, 120, 192, 178, 39, 133, 141, 236
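The interleaving rule is short enough to sketch in Python. The helper name interleave is made up; the block values are copied from the two tables above. Blocks of unequal length are handled by skipping the missing cells, which is why the data sequence ends with 7, 236.

```python
from itertools import zip_longest


def interleave(blocks: list) -> list:
    """Take codewords column by column across the blocks, skipping gaps."""
    result = []
    for column in zip_longest(*blocks):
        result.extend(v for v in column if v is not None)
    return result


data_blocks = [
    [67, 85, 70, 134, 87, 38, 85, 194, 119, 50, 6, 18, 6, 103, 38],
    [246, 246, 66, 7, 118, 134, 242, 7, 38, 86, 22, 198, 199, 146, 6],
    [182, 230, 247, 119, 50, 7, 118, 134, 87, 38, 82, 6, 134, 151, 50, 7],
    [70, 247, 118, 86, 194, 6, 151, 50, 16, 236, 17, 236, 17, 236, 17, 236],
]
ec_blocks = [
    [213, 199, 11, 45, 115, 247, 241, 223, 229, 248, 154, 117, 154, 111, 86, 161, 111, 39],
    [87, 204, 96, 60, 202, 182, 124, 157, 200, 134, 27, 129, 209, 17, 163, 163, 120, 133],
    [148, 116, 177, 212, 76, 133, 75, 242, 238, 76, 195, 230, 189, 10, 108, 240, 192, 141],
    [235, 159, 5, 173, 24, 147, 59, 33, 106, 40, 255, 172, 82, 2, 131, 32, 178, 236],
]

# Error correction codewords go after the data codewords.
final = interleave(data_blocks) + interleave(ec_blocks)
print(final)
```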
Remainder Bits
Finally, add the Remainder Bits. For some Versions the bit string is still not long enough, so Remainder Bits must be appended. For example, a Version 5-Q QR code needs 7 Remainder Bits, all of them zero. For the number of Remainder Bits each Version requires, see Table 1 on page 15 of the QR Code spec.
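As a rough sketch, the Remainder Bits can be handled with a small lookup. The per-Version values below are my reading of Table 1 in the spec and should be verified there; only the Version 5 value (7 bits) is confirmed by this article. The function names are made up.

```python
def remainder_bits(version: int) -> int:
    """Number of zero bits appended after the final codewords.

    Assumption: values transcribed from Table 1 of the spec; verify before use.
    """
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    if 2 <= version <= 6:
        return 7
    if 14 <= version <= 20 or 28 <= version <= 34:
        return 3
    if 21 <= version <= 27:
        return 4
    return 0


def to_bit_string(codewords: list, version: int) -> str:
    """Codewords as 8-bit groups, followed by the Remainder Bits."""
    bits = "".join(format(c, "08b") for c in codewords)
    return bits + "0" * remainder_bits(version)


print(remainder_bits(5))
```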
Draw a QR code Diagram
Position Detection Pattern
First, draw the Position Detection Pattern on three corners.
Alignment Pattern
Then, draw the Alignment pattern.
For the positions of the Alignment Patterns, see Table E.1 in the QR Code spec (the table below is incomplete).
Take Version 8 in the table above as an example; its coordinates are (6, 24, 42):
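To turn the coordinate list into actual pattern centers, one plausible approach is to take every row/column combination and drop the three that would land on the Position Detection Patterns. The Python sketch below assumes exactly that rule; the function name alignment_centers is made up.

```python
def alignment_centers(coords: list) -> list:
    """All (row, column) centers except those overlapping the three corners."""
    first, last = coords[0], coords[-1]
    # Top-left, top-right, and bottom-left are taken by Position Detection Patterns.
    occupied = {(first, first), (first, last), (last, first)}
    return [(r, c) for r in coords for c in coords if (r, c) not in occupied]


for center in alignment_centers([6, 24, 42]):
    print(center)
```

For Version 8 this leaves six centers out of the nine combinations.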
Timing Pattern
Next come the Timing Pattern lines (these need no explanation).
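The drawing steps so far can be sketched on a plain list-of-lists matrix. In this hedged Python sketch, 1 is a dark module, 0 is a light module, and None means "not yet assigned". The function name is made up, Alignment Patterns and the Dark Module are left out to keep it short, and the Timing Patterns are assumed to run along row 6 and column 6 with dark modules at even positions.

```python
def draw_function_patterns(version: int) -> list:
    """Place Position Detection Patterns, Separators, and Timing Patterns."""
    size = (version - 1) * 4 + 21
    matrix = [[None] * size for _ in range(size)]

    def finder(top: int, left: int) -> None:
        # Covers the 7x7 pattern plus its one-module white Separator.
        for r in range(-1, 8):
            for c in range(-1, 8):
                row, col = top + r, left + c
                if not (0 <= row < size and 0 <= col < size):
                    continue
                inside = 0 <= r <= 6 and 0 <= c <= 6
                on_ring = r in (0, 6) or c in (0, 6)
                in_core = 2 <= r <= 4 and 2 <= c <= 4
                matrix[row][col] = 1 if inside and (on_ring or in_core) else 0

    finder(0, 0)
    finder(0, size - 7)
    finder(size - 7, 0)

    # Timing Patterns: alternating modules between the Separators.
    for i in range(8, size - 8):
        matrix[6][i] = matrix[i][6] = 1 if i % 2 == 0 else 0
    return matrix


for row in draw_function_patterns(1):
    print("".join("#" if m == 1 else "." if m == 0 else " " for m in row))
```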
Format Information
Next is the Format Information, the blue part in the figure.
Format Information is 15 bits long. The position of each bit is shown in the figure below. (Note the Dark Module in the figure, which is always present.)
These 15 bits include:
- 5 data bits: 2 bits for the Error Correction Level used and 3 bits for the Mask used.
- 10 error correction bits, calculated with a BCH code.
These 15 bits are then XORed with 101010000010010. This ensures that choosing Error Correction Level 00 and Mask 000 does not produce 15 all-white (all-zero) modules, which would make the image harder for scanners to recognize.
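Here is a hedged Python sketch of the Format Information calculation. The XOR constant comes from the text above; the BCH generator polynomial 10100110111 and the level codes (L = 01, M = 00, Q = 11, H = 10) are my reading of the spec and should be checked against its tables. The names are made up for illustration.

```python
# Assumption: 2-bit codes for the Error Correction Levels, per the spec.
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}

FORMAT_GENERATOR = 0b10100110111   # assumed BCH(15,5) generator polynomial
FORMAT_XOR_MASK = 0b101010000010010


def format_information(ec_level: str, mask_id: int) -> str:
    """15-bit Format Information: 5 data bits + 10 BCH bits, then XOR."""
    if not 0 <= mask_id <= 7:
        raise ValueError("mask_id must be between 0 and 7")
    data = (EC_LEVEL_BITS[ec_level] << 3) | mask_id
    remainder = data << 10
    # Binary polynomial division: cancel set bits from bit 14 down to bit 10.
    for shift in range(4, -1, -1):
        if remainder & (1 << (shift + 10)):
            remainder ^= FORMAT_GENERATOR << shift
    bits = ((data << 10) | remainder) ^ FORMAT_XOR_MASK
    return format(bits, "015b")


print(format_information("M", 0))  # data bits are all zero, so only the XOR mask remains
print(format_information("L", 0))
```

The first call shows why the XOR step exists: without it, level 00 with Mask 000 would give fifteen zero bits.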
The following is an example:
The Error Correction Level is shown in the following table:
The Mask pattern is shown in Table 23.
Version Information
Next is the Version Information (required from Version 7 onward), the blue part in the figure.
Version Information is 18 bits long: 6 bits for the version number and 12 bits of error correction code. The following is an example:
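The 18 bits can be computed in the same style as the Format Information. In this hedged Python sketch the 12 error correction bits are the remainder of a BCH division by the generator 1111100100101, which is my reading of the spec rather than something stated in this article; the function name is made up.

```python
VERSION_GENERATOR = 0b1111100100101  # assumed BCH(18,6) generator polynomial


def version_information(version: int) -> str:
    """18-bit Version Information: 6 version bits + 12 BCH bits."""
    if not 7 <= version <= 40:
        raise ValueError("Version Information exists only for Versions 7-40")
    remainder = version << 12
    # Binary polynomial division: cancel set bits from bit 17 down to bit 12.
    for shift in range(5, -1, -1):
        if remainder & (1 << (shift + 12)):
            remainder ^= VERSION_GENERATOR << shift
    return format((version << 12) | remainder, "018b")


print(version_information(7))
```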
The filling position is as follows:
Data and Data Error Correction Codes
Then fill in our final encoding. Starting from the lower right corner, place the bits one by one along the red line in the figure: 1 is black, 0 is white. Whenever one of the non-data areas described above is encountered, go around it or skip over it.
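The filling path can be sketched as a generator of coordinates. This Python sketch assumes the usual zigzag: two-module-wide columns, starting at the bottom right, alternating upward and downward, and stepping over the vertical Timing Pattern in column 6. The function name placement_order and the is_function callback (which should return True for any module already used by a non-data area) are made up for illustration.

```python
def placement_order(size: int, is_function) -> list:
    """(row, column) positions in the order the data bits are placed."""
    order = []
    col = size - 1
    upward = True
    while col > 0:
        if col == 6:  # the vertical Timing Pattern column is stepped over
            col -= 1
        rows = range(size - 1, -1, -1) if upward else range(size)
        for row in rows:
            for c in (col, col - 1):  # right module first, then left
                if not is_function(row, c):
                    order.append((row, c))
        upward = not upward
        col -= 2
    return order


# With nothing reserved, a Version 1 grid yields every module except column 6.
path = placement_order(21, lambda row, col: False)
print(path[:6])
```

In a real encoder, is_function would consult the matrix that already holds the positioning patterns, Format Information, and Version Information.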
Mask Pattern
At this point the figure is filled in. However, the dark and light modules may not be evenly distributed, so we still need to apply a mask (this is not complicated). According to the spec, there are eight masks to choose from for a QR code, as shown in the following figure, along with the formula for each. A mask, to put it bluntly, is XORed with the image generated above. The mask is XORed only with the data area and does not affect the function areas.
The Mask IDs and formulas are as follows (i and j are the row and column of a module, respectively):
The following shows the effect of some of the Masks. We can see that after XORing with a Mask the data becomes more scattered.
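Since the figure with the formulas may not be visible, here is a hedged Python sketch of the eight mask conditions as I understand them from the spec, with i taken as the row and j as the column. A module is flipped when the condition for the chosen Mask is true and the module belongs to the data area. The names MASK_CONDITIONS, apply_mask, and the is_data callback are made up for illustration.

```python
# Index in this list = Mask ID (000 to 111); i is the row, j is the column.
MASK_CONDITIONS = [
    lambda i, j: (i + j) % 2 == 0,
    lambda i, j: i % 2 == 0,
    lambda i, j: j % 3 == 0,
    lambda i, j: (i + j) % 3 == 0,
    lambda i, j: (i // 2 + j // 3) % 2 == 0,
    lambda i, j: (i * j) % 2 + (i * j) % 3 == 0,
    lambda i, j: ((i * j) % 2 + (i * j) % 3) % 2 == 0,
    lambda i, j: ((i + j) % 2 + (i * j) % 3) % 2 == 0,
]


def apply_mask(matrix: list, is_data, mask_id: int) -> list:
    """Return a copy of the matrix with the mask XORed into the data area."""
    condition = MASK_CONDITIONS[mask_id]
    return [
        [bit ^ 1 if is_data(i, j) and condition(i, j) else bit
         for j, bit in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


blank = [[0] * 6 for _ in range(6)]
for row in apply_mask(blank, lambda i, j: True, 0):
    print("".join("#" if bit else "." for bit in row))
```

Because the operation is an XOR, applying the same mask twice restores the original matrix, which is how a scanner removes it.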
The QR code after the Mask is the final figure.
Now you can try writing a QR code program yourself. Of course, you can find a Reed-Solomon error correction library on the Internet, or read other people's source code to see how they implement this complicated encoding.