The details and principles of generating two-dimensional codes
QR Code, short for Quick Response Code, is an encoding method that has become hugely popular on mobile devices in recent years. It can store more information than a traditional barcode and can also represent more data types: digits, letters, Japanese, Chinese and so on. I spent the last couple of days learning the details of how a QR Code image is generated. It struck me as something like a cipher algorithm, so I am writing this article to lay it bare, for studious people to learn together.
For the QR Code specification, refer to this PDF: http://raidenii.net/files/datasheets/misc/qr_code.pdf
Basic knowledge
First of all, QR codes come in 40 sizes. The official name for a size is a Version. Version 1 is a 21 x 21 matrix, Version 2 is a 25 x 25 matrix, Version 3 is 29 x 29, and each step up in version adds 4 to the side length. The formula is (V-1)*4 + 21, where V is the version number. The highest version is 40, and (40-1)*4+21 = 177, so the largest code is a 177 x 177 square.
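The size formula is easy to check in a few lines of Python. This is only a small sketch; the function name qr_side_length is my own and does not come from the spec.

```python
def qr_side_length(version: int) -> int:
    """Side length of a QR code for a given version (1 to 40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR Code versions run from 1 to 40")
    # Version 1 is 21 wide, and every further version adds 4.
    return (version - 1) * 4 + 21


for v in (1, 2, 3, 40):
    print(f"Version {v}: {qr_side_length(v)} x {qr_side_length(v)}")
```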
Let's look at a sample two-dimensional code:
Positioning pattern
The Position Detection Pattern is a positioning pattern used to mark the rectangular extent of the two-dimensional code. The three positioning patterns have white borders called Separators for Position Detection Patterns. There are three rather than four because three are enough to identify a rectangle.
Timing Patterns are also used for positioning. The two-dimensional code comes in 40 sizes, and once the size gets large a reference line is needed, otherwise the scan may drift off at an angle.
Alignment Patterns are needed only in QR codes of Version 2 and above (including Version 2), and are likewise used for positioning.
Functional data
Format Information is present in every size and stores some formatting data.
Version Information is present in Version 7 and above, where two 3 x 6 areas must be reserved to store version information.
Data code and error correction code
Data encoding
Let's talk about data encoding first. The QR code supports the following encodings:
Numeric mode encodes the digits 0 to 9. Every 3 digits are encoded as 10 bits; if the number of digits is not a multiple of 3, the last remaining 1 or 2 digits are converted to 4 or 7 bits. The digit count itself is written in 10, 12, or 14 bits, depending on the size of the two-dimensional code (Table 3 below illustrates this).
Alphanumeric mode is character encoding. It covers 0-9, uppercase A to Z (no lowercase), the symbols $ % * + - . / : and the space. These characters are mapped into a character index table, as shown below (SP is the space, Char is the character, Value is its index value). To encode, group the characters in pairs, treat each pair as a base-45 number using the index table (first value * 45 + second value), and write it as 11 bits of binary; if a single character is left over, write it as 6 bits. The character count is written in 9, 11, or 13 bits depending on the version (Table 3 below).
Byte mode is byte encoding and can hold the 0-255 ISO-8859-1 characters. Some QR code scanners can automatically detect whether the bytes are UTF-8 encoded.
Kanji mode is the Japanese encoding, a double-byte encoding. It can be used for Chinese in the same way. Japanese and Chinese characters are encoded by subtracting a value: 0x8140 is subtracted from characters in the range 0x8140 to 0x9FFC, and 0xC140 is subtracted from characters in the range 0xE040 to 0xEBBF. The first two hexadecimal digits (the high byte) of the result are then multiplied by 0xC0, the last two hexadecimal digits (the low byte) are added, and the sum is written as 13 bits. As an example:
Extended Channel Interpretation (ECI) mode is used mainly for special character sets. Not all scanners support this encoding.
Structured Append mode is used to split data across several QR codes, so that one message is carried by multiple symbols that are read together.
FNC1 mode is used mainly in certain special trades or industries, for example GS1 barcodes.
For simplicity, the last three modes are not discussed in this article.
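The Kanji mode arithmetic described above can be sketched in Python as follows. The function name kanji_13bit is my own, the input is assumed to be the two-byte Shift JIS value of one character, and the two sample values 0x935F and 0xE4AA are the ones I remember from the spec's worked example, so treat them as an assumption.

```python
def kanji_13bit(sjis: int) -> str:
    """Encode one double-byte Shift JIS value as a 13-bit string."""
    if 0x8140 <= sjis <= 0x9FFC:
        value = sjis - 0x8140
    elif 0xE040 <= sjis <= 0xEBBF:
        value = sjis - 0xC140
    else:
        raise ValueError("value is outside the two Kanji mode ranges")
    high, low = value >> 8, value & 0xFF   # high byte and low byte
    return format(high * 0xC0 + low, "013b")


print(kanji_13bit(0x935F))  # 0110110011111
print(kanji_13bit(0xE4AA))  # 1101010101010
```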
In the following two tables,
Table 2 gives the mode indicator (the "number") of each encoding mode, which is written at the very front of the encoded data, as the examples below show. Note: the indicator for Chinese is 1101.
Table 3 gives, for the different versions (sizes) of the QR Code, how many bits the character count takes in Numeric, Alphanumeric, Byte, and Kanji mode. (The two-dimensional code specification contains many such tables, and more of them will be mentioned later.)
Let's look at a few examples.
Example one: Numeric encoding
With Version 1 and error correction level H, encode: 01234567
1. Split the digits into groups of three: 012 345 67
2. Convert each group to binary: 012 becomes 0000001100, 345 becomes 0101011001, 67 becomes 1000011.
3. String the three binary groups together: 0000001100 0101011001 1000011
4. Convert the digit count to binary (10 bits for Version 1-H): there are 8 digits, and 8 in binary is 0000001000
5. Put the numeric mode indicator 0001 and the result of step 4 at the front: 0001 0000001000 0000001100 0101011001 1000011
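The five steps of Example one can be sketched in Python like this. The function name encode_numeric and the count_bits parameter are my own; count_bits defaults to the 10 bits used for Version 1, and the spaces in the output are only there to keep the groups readable.

```python
def encode_numeric(digits: str, count_bits: int = 10) -> str:
    """Numeric mode: mode indicator, digit count, then groups of 3 digits."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric mode only accepts the digits 0-9")
    parts = ["0001", format(len(digits), f"0{count_bits}b")]
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        # 3 digits take 10 bits, 2 leftover digits 7 bits, 1 leftover digit 4 bits.
        width = {3: 10, 2: 7, 1: 4}[len(group)]
        parts.append(format(int(group), f"0{width}b"))
    return " ".join(parts)


print(encode_numeric("01234567"))
# 0001 0000001000 0000001100 0101011001 1000011
```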
Example two: Alphanumeric encoding
With Version 1 and error correction level H, encode: AC-42
1. Look up the indices of the five characters of AC-42 in the character index table: (10,12,41,4,2)
2. Group them in pairs: (10,12) (41,4) (2)
3. Convert each pair to 11-bit binary (6 bits for a leftover single value):
(10,12): 10*45+12 = 462, which becomes 00111001110
(41,4): 41*45+4 = 1849, which becomes 11100111001
(2): 2, which becomes 000010
4. Join these binary groups together: 00111001110 11100111001 000010
5. Convert the character count to binary (9 bits for Version 1-H): there are 5 characters, and 5 becomes 000000101
6. Put the alphanumeric mode indicator 0010 and the result of step 5 at the front: 0010 000000101 00111001110 11100111001 000010
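Example two can be sketched the same way. The ALPHANUMERIC string below is my own way of writing down the 45-entry character index table (a character's position in the string is its index value), and encode_alphanumeric is a name I made up; count_bits defaults to the 9 bits used for Version 1.

```python
# Position in this string = index value in the character index table.
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str, count_bits: int = 9) -> str:
    """Alphanumeric mode: mode indicator, character count, then pairs in base 45."""
    values = []
    for ch in text:
        if ch not in ALPHANUMERIC:
            raise ValueError(f"{ch!r} is not in the alphanumeric table")
        values.append(ALPHANUMERIC.index(ch))
    parts = ["0010", format(len(text), f"0{count_bits}b")]
    for i in range(0, len(values), 2):
        pair = values[i:i + 2]
        if len(pair) == 2:
            parts.append(format(pair[0] * 45 + pair[1], "011b"))
        else:
            parts.append(format(pair[0], "06b"))  # leftover single character
    return " ".join(parts)


print(encode_alphanumeric("AC-42"))
# 0010 000000101 00111001110 11100111001 000010
```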
Terminator and padding
Suppose we have the string HELLO WORLD to encode. Following Example two above, we get the following code:
Mode indicator: 0010
Character count: 000001011
Encoding of HELLO WORLD: 01100001011 01111000110 10001011100 10110111000 10011010100 001101
We also have to add the terminator:
Mode indicator: 0010
Character count: 000001011
Encoding of HELLO WORLD: 01100001011 01111000110 10001011100 10110111000 10011010100 001101
Terminator: 0000
Regrouping into 8 bits
If the total length is not a multiple of 8, we append enough 0s at the end. For example, there are 78 bits above, so we append two 0s and then split everything into groups of 8 bits:
00100000 01011011 00001011 01111000 11010001 01110010 11011100 01001101 01000011 01000000
Padding Bytes
Finally, if we still have not reached the maximum number of bits, we add Padding Bytes, which simply repeat the following two bytes: 11101100 00010001 (in decimal these are 236 and 17; I don't know why these values, it is just what the spec says). For the maximum number of bits for each version at each error correction level, refer to Table 7 on pages 28 to 32 of the QR Code spec.
Suppose we are encoding for Version 1 at error correction level Q. That needs 104 bits, and we have only 80 bits above, so we are 24 bits short and need 3 Padding Bytes. Adding three gives the following code:
00100000 01011011 00001011 01111000 11010001 01110010 11011100 01001101 01000011 01000000 11101100 00010001 11101100
The code above is the data code, called Data Codewords; each group of 8 bits is one codeword. We still have to add error correction information to these data codewords.
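The terminator, the 8-bit regrouping and the Padding Bytes can be sketched together in Python. The function name to_data_codewords is my own, and the capacity (104 bits for Version 1-Q) has to be looked up in the spec table and passed in by hand.

```python
def to_data_codewords(bits: str, capacity_bits: int) -> list:
    """Add terminator, pad to whole bytes, then fill with 236/17 padding bytes."""
    bits = bits.replace(" ", "")
    if len(bits) > capacity_bits:
        raise ValueError("data does not fit in this version and level")
    # Terminator: up to four 0 bits, fewer if the capacity is nearly full.
    bits += "0" * min(4, capacity_bits - len(bits))
    # Append 0s until the length is a multiple of 8.
    bits += "0" * (-len(bits) % 8)
    codewords = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    # Alternate the two padding bytes 11101100 and 00010001.
    padding = (0b11101100, 0b00010001)
    n = 0
    while len(codewords) * 8 < capacity_bits:
        codewords.append(padding[n % 2])
        n += 1
    return codewords


hello = ("0010 000001011 01100001011 01111000110 "
         "10001011100 10110111000 10011010100 001101")
print(to_data_codewords(hello, 104))
# [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236]
```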
Error correction code
We mentioned error correction levels above. A QR code has four Error Correction Code Levels. This is why a two-dimensional code can still be scanned when it is damaged, and also why some people can put an icon in the center of a QR code.
Error correction capacity
L level: 7% of the codewords can be corrected
M level: 15% of the codewords can be corrected
Q level: 25% of the codewords can be corrected
H level: 30% of the codewords can be corrected
So how does a QR code add error correction to the data codewords? First we have to split the data codewords into groups, that is, into different Blocks, and then compute the error correction code for each block. For how to split them, see the definition tables Table 13 to Table 22 on pages 33 to 44 of the QR Code spec. Note the last two columns:
Number of Error Correction Blocks: how many blocks are needed.
Error Correction Code Per Block: the number of codewords in each block, where a codeword is one 8-bit byte.
For example, Version 5 with error correction level Q needs 4 blocks (two groups of 2 blocks each). Each of the two blocks in the first group holds 15 data codewords, each of the two blocks in the second group holds 16, and every block gets 18 error correction codewords, enough to correct up to 9 codewords. (Note: a codeword in the table is one 8-bit byte.) (Note again: the entries in the last column have the form (c, k, r), where c is the total number of codewords in the block, k is the number of data codewords, and r is the error correction capacity. They satisfy c = k + 2 * r because, as the footnote explains, the error correction capacity is at most half the number of error correction codewords.)
Here is an example for 5-Q. (Writing it in binary would make the table too large, so I use decimal. We can see that each block has 18 error correction codewords, that is, 18 8-bit binary numbers.)
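Group 1, Block 1
Data: 67 85 70 134 87 38 85 194 119 50 6 18 6 103 38
Error correction code: 213 199 11 45 115 247 241 223 229 248 154 117 154 111 86 161 111 39
Group 1, Block 2
Data: 246 246 66 7 118 134 242 7 38 86 22 198 199 146 6
Error correction code: 87 204 96 60 202 182 124 157 200 134 27 129 209 17 163 163 120 133
Group 2, Block 1
Data: 182 230 247 119 50 7 118 134 87 38 82 6 134 151 50 7
Error correction code: 148 116 177 212 76 133 75 242 238 76 195 230 189 10 108 240 192 141
Group 2, Block 2
Data: 70 247 118 86 194 6 151 50 16 236 17 236 17 236 17 236
Error correction code: 235 159 5 173 24 147 59 33 106 40 255 172 82 2 131 32 178 236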
Note: the error correction code of the two-dimensional code is produced mainly by Reed-Solomon error correction. For me this algorithm is quite complex. It involves a lot of mathematics, such as polynomial long division, mapping the numbers 1-255 to powers of 2 (2^n with 0<=n<=255) in a Galois field, and other arcane things, plus the error correction formulas built on top of them. My math background is poor, so this is too complex for me; I only half understand it and am still studying it, so I will not go into these things here. Please forgive me. (Of course, if a friend understands it well, please teach me.)
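For readers who want to experiment anyway, here is a minimal Python sketch of the polynomial division mentioned in the note. It rests on assumptions that are not stated in this article: the Galois field GF(256) is built from the polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1), which as far as I know is the one the QR spec uses, and the generator polynomial is the product of (x - 2^i) for i from 0 up to the number of error correction codewords minus 1. All names are my own, and this only generates the error correction codewords; it does not decode or repair anything.

```python
# Power and logarithm tables for GF(256), assuming the polynomial 0x11D.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):
    EXP[power] = EXP[power - 255]   # repeat so sums of two logs need no modulo


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n: int) -> list:
    """Coefficients (highest degree first) of (x - 2^0)(x - 2^1)...(x - 2^(n-1))."""
    gen = [1]
    for i in range(n):
        nxt = [0] * (len(gen) + 1)
        for j, coef in enumerate(gen):
            nxt[j] ^= coef                      # coef * x
            nxt[j + 1] ^= gf_mul(coef, EXP[i])  # coef * 2^i (minus is XOR here)
        gen = nxt
    return gen


def rs_error_correction(data: list, n: int) -> list:
    """Remainder of data(x) * x^n divided by the generator polynomial."""
    gen = generator_poly(n)
    rem = list(data) + [0] * n
    for i in range(len(data)):
        factor = rem[i]
        if factor:
            for j, g in enumerate(gen):
                rem[i + j] ^= gf_mul(g, factor)
    return rem[len(data):]


block1 = [67, 85, 70, 134, 87, 38, 85, 194, 119, 50, 6, 18, 6, 103, 38]
print(rs_error_correction(block1, 18))
```

A simple check is to feed in the data codewords of Group 1, Block 1 from the 5-Q example, as done in the last two lines, and compare the 18 numbers printed with the error correction codewords listed for that block.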
Final encoding: interleaving
If you think we can start drawing now, you are wrong. The QR code's scrambling is not over yet: the data codewords and the error correction codewords also have to be interleaved. The rules are as follows.
For the data codewords: take the first codeword of every block, in block order, then the second codeword of every block, and so on. For example, the data codewords in the example above are:
Block 1: 67 85 70 134 87 38 85 194 119 50 6 18 6 103 38
Block 2: 246 246 66 7 118 134 242 7 38 86 22 198 199 146 6
Block 3: 182 230 247 119 50 7 118 134 87 38 82 6 134 151 50 7
Block 4: 70 247 118 86 194 6 151 50 16 236 17 236 17 236 17 236
First we take the first column: 67, 246, 182, 70
Then the second column: 67, 246, 182, 70, 85, 246, 230, 247
And so on: 67, 246, 182, 70, 85, 246, 230, 247, ..., 38, 6, 50, 17, 7, 236
The same is done for the error correction codewords:
Block 1: 213 199 11 45 115 247 241 223 229 248 154 117 154 111 86 161 111 39
Block 2: 87 204 96 60 202 182 124 157 200 134 27 129 209 17 163 163 120 133
Block 3: 148 116 177 212 76 133 75 242 238 76 195 230 189 10 108 240 192 141
Block 4: 235 159 5 173 24 147 59 33 106 40 255 172 82 2 131 32 178 236
As with the data codewords, we get: 213, 87, 148, 235, 199, 204, 116, 159, ..., 39, 133, 141, 236
Then we put the two together (error correction codewords after the data codewords) to get:
67, 246, 182, 70, 85, 246, 230, 247, 70, 66, 247, 118, 134, 7, 119, 86, 87, 118, 50, 194, 38, 134, 7, 6, 85, 242, 118, 151, 194, 7, 134, 50, 119, 38, 87, 16, 50, 86, 38, 236, 6, 22, 82, 17, 18, 198, 6, 236, 6, 199, 134, 17, 103, 146, 151, 236, 38, 6, 50, 17, 7, 236, 213, 87, 148, 235, 199, 204, 116, 159, 11, 96, 177, 5, 45, 60, 212, 173, 115, 202, 76, 24, 247, 182, 133, 147, 241, 124, 75, 59, 223, 157, 242, 33, 229, 200, 238, 106, 248, 134, 76, 40, 154, 27, 195, 255, 117, 129, 230, 172, 154, 209, 189, 82, 111, 17, 10, 2, 86, 163, 108, 131, 161, 163, 240, 32, 111, 120, 192, 178, 39, 133, 141, 236
This is our data area.
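The column-by-column interleaving can be sketched in Python with itertools.zip_longest, which pads the shorter blocks with None so that the 16th column only yields values from Block 3 and Block 4. The names interleave, DATA_BLOCKS and EC_BLOCKS are my own; the numbers are the 5-Q blocks from this example.

```python
from itertools import zip_longest

DATA_BLOCKS = [
    [67, 85, 70, 134, 87, 38, 85, 194, 119, 50, 6, 18, 6, 103, 38],
    [246, 246, 66, 7, 118, 134, 242, 7, 38, 86, 22, 198, 199, 146, 6],
    [182, 230, 247, 119, 50, 7, 118, 134, 87, 38, 82, 6, 134, 151, 50, 7],
    [70, 247, 118, 86, 194, 6, 151, 50, 16, 236, 17, 236, 17, 236, 17, 236],
]
EC_BLOCKS = [
    [213, 199, 11, 45, 115, 247, 241, 223, 229, 248, 154, 117, 154, 111, 86, 161, 111, 39],
    [87, 204, 96, 60, 202, 182, 124, 157, 200, 134, 27, 129, 209, 17, 163, 163, 120, 133],
    [148, 116, 177, 212, 76, 133, 75, 242, 238, 76, 195, 230, 189, 10, 108, 240, 192, 141],
    [235, 159, 5, 173, 24, 147, 59, 33, 106, 40, 255, 172, 82, 2, 131, 32, 178, 236],
]


def interleave(blocks: list) -> list:
    """Take the 1st codeword of every block, then the 2nd of every block, ..."""
    result = []
    for column in zip_longest(*blocks):
        # Shorter blocks contribute None once they run out; skip those.
        result.extend(value for value in column if value is not None)
    return result


# Error correction codewords go after the data codewords.
message = interleave(DATA_BLOCKS) + interleave(EC_BLOCKS)
print(len(message))   # 134
print(message[:8])    # [67, 246, 182, 70, 85, 246, 230, 247]
```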
Remainder Bits
Finally we add Remainder Bits. For some versions of the QR code the sequence above is still not long enough, and Remainder Bits must be appended. For example, the 5-Q two-dimensional code above needs 7 more bits. Remainder Bits are simply 0s. For the number of Remainder Bits each version requires, refer to the definition table Table 1 on page 15 of the QR Code spec.
Drawing the two-dimensional code
Position Detection Pattern
First, the Position Detection Patterns are drawn in three corners. (This pattern is the same size regardless of version.)
Alignment Pattern
Then the Alignment Patterns are drawn. (This pattern is also the same size regardless of version.)
For the positions of the Alignment Patterns, see the definition table Table E.1 on page 81 of the QR Code spec (the table below is an excerpt).
The figure below is the Version 8 example from that table (6, 24, 42).
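As I understand the table, the numbers (6, 24, 42) are used as both row and column coordinates of the pattern centers, and the three combinations that would land on a Position Detection Pattern are left out. A small Python sketch under that assumption (the function name alignment_centers is my own):

```python
def alignment_centers(coords: list) -> list:
    """All (row, column) centers built from coords, minus the three corners
    already occupied by Position Detection Patterns."""
    first, last = coords[0], coords[-1]
    occupied = {(first, first), (first, last), (last, first)}
    return [(r, c) for r in coords for c in coords if (r, c) not in occupied]


print(alignment_centers([6, 24, 42]))
# [(6, 24), (24, 6), (24, 24), (24, 42), (42, 24), (42, 42)]
```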
Timing Pattern
Next come the Timing Pattern lines (these need no explanation).
Format Information
Then comes the Format Information, the blue part in the figure below.
Format Information is 15 bits long, and the position of each bit is shown in the figure below. (Note the Dark Module in the figure, which is always present.)
These 15 bits consist of:
5 data bits: 2 bits indicate which Error Correction Level is used, and 3 bits indicate which Mask is used
10 error correction bits, computed mainly with a BCH code
The 15 bits are then XORed with 101010000010010. This guarantees that the result is not all white when error correction level 00 is used together with mask 000, which would make image recognition harder for the scanner.
Here is an example:
The Error Correction Level codes are shown in the following table:
The Mask Pattern codes are shown in Table 23 below.
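The 15 Format Information bits can be sketched in Python as below. Two things here are not stated in this article and come from my reading of the spec, so treat them as assumptions: the BCH generator polynomial 10100110111, and the 2-bit level codes L=01, M=00, Q=11, H=10. The names EC_LEVEL_BITS and format_information are my own.

```python
# Assumed 2-bit codes for the error correction levels (from the spec table).
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}

FORMAT_GENERATOR = 0b10100110111      # assumed BCH generator polynomial
FORMAT_XOR_MASK = 0b101010000010010   # the XOR pattern given in the text


def format_information(level: str, mask: int) -> str:
    """Return the 15 format bits for an error correction level and mask 0-7."""
    if not 0 <= mask <= 7:
        raise ValueError("mask must be between 0 and 7")
    data = (EC_LEVEL_BITS[level] << 3) | mask   # 5 data bits
    remainder = data << 10
    # Polynomial long division over GF(2): cancel bits 14 down to 10.
    for bit in range(14, 9, -1):
        if remainder & (1 << bit):
            remainder ^= FORMAT_GENERATOR << (bit - 10)
    return format(((data << 10) | remainder) ^ FORMAT_XOR_MASK, "015b")


print(format_information("M", 0b101))  # 100000011001110
print(format_information("M", 0b000))  # 101010000010010, not all zeros
```

The second call shows the point of the XOR: level 00 with mask 000 would otherwise give fifteen 0 bits.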
Version Information
Next comes the Version Information, again the blue part in the figure below (needed only for Version 7 and above).
Version Information is 18 bits in total: 6 bits of version number and 12 bits of error correction code. Here is an example:
And its fill positions are as follows:
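The 18 Version Information bits can be sketched the same way as the Format Information. The generator polynomial 1111100100101 is not given in this article; it is what I remember from the spec, so treat it as an assumption. The function name version_information is my own.

```python
VERSION_GENERATOR = 0b1111100100101   # assumed generator polynomial


def version_information(version: int) -> str:
    """Return the 18 version bits: 6 bits of version + 12 error correction bits."""
    if not 7 <= version <= 40:
        raise ValueError("version information exists only for versions 7 to 40")
    remainder = version << 12
    # Polynomial long division over GF(2): cancel bits 17 down to 12.
    for bit in range(17, 11, -1):
        if remainder & (1 << bit):
            remainder ^= VERSION_GENERATOR << (bit - 12)
    return format((version << 12) | remainder, "018b")


print(version_information(7))  # 000111110010010100
```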
Data and data error correcting codes
Then we fill in our final code. It is placed as follows: starting from the bottom right corner, we fill in our bits one by one, with 1 as black and 0 as white. Whenever one of the non-data areas from the steps above is encountered, it is bypassed or skipped.
Mask pattern
With that, our picture is filled in, but the dots may not be evenly distributed. Large blank areas or large black blocks would make scanning and recognition difficult. So we still have to do a masking operation (as if things were not complicated enough). The QR spec says there are 8 masks to choose from, shown below, with the formula of each mask under its diagram. Put plainly, masking means XORing the mask with the picture generated above. The mask is XORed only with the data area and does not affect the function patterns. (Note: choosing a suitable mask is also done by an algorithm.)
The identification codes of the masks are as follows (where i, j correspond to x, y respectively):
Below is what the result looks like after masking. We can see that after being XORed with certain masks the data becomes more fragmented.
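The masking step can be sketched in Python as follows. The eight conditions are written from my memory of the mask table in the spec, where i is the row and j is the column of a module with (0, 0) at the top left, so check them against the table before relying on them. The names MASK_CONDITIONS and apply_mask are my own, and is_function is an assumed helper grid marking the modules that are not data.

```python
# Condition for each mask id 0-7: a module is flipped where the condition holds.
MASK_CONDITIONS = [
    lambda i, j: (i + j) % 2 == 0,
    lambda i, j: i % 2 == 0,
    lambda i, j: j % 3 == 0,
    lambda i, j: (i + j) % 3 == 0,
    lambda i, j: (i // 2 + j // 3) % 2 == 0,
    lambda i, j: (i * j) % 2 + (i * j) % 3 == 0,
    lambda i, j: ((i * j) % 2 + (i * j) % 3) % 2 == 0,
    lambda i, j: ((i + j) % 2 + (i * j) % 3) % 2 == 0,
]


def apply_mask(modules: list, is_function: list, mask_id: int) -> list:
    """XOR the data modules with the chosen mask; function modules stay as-is.

    modules is a grid of 0/1 values, is_function a grid of booleans of the
    same shape that is True for positioning, timing, format and version areas.
    """
    condition = MASK_CONDITIONS[mask_id]
    return [
        [bit ^ 1 if condition(i, j) and not is_function[i][j] else bit
         for j, bit in enumerate(row)]
        for i, row in enumerate(modules)
    ]


blank = [[0, 0], [0, 0]]
no_function = [[False, False], [False, False]]
print(apply_mask(blank, no_function, 0))  # [[1, 0], [0, 1]]
```

Because masking is an XOR, applying the same mask a second time gives back the original picture, which is how a scanner removes it again.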
The two-dimensional code after masking is the final picture.
Well, now you can try writing a QR encoding program. Of course, you can find a Reed-Solomon error correction library on the web, or look at other people's source code to see how this complicated encoding is implemented.
(End of full text)