# The details and principles of the generation of two-dimensional codes

Source: Internet
Author: User


QR Code, short for Quick Response Code, is an encoding method that has become extremely popular on mobile devices in recent years. It can store more information than a traditional barcode and can represent more data types: digits, characters, Japanese, Chinese and so on. Over the last couple of days I studied the details of QR Code image generation. Some people think of it as a kind of cryptographic algorithm, so I am writing this article to demystify it, for anyone who likes to learn.

For the QR Code specification, refer to this PDF: http://raidenii.net/files/datasheets/misc/qr_code.pdf

Basic knowledge

First of all, QR codes come in 40 sizes, officially called versions. Version 1 is a 21 x 21 matrix, Version 2 is a 25 x 25 matrix, and Version 3 is 29 x 29; each version adds 4 modules to each side. The formula is (V-1)*4 + 21, where V is the version number. The highest version is 40: (40-1)*4 + 21 = 177, so the largest code is a 177 x 177 square.
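As a quick check of the size formula, here is a minimal sketch. The function name `qr_size` is only an illustrative choice, not something from the spec.

```python
def qr_size(version: int) -> int:
    """Side length, in modules, of a QR code of the given version (1..40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return (version - 1) * 4 + 21


for v in (1, 2, 3, 40):
    print(f"Version {v}: {qr_size(v)} x {qr_size(v)}")
```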

Let's look at a sample QR code:

Positioning pattern
• Position Detection Patterns are positioning patterns that mark the rectangular extent of the QR code. The three patterns have white borders called Separators for Position Detection Patterns. There are three rather than four because three are enough to identify a rectangle.

• Timing Patterns are also used for positioning. QR codes come in 40 sizes, and once the code gets large it needs reference lines, otherwise the scan may drift off line.

• Alignment Patterns are needed only in Version 2 and above, and also serve positioning.

Functional data
• Format Information exists in every version (size) and stores some format data.

• Version Information is present in Version 7 and above, where two 3 x 6 areas are reserved to store version information.

Data codewords and error correction codewords
• Apart from the areas above, the rest of the symbol holds the Data Code (data codewords) and the Error Correction Code (error correction codewords).

Data encoding

Let's talk about data encoding first. The QR code supports the following encodings:

Numeric mode encodes the digits 0 to 9. The digits are split into groups of three, and each group is converted to 10 bits. If the number of digits is not a multiple of 3, the remaining 1 or 2 digits are converted to 4 or 7 bits. The digit count itself is written in 10, 12, or 14 bits, depending on the size of the QR code (Table 3 below illustrates this).

Alphanumeric mode encodes characters. It covers 0-9, uppercase A to Z (no lowercase), and the symbols $ % * + - . / : plus the space. These characters are mapped to a character index table, as shown below (SP is the space, Char is the character, Value is its index value). To encode, group the characters in pairs, treat each pair of index values as a base-45 number, and convert it to 11 bits of binary; a leftover single character is converted to 6 bits. The character count is written in 9, 11, or 13 bits, depending on the version (see Table 3 below).

Byte mode encodes bytes, which can be ISO-8859-1 characters with values 0-255. Some QR code scanners can automatically detect whether the data is UTF-8 encoded.

Kanji mode is a Japanese encoding and is a double-byte encoding. It can similarly be used for Chinese. To encode a Japanese or Chinese character, first subtract a value: for characters in 0x8140 to 0x9FFC subtract 0x8140, and for characters in 0xE040 to 0xEBBF subtract 0xC140. Then multiply the high byte (the first two hexadecimal digits) of the result by 0xC0, add the low byte (the last two hexadecimal digits), and convert the sum to 13 bits. As an example:
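The subtraction-and-multiply rule can be sketched in a few lines. This is a minimal sketch, assuming the input is the two-byte Shift JIS value of one character; the function name `encode_kanji_char` and the two sample values (0x935F and 0xE4AA) are illustrative choices.

```python
def encode_kanji_char(sjis: int) -> str:
    """Return the 13-bit Kanji-mode encoding of one two-byte Shift JIS value."""
    if 0x8140 <= sjis <= 0x9FFC:
        offset = sjis - 0x8140
    elif 0xE040 <= sjis <= 0xEBBF:
        offset = sjis - 0xC140
    else:
        raise ValueError(f"0x{sjis:04X} is outside the Kanji-mode ranges")
    high, low = offset >> 8, offset & 0xFF  # first and last two hex digits
    return format(high * 0xC0 + low, "013b")


print(encode_kanji_char(0x935F))  # 0x121F -> 0x12 * 0xC0 + 0x1F = 0x0D9F
print(encode_kanji_char(0xE4AA))  # 0x236A -> 0x23 * 0xC0 + 0x6A = 0x1AAA
```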

Extended Channel Interpretation (ECI) mode is used primarily for special character sets. Not all scanners support this encoding.

Structured Append mode is used to split one message across several QR codes so that a scanner can reassemble it. (Mixing several encoding formats inside a single QR code is done with multiple data segments, not with this mode.)

FNC1 mode is mainly used in particular industries, for example GS1 barcodes.

For simplicity, the last three modes are not discussed in this article.

In the following two tables,

• Table 2 gives the mode indicator (the "number") of each encoding format, which is written at the start of the encoded data, as the examples below show. Note: Chinese is 1101.

• Table 3 shows, for the different versions (sizes) of the QR code, how many bits are used to write the character count in numeric, alphanumeric, byte, and Kanji mode. (The QR code specification contains many such tables, which will be mentioned later.)
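For readers without the spec at hand, the two tables can be written down as small lookup structures. This is a sketch from my reading of the spec rather than a copy of its tables; the names `MODE_INDICATOR`, `COUNT_BITS` and `count_bits` are hypothetical.

```python
# Mode indicators for the four modes discussed in this article (Table 2).
MODE_INDICATOR = {
    "numeric": "0001",
    "alphanumeric": "0010",
    "byte": "0100",
    "kanji": "1000",
}

# Width in bits of the character count for Versions 1-9, 10-26, 27-40 (Table 3).
COUNT_BITS = {
    "numeric": (10, 12, 14),
    "alphanumeric": (9, 11, 13),
    "byte": (8, 16, 16),
    "kanji": (8, 10, 12),
}


def count_bits(mode: str, version: int) -> int:
    """Number of bits used to write the character count."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    column = 0 if version <= 9 else 1 if version <= 26 else 2
    return COUNT_BITS[mode][column]


print(count_bits("numeric", 1), count_bits("alphanumeric", 1))
```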

Let's look at a few examples,

Example one: Numeric encoding

With Version 1 and error correction level H, encode: 01234567

1. Split the digits into groups of three: 012 345 67

2. Convert each group to binary: 012 becomes 0000001100, 345 becomes 0101011001, 67 becomes 1000011.

3. Concatenate the three binary strings: 0000001100 0101011001 1000011

4. Convert the digit count to binary (10 bits for Version 1-H): 8 digits, so 8 becomes 0000001000

5. Prepend the numeric mode indicator 0001 and the count from step 4: 0001 0000001000 0000001100 0101011001 1000011
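The five steps above can be sketched as follows. This is a minimal sketch for illustration; `encode_numeric` is a hypothetical name, and the 10-bit count width is the Version 1 value used in the example.

```python
def encode_numeric(digits: str, count_bits: int = 10) -> str:
    """Numeric-mode bit string, with spaces between fields for readability."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric mode accepts only the digits 0-9")
    groups = [digits[i:i + 3] for i in range(0, len(digits), 3)]
    width = {3: 10, 2: 7, 1: 4}  # bits used for a group of 3, 2 or 1 digits
    body = [format(int(g), f"0{width[len(g)]}b") for g in groups]
    header = ["0001", format(len(digits), f"0{count_bits}b")]
    return " ".join(header + body)


print(encode_numeric("01234567"))
```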

Example two: Alphanumeric encoding

With Version 1 and error correction level H, encode: AC-42

1. Look up the index of each of the five characters of AC-42 in the character index table: (10, 12, 41, 4, 2)

2. Group them in pairs: (10,12) (41,4) (2)

3. Convert each pair to 11 bits of binary:

(10,12): 10*45+12 equals 462, which becomes 00111001110

(41,4): 41*45+4 equals 1849, which becomes 11100111001

(2): equals 2, which becomes 000010 (a single leftover character takes 6 bits)

4. Concatenate these binary strings: 00111001110 11100111001 000010

5. Convert the character count to binary (9 bits for Version 1-H): 5 characters, so 5 becomes 000000101

6. Prepend the alphanumeric mode indicator 0010 and the count from step 5: 0010 000000101 00111001110 11100111001 000010
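The same steps in code, as a minimal sketch. The index table is written as a string whose positions are the index values (0-9, A-Z, space, then $ % * + - . / :); `ALPHANUMERIC` and `encode_alphanumeric` are hypothetical names, and the 9-bit count width is the Version 1 value.

```python
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str, count_bits: int = 9) -> str:
    """Alphanumeric-mode bit string, with spaces between fields."""
    try:
        values = [ALPHANUMERIC.index(ch) for ch in text]
    except ValueError:
        raise ValueError("character not available in alphanumeric mode") from None
    fields = ["0010", format(len(text), f"0{count_bits}b")]
    for i in range(0, len(values) - 1, 2):
        fields.append(format(values[i] * 45 + values[i + 1], "011b"))  # pair
    if len(values) % 2:
        fields.append(format(values[-1], "06b"))  # leftover single character
    return " ".join(fields)


print(encode_alphanumeric("AC-42"))
```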

Terminator and padding

Suppose we have the string HELLO WORLD to encode. Following the example above, we get the following code:

| Mode | Character count | Encoding of HELLO WORLD |
| --- | --- | --- |
| 0010 | 000001011 | 01100001011 01111000110 10001011100 10110111000 10011010100 001101 |

We then add the terminator, 0000:

| Mode | Character count | Encoding of HELLO WORLD | Terminator |
| --- | --- | --- | --- |
| 0010 | 000001011 | 01100001011 01111000110 10001011100 10110111000 10011010100 001101 | 0000 |

Regrouping into 8-bit units

If the total number of bits is not a multiple of 8, we append enough 0s at the end. For example, there are 78 bits above, so we append two 0s and then split the bits into groups of 8:

00100000 01011011 00001011 01111000 11010001 01110010 11011100 01001101 01000011 01000000

Finally, if we still have not reached the maximum number of bits, we add Padding Bytes, which repeat the following two bytes: 11101100 00010001. (In decimal these are 236 and 17. I don't know why; the spec simply says so.) For the maximum number of bits for each version and error correction level, refer to Table 7 on pages 28 to 32 of the QR Code spec.

Suppose we are encoding Version 1 at error correction level Q, which needs 104 bits. We have 80 bits above, so we need 24 more, which means 3 Padding Bytes. Adding three gives the following code:

00100000 01011011 00001011 01111000 11010001 01110010 11011100 01001101 01000011 01000000 11101100 00010001 11101100
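The terminator, zero-fill and Padding Byte steps can be sketched together. This is a minimal sketch; `to_codewords` is a hypothetical helper, and the capacity of 104 bits is the Version 1-Q figure quoted above.

```python
PAD_BYTES = ("11101100", "00010001")  # 236 and 17, repeated alternately


def to_codewords(bits: str, capacity_bits: int) -> list:
    """Add terminator, zero-fill to a byte boundary, then add Padding Bytes."""
    bits = bits.replace(" ", "")
    if len(bits) > capacity_bits:
        raise ValueError("data does not fit in this version and level")
    bits += "0" * min(4, capacity_bits - len(bits))  # terminator, at most 0000
    bits += "0" * (-len(bits) % 8)                   # fill up to a multiple of 8
    codewords = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    pad_index = 0
    while len(codewords) * 8 < capacity_bits:
        codewords.append(PAD_BYTES[pad_index % 2])
        pad_index += 1
    return codewords


HELLO_WORLD = ("0010 000001011 01100001011 01111000110 10001011100 "
               "10110111000 10011010100 001101")
print(" ".join(to_codewords(HELLO_WORLD, 104)))
```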

The code above is the data code. Its units are called codewords, and each group of 8 bits is one codeword. We still have to add error correction information to these data codewords.

Error correction code

We mentioned error correction levels above. QR codes have four levels of error correction, which is why a partly damaged QR code can still be scanned, and also why people can place an icon in the center of a QR code.

| Level | Error correction capacity |
| --- | --- |
| L | about 7% of the codewords can be restored |
| M | about 15% of the codewords can be restored |
| Q | about 25% of the codewords can be restored |
| H | about 30% of the codewords can be restored |

So how does a QR code add error correction to the data codewords? First the data codewords are divided into blocks, and then error correction codewords are computed for each block. For how to group them, see the definition tables Table 13 to Table 22 on pages 33 to 44 of the QR Code spec. Note the last two columns:

• Number of Error Correction Blocks: how many blocks are needed.

• Error Correction Code Per Block: the number of codewords in each block, where a codeword is one 8-bit byte.

For example, Version 5 with error correction level Q requires 4 blocks (two groups of 2 blocks each). Each of the two blocks in the first group holds 15 data codewords and 18 error correction codewords. (Note: a codeword in the table is one 8-bit byte.) (Note also: in the (c, k, r) triple in the last column, c is the total number of codewords, k is the number of data codewords, and r is the error correction capacity, with c = k + 2 * r; the footnote explains that the error correction capacity is less than half the number of error correction codewords.)
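To make the block layout concrete, here is a minimal sketch that splits a data codeword sequence into blocks for Version 5-Q. The layout values (33, 15, 9) and (34, 16, 9) are my reading of the (c, k, r) entries for 5-Q; `V5_Q` and `split_blocks` are hypothetical names.

```python
# Each entry: (number of blocks, c = total codewords, k = data codewords,
#              r = error correction capacity), for one group of blocks.
V5_Q = [(2, 33, 15, 9), (2, 34, 16, 9)]


def split_blocks(data: list, layout: list) -> list:
    """Cut the data codewords into blocks of k codewords each."""
    blocks, pos = [], 0
    for count, c, k, r in layout:
        if c != k + 2 * r:
            raise ValueError("layout entry violates c = k + 2 * r")
        for _ in range(count):
            blocks.append(data[pos:pos + k])
            pos += k
    if pos != len(data):
        raise ValueError("data length does not match the block layout")
    return blocks


blocks = split_blocks(list(range(62)), V5_Q)  # 5-Q holds 62 data codewords
print([len(b) for b in blocks])
```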

Here is an example for 5-Q. (Writing it in binary would make the table too large, so I use decimal. Each block's error correction code has 18 codewords, that is, 18 8-bit binary numbers.)

| Group | Block | Data | Error correction code for each block |
| --- | --- | --- | --- |
| 1 | 1 | 67 85 70 134 87 38 85 194 119 50 6 18 6 103 38 | 213 199 11 45 115 247 241 223 229 248 154 117 154 111 86 161 111 39 |
| 1 | 2 | 246 246 66 7 118 134 242 7 38 86 22 198 199 146 6 | 87 204 96 60 202 182 124 157 200 134 27 129 209 17 163 163 120 133 |
| 2 | 1 | 182 230 247 119 50 7 118 134 87 38 82 6 134 151 50 7 | 148 116 177 212 76 133 75 242 238 76 195 230 189 10 108 240 192 141 |
| 2 | 2 | 70 247 118 86 194 6 151 50 16 236 17 236 17 236 17 236 | 235 159 5 173 24 147 59 33 106 40 255 172 82 2 131 32 178 236 |

Note: QR code error correction is implemented mainly with Reed-Solomon error correction. For me this algorithm is quite complex. It involves a lot of mathematics, such as polynomial long division, mapping the numbers 1-255 to powers of 2 (2^n, 0<=n<=255) in a Galois field, and other arcane things, plus the error correction formulas built on top of them. Because my math background is weak, it is too complex for me; I still only partly understand it and am still studying it, so I will not go into it here. Please forgive me. (Of course, if a friend understands it well, please teach me.)
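For readers who want to experiment anyway, here is a compact sketch of how Reed-Solomon error correction codewords are commonly computed for QR codes. It is not taken from this article: it assumes arithmetic in GF(256) with the primitive polynomial 0x11D and a generator polynomial whose roots are 2^0 to 2^(n-1), which is my understanding of the spec. All names are hypothetical.

```python
def _gf_tables():
    """Power and logarithm tables for GF(256), primitive polynomial 0x11D."""
    exp, log = [0] * 512, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
    for i in range(255, 512):
        exp[i] = exp[i - 255]  # duplicated so log sums need no modulo
    return exp, log


EXP, LOG = _gf_tables()


def gf_mul(a: int, b: int) -> int:
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def rs_generator(n: int) -> list:
    """Coefficients (highest degree first) of (x + 2^0)(x + 2^1)...(x + 2^(n-1))."""
    g = [1]
    for i in range(n):
        nxt = [0] * (len(g) + 1)
        for j, coef in enumerate(g):
            nxt[j] ^= coef                       # coef * x
            nxt[j + 1] ^= gf_mul(coef, EXP[i])   # coef * 2^i
        g = nxt
    return g


def rs_ec_codewords(data: list, n: int) -> list:
    """Remainder of data(x) * x^n divided by the generator polynomial."""
    gen = rs_generator(n)
    rem = list(data) + [0] * n
    for i in range(len(data)):
        coef = rem[i]
        if coef:
            for j in range(1, len(gen)):
                rem[i + j] ^= gf_mul(gen[j], coef)
    return rem[len(data):]


block1 = [67, 85, 70, 134, 87, 38, 85, 194, 119, 50, 6, 18, 6, 103, 38]
print(rs_ec_codewords(block1, 18))
```

The input here is the first data block of the 5-Q example, so the printed list can be compared with the 18 error correction codewords given for that block in this article.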

Final encoding: interleaved placement

If you think we can start drawing now, you are wrong. The QR code is not done shuffling things yet: the codewords of the data code and of the error correction code must also be interleaved. The rules are as follows:

For the data code: take the first codeword of each block in order, then the second codeword of each block, and so on. For example, the data codewords in the example above are as follows:

| Block | Data codewords |
| --- | --- |
| Block 1 | 67 85 70 134 87 38 85 194 119 50 6 18 6 103 38 |
| Block 2 | 246 246 66 7 118 134 242 7 38 86 22 198 199 146 6 |
| Block 3 | 182 230 247 119 50 7 118 134 87 38 82 6 134 151 50 7 |
| Block 4 | 70 247 118 86 194 6 151 50 16 236 17 236 17 236 17 236 |

First we take the first column: 67, 246, 182, 70

Then we take the second column: 67, 246, 182, 70, 85, 246, 230, 247

And so on: 67, 246, 182, 70, 85, 246, 230, 247, ..., 38, 6, 50, 17, 7, 236

The same is done for the error correction codewords:

| Block | Error correction codewords |
| --- | --- |
| Block 1 | 213 199 11 45 115 247 241 223 229 248 154 117 154 111 86 161 111 39 |
| Block 2 | 87 204 96 60 202 182 124 157 200 134 27 129 209 17 163 163 120 133 |
| Block 3 | 148 116 177 212 76 133 75 242 238 76 195 230 189 10 108 240 192 141 |
| Block 4 | 235 159 5 173 24 147 59 33 106 40 255 172 82 2 131 32 178 236 |

As with the data code, we get: 213, 87, 148, 235, 199, 204, 116, 159, ..., 39, 133, 141, 236

Then we put the two sequences together (the error correction codewords go after the data codewords) to get:

67, 246, 182, 70, 85, 246, 230, 247, 70, 66, 247, 118, 134, 7, 119, 86, 87, 118, 50, 194, 38, 134, 7, 6, 85, 242, 118, 151, 194, 7, 134, 50, 119, 38, 87, 16, 50, 86, 38, 236, 6, 22, 82, 17, 18, 198, 6, 236, 6, 199, 134, 17, 103, 146, 151, 236, 38, 6, 50, 17, 7, 236, 213, 87, 148, 235, 199, 204, 116, 159, 11, 96, 177, 5, 45, 60, 212, 173, 115, 202, 76, 24, 247, 182, 133, 147, 241, 124, 75, 59, 223, 157, 242, 33, 229, 200, 238, 106, 248, 134, 76, 40, 154, 27, 195, 255, 117, 129, 230, 172, 154, 209, 189, 82, 111, 17, 10, 2, 86, 163, 108, 131, 161, 163, 240, 32, 111, 120, 192, 178, 39, 133, 141, 236

This is our data area.
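The column-by-column rule can be sketched with `itertools.zip_longest`, which copes with blocks of unequal length (15 and 16 codewords here). The function name `interleave` is a hypothetical choice; the block values are the data codewords from the example above.

```python
from itertools import zip_longest


def interleave(blocks: list) -> list:
    """Take the 1st codeword of every block, then the 2nd of every block, ..."""
    out = []
    for column in zip_longest(*blocks):
        out.extend(cw for cw in column if cw is not None)  # skip short blocks
    return out


DATA_BLOCKS = [
    [67, 85, 70, 134, 87, 38, 85, 194, 119, 50, 6, 18, 6, 103, 38],
    [246, 246, 66, 7, 118, 134, 242, 7, 38, 86, 22, 198, 199, 146, 6],
    [182, 230, 247, 119, 50, 7, 118, 134, 87, 38, 82, 6, 134, 151, 50, 7],
    [70, 247, 118, 86, 194, 6, 151, 50, 16, 236, 17, 236, 17, 236, 17, 236],
]

data_stream = interleave(DATA_BLOCKS)
print(data_stream[:8], "...", data_stream[-6:])
```

The same function applies unchanged to the four error correction blocks; the final sequence is the interleaved data followed by the interleaved error correction codewords.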

Remainder Bits

Finally, add the Remainder Bits. For some versions of the QR code the bits above do not quite fill the symbol, so remainder bits must be appended. For example, the 5-Q code above needs 7 more bits. The remainder bits are simply 0s. For the number of remainder bits required by each version, refer to Table 1 on page 15 of the QR Code spec.
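As a sketch of this last step, the lookup below gives the number of remainder bits per version as I understand the spec table (the version ranges are not listed in this article), and builds the final bit string. The names `remainder_bits` and `final_bits` are hypothetical.

```python
def remainder_bits(version: int) -> int:
    """Number of 0 bits appended after the last codeword."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    if 2 <= version <= 6:
        return 7
    if 14 <= version <= 20 or 28 <= version <= 34:
        return 3
    if 21 <= version <= 27:
        return 4
    return 0


def final_bits(codewords: list, version: int) -> str:
    """Concatenate 8-bit codewords and append the remainder bits."""
    return "".join(format(cw, "08b") for cw in codewords) + "0" * remainder_bits(version)


# Version 5 holds 134 codewords in total (62 data + 72 error correction).
print(len(final_bits([0] * 134, 5)))
```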

Drawing the QR code

Position Detection Pattern

First, draw the Position Detection Patterns in three corners. (This pattern is the same size regardless of version.)
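Since the pattern never changes, it can be generated once. This is a minimal sketch assuming the usual 7 x 7 layout (a dark border, a light ring, and a dark 3 x 3 center), which comes from the spec rather than from the text here; `finder_pattern` is a hypothetical name.

```python
def finder_pattern() -> list:
    """7 x 7 Position Detection Pattern as rows of 1 (dark) and 0 (light)."""
    rows = []
    for r in range(7):
        row = []
        for c in range(7):
            on_border = r in (0, 6) or c in (0, 6)
            in_center = 2 <= r <= 4 and 2 <= c <= 4
            row.append(1 if on_border or in_center else 0)
        rows.append(row)
    return rows


for row in finder_pattern():
    print("".join("#" if bit else "." for bit in row))
```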

Alignment Pattern

Then draw the Alignment Patterns. (This pattern is the same size regardless of version.)

For the positions of the Alignment Patterns, see the definition table Table E.1 on page 81 of the QR Code spec (the table below is incomplete).

The figure below is based on the Version 8 entry in the table above (6, 24, 42).

Timing Pattern

Next come the Timing Pattern lines (no explanation needed).

Format Information

Then comes Format Information, the blue part in the figure below.

Format Information is 15 bits long. The position of each bit is shown in the figure below. (Note the Dark Module in the figure, which is always present.)

These 15 bits consist of:

• 5 data bits: 2 bits indicate which Error Correction Level is used, and 3 bits indicate which mask is used.

• 10 error correction bits, computed mainly with a BCH code.

The 15 bits are then XORed with 101010000010010. This ensures that choosing error correction level 00 together with mask 000 does not produce an all-white result, which would make image recognition harder for the scanner.

Here is an example:
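A minimal sketch of the computation, for readers who want to try it. The BCH generator polynomial 10100110111 and the two-bit level codes (L = 01, M = 00, Q = 11, H = 10) are my understanding of the spec and are not stated in the text above; `EC_LEVEL_BITS` and `format_info` are hypothetical names.

```python
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}  # assumed from the spec

BCH_GENERATOR = 0b10100110111   # degree-10 generator polynomial (assumed)
XOR_MASK = 0b101010000010010    # the XOR value given in the text


def format_info(level: str, mask: int) -> str:
    """15-bit Format Information for an error correction level and mask 0..7."""
    if not 0 <= mask <= 7:
        raise ValueError("mask must be in 0..7")
    data = (EC_LEVEL_BITS[level] << 3) | mask   # 5 data bits
    rem = data << 10
    for shift in range(4, -1, -1):              # polynomial long division
        if rem & (1 << (shift + 10)):
            rem ^= BCH_GENERATOR << shift
    return format(((data << 10) | rem) ^ XOR_MASK, "015b")


print(format_info("M", 0b101))
```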

The Error Correction Levels are shown in the following table:

The mask pattern is shown in table 23 below.

Version Information

Next comes Version Information, the blue part in the figure below (required for Version 7 and above).

Version Information is 18 bits in total: 6 bits for the version number and 12 bits of error correction code. The following is an example:
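The 18 bits can be computed with the same long-division idea as Format Information. This sketch assumes the degree-12 generator polynomial 1111100100101, which is my understanding of the spec and not given in the text; `version_info` is a hypothetical name.

```python
VERSION_GENERATOR = 0b1111100100101  # degree-12 generator polynomial (assumed)


def version_info(version: int) -> str:
    """18-bit Version Information: 6 version bits + 12 error correction bits."""
    if not 7 <= version <= 40:
        raise ValueError("Version Information exists only for Versions 7..40")
    rem = version << 12
    for shift in range(5, -1, -1):              # polynomial long division
        if rem & (1 << (shift + 12)):
            rem ^= VERSION_GENERATOR << shift
    return format((version << 12) | rem, "018b")


print(version_info(7))
```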

And its fill position is as follows:

Data and data error correcting codes

Then we fill in our final code. It is placed as follows: starting from the bottom-right corner, we fill in our bits, with 1 as black and 0 as white. When a non-data area described above is encountered, it is bypassed or skipped.

After this our figure is filled in, but the modules may not be well balanced. Large areas of white or black make scanning and recognition harder. So we also have to apply a mask (as if it were not complicated enough already). The QR spec says there are 8 masks to choose from, as follows; the formula of each mask is under its diagram. Put plainly, masking means XORing the mask with the figure generated above. The mask is XORed only with the data area and does not affect the function patterns. (Note: choosing a suitable mask is itself an algorithm.)

The mask pattern references are as follows (where i and j are the row and column of a module):

Here is what the result looks like after masking. We can see that for some masks the XORed data becomes more fragmented.
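The eight mask conditions can be written as small functions of the row i and column j; a data module is flipped wherever the condition holds. The formulas below are my reading of the spec's mask table, and `MASKS`, `apply_mask` and the `is_data` grid are hypothetical names for illustration.

```python
# Index in this list = mask pattern reference 000..111.
MASKS = [
    lambda i, j: (i + j) % 2 == 0,
    lambda i, j: i % 2 == 0,
    lambda i, j: j % 3 == 0,
    lambda i, j: (i + j) % 3 == 0,
    lambda i, j: (i // 2 + j // 3) % 2 == 0,
    lambda i, j: (i * j) % 2 + (i * j) % 3 == 0,
    lambda i, j: ((i * j) % 2 + (i * j) % 3) % 2 == 0,
    lambda i, j: ((i * j) % 3 + (i + j) % 2) % 2 == 0,
]


def apply_mask(modules: list, is_data: list, mask_id: int) -> list:
    """XOR the mask into data modules only; function patterns stay untouched."""
    condition = MASKS[mask_id]
    return [
        [bit ^ 1 if is_data[i][j] and condition(i, j) else bit
         for j, bit in enumerate(row)]
        for i, row in enumerate(modules)
    ]


grid = [[0] * 6 for _ in range(6)]
data_area = [[True] * 6 for _ in range(6)]
for row in apply_mask(grid, data_area, 0):
    print("".join("#" if bit else "." for bit in row))
```

Because masking is an XOR, applying the same mask a second time restores the original modules, which is how a reader undoes it.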

The QR code after masking is the final figure.

Well, now you can try writing a QR encoder yourself. Of course, you can find a Reed-Solomon error correction library on the web, or read other people's source code to see how this complicated encoding is implemented.

(End of full text)
