A detailed explanation of the principles and details of QR code (two-dimensional code) generation (with many figures)

Last Update:2017-01-13 Source: Internet

Author: User


A two-dimensional barcode (2D barcode) records data as a black-and-white pattern laid out on a plane (in two dimensions) according to certain geometric rules. It cleverly borrows the bit stream of 0s and 1s that underlies all computer logic: geometric shapes that correspond to binary values represent text and numeric information, and an image input device or photoelectric scanner reads them automatically so that the information can be processed automatically. It shares some features with ordinary barcode technology: each code system has its own character set, each character occupies a certain width, and the code has some checking capability. It can also automatically recognize information on different rows and cope with rotation of the graphic.

Basic knowledge

First, QR codes come in 40 sizes. The official name for a size is a version. Version 1 is a 21 x 21 matrix, Version 2 is a 25 x 25 matrix, Version 3 is 29 x 29, and each further version adds 4 modules per side. The formula is (V-1)*4 + 21, where V is the version number. The highest version is 40, and (40-1)*4 + 21 = 177, so the largest QR code is a 177 x 177 square.
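The size formula is easy to check in code. The following is a minimal sketch; the function name qr_size is my own and not part of any library.

```python
def qr_size(version: int) -> int:
    """Return the side length, in modules, of a QR code of the given version."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    return (version - 1) * 4 + 21


# Print the side length for a few versions mentioned in the text.
for v in (1, 2, 3, 40):
    print(v, qr_size(v))
```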

Now let's look at a two-dimensional code sample:

Positioning pattern

The position detection patterns are anchors used to mark the extent of the QR code rectangle. Each of the three patterns has a white border called the separators for position detection patterns. There are three rather than four because three corners are enough to identify a rectangle.

Timing patterns are also used for positioning. QR codes come in 40 sizes, and when the size gets large a reference line is needed; otherwise the scan may drift off line.

Alignment patterns are needed only in Version 2 and above. They are also used for positioning.

Functional data

Format information exists in every version and stores some formatting data.

Version information: in Version 7 and above, two 3 x 6 regions must be reserved to store version information.

Data codes and error-Correcting codes

Apart from the areas mentioned above, the remaining space holds the data codewords and the error correction codewords.

Data encoding

Let's talk about data coding first. The QR code supports the following encoding:

Numeric mode encodes the digits 0 to 9. Every group of 3 digits is encoded as 10 bits. If the number of digits is not a multiple of 3, the last 1 or 2 remaining digits are converted to 4 or 7 bits. The digit count itself is written as 10, 12, or 14 bits; which length applies depends on the QR code version (see Table 3 below).

Alphanumeric mode encodes characters. It covers 0-9, uppercase A to Z (no lowercase), the symbols $ % * + - . / : and the space. These characters are mapped to values in a character index table, as follows (SP is the space, Char is the character, Value is its index value). To encode, group the characters in pairs, treat each pair as a base-45 number using the values in the table, and convert it to an 11-bit binary number; if a single character is left over, convert it to a 6-bit binary number. The character count is written as 9, 11, or 13 bits depending on the version (see Table 3 below).

Byte mode encodes bytes in the range 0-255, nominally ISO-8859-1 characters. Some QR code scanners can automatically detect whether the bytes are UTF-8 encoded.

Kanji mode is a Japanese encoding and is also a double-byte encoding. It can similarly be used for Chinese. Japanese and Chinese characters are encoded by subtracting a value: for characters in the range 0x8140 to 0x9FFC subtract 0x8140, and for characters in the range 0xE040 to 0xEBBF subtract 0xC140. Then take the high byte (the first two hex digits) of the result, multiply it by 0xC0, add the low byte (the last two hex digits), and convert the sum to a 13-bit code. The following diagram gives an example:

Extended Channel Interpretation (ECI) mode is primarily used for special character sets. Not all scanners support this encoding.

Structured Append mode is used to split one message across several QR code symbols, which the reader then joins back together.

FNC1 mode is mainly for certain special industries, such as GS1 barcodes.

For simplicity, the last three modes are not discussed in this article.
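Returning to Kanji mode for a moment, the subtract-then-multiply rule described above can be sketched as follows. This is only an illustration under the ranges given in the text; the function name kanji_bits is my own, and the input is assumed to be the two-byte Shift JIS value of a character given as an integer.

```python
def kanji_bits(sjis_code: int) -> str:
    """Encode one double-byte Shift JIS value as a 13-bit string."""
    if 0x8140 <= sjis_code <= 0x9FFC:
        offset = sjis_code - 0x8140
    elif 0xE040 <= sjis_code <= 0xEBBF:
        offset = sjis_code - 0xC140
    else:
        raise ValueError("value is outside the Kanji mode ranges")
    high, low = offset >> 8, offset & 0xFF   # first and last two hex digits
    return format(high * 0xC0 + low, "013b")


# 0x935F is just a sample value inside the first range.
print(kanji_bits(0x935F))
```

For 0x935F the intermediate values are 0x935F - 0x8140 = 0x121F, then 0x12 * 0xC0 + 0x1F = 0xD9F.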

In the following two tables,

Table 2 gives the "number" (mode indicator) of each encoding mode, which is written at the head of the encoded data, as in the examples below. Note: Chinese is 1101.

Table 3 shows, for the different QR code versions (sizes), how many binary digits are used for the character count in numeric, alphanumeric, byte, and Kanji modes. (The QR code specification contains many such tables; more are mentioned later.)

Let's look at a few examples here,

Example one: Numeric encoding

With Version 1 and error correction level H, encode: 01234567

1. Split the digits into groups of three: 012 345 67

2. Convert each group to binary: 012 becomes 0000001100; 345 becomes 0101011001; 67 becomes 1000011.

3. Concatenate the three binary strings: 0000001100 0101011001 1000011

4. Convert the digit count to binary (10 bits for Version 1-H): there are 8 digits, and 8 in binary is 0000001000

5. Put the numeric mode indicator 0001 and the count from step 4 in front: 0001 0000001000 0000001100 0101011001 1000011
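The five steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a full encoder: the function name encode_numeric is my own, and the count length defaults to the 10 bits used for Version 1.

```python
def encode_numeric(digits: str, count_bits: int = 10) -> str:
    """Return mode indicator + digit count + data bits for numeric mode."""
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise ValueError("numeric mode accepts only the digits 0-9")
    widths = {3: 10, 2: 7, 1: 4}            # bits used per group length
    groups = [digits[i:i + 3] for i in range(0, len(digits), 3)]
    body = "".join(format(int(g), f"0{widths[len(g)]}b") for g in groups)
    count = format(len(digits), f"0{count_bits}b")
    return "0001" + count + body


print(encode_numeric("01234567"))
```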

Example two: Alphanumeric encoding

With Version 1 and error correction level H, encode: AC-42

1. Look up the five characters of AC-42 in the character index table: (10,12,41,4,2)

2. Group them in pairs: (10,12) (41,4) (2)

3. Convert each group to binary (11 bits for a pair, 6 bits for a single character):

(10,12) 10*45+12 = 462, which becomes 00111001110

(41,4) 41*45+4 = 1849, which becomes 11100111001

(2) = 2, which becomes 000010

4. Concatenate these binary strings: 00111001110 11100111001 000010

5. Convert the character count to binary (9 bits for Version 1-H): 5 characters, and 5 becomes 000000101

6. Put the alphanumeric mode indicator 0010 and the count from step 5 in front: 0010 000000101 00111001110 11100111001 000010
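Here is a small sketch of the same procedure in Python. The function name encode_alphanumeric and the ALPHANUMERIC string are my own; the string simply lists the 45 characters in index order, so a character's position is its value in the index table. The count length defaults to the 9 bits used for Version 1.

```python
# The 45 characters in index order: 0-9, A-Z, space, then $ % * + - . / :
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str, count_bits: int = 9) -> str:
    """Return mode indicator + character count + data bits."""
    values = [ALPHANUMERIC.index(ch) for ch in text]  # ValueError if unsupported
    parts = []
    for i in range(0, len(values) - 1, 2):
        # A pair is read as a two-digit base-45 number and stored in 11 bits.
        parts.append(format(values[i] * 45 + values[i + 1], "011b"))
    if len(values) % 2:
        # A leftover single character is stored in 6 bits.
        parts.append(format(values[-1], "06b"))
    return "0010" + format(len(text), f"0{count_bits}b") + "".join(parts)


print(encode_alphanumeric("AC-42"))
```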

Terminator and padding

Suppose we have the string HELLO WORLD to encode. Following example two above, we get the following code:

Mode indicator	Character count	Encoding of HELLO WORLD
0010	000001011	01100001011 01111000110 10001011100 10110111000 10011010100 001101

Then we add the terminator:

Mode indicator	Character count	Encoding of HELLO WORLD	Terminator
0010	000001011	01100001011 01111000110 10001011100 10110111000 10011010100 001101	0000

Regrouping into 8-bit units

If the total number of bits is not a multiple of 8, we append enough 0s at the end. For example, the bits above total 78, so we add 2 zeros and then split everything into groups of 8 bits:

00100000 01011011 00001011 01111000 11010001 01110010 11011100 01001101 01000011 01000000
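The terminator and regrouping steps can be sketched like this. The function name terminate_and_split is my own. The capacity argument is an assumption needed to handle the case where fewer than four terminator bits fit; 104 bits is the Version 1-Q figure used later in this article.

```python
def terminate_and_split(bits: str, capacity_bits: int) -> list:
    """Append the terminator, pad with zeros to a byte boundary, split into bytes."""
    # The terminator is four zeros, or fewer if the capacity is nearly full.
    bits += "0" * min(4, capacity_bits - len(bits))
    # Pad with zeros until the length is a multiple of 8.
    bits += "0" * (-len(bits) % 8)
    return [bits[i:i + 8] for i in range(0, len(bits), 8)]


hello = ("0010" "000001011"
         "01100001011" "01111000110" "10001011100"
         "10110111000" "10011010100" "001101")
print(" ".join(terminate_and_split(hello, 104)))
```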

Padding bytes

Finally, if we still have not reached the maximum number of bits, we add padding bytes. The padding bytes are the following two bytes, repeated alternately: 11101100 00010001 (in decimal these are 236 and 17; I don't know why, I only know the spec says so). The maximum number of bits for each error correction level of each version can be found in Table 7 on pages 28 to 32 of the QR Code spec.

Suppose we are encoding Version 1 at error correction level Q, which requires 104 bits. We only have 80 bits above, so we are 24 bits short and need 3 padding bytes. Adding the three gives the following code:

00100000 01011011 00001011 01111000 11010001 01110010 11011100 01001101 01000011 01000000 11101100 00010001 11101100

The code above is the data code, called codewords; each group of 8 bits is one codeword. We still have to add error correction information to these data codewords.
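A minimal sketch of the padding step, using the ten HELLO WORLD bytes and the 104-bit capacity of Version 1-Q from the text. The function name add_padding is my own.

```python
def add_padding(codewords: list, capacity_bits: int) -> list:
    """Append 236 and 17 alternately until the capacity is filled."""
    pad = (0b11101100, 0b00010001)   # 236 and 17
    padded = list(codewords)
    i = 0
    while len(padded) * 8 < capacity_bits:
        padded.append(pad[i % 2])
        i += 1
    return padded


data = [int(b, 2) for b in (
    "00100000 01011011 00001011 01111000 11010001 "
    "01110010 11011100 01001101 01000011 01000000").split()]
print(add_padding(data, 104))
```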

Error correction Code

We mentioned error correction levels above. A QR code has four levels of error correction, which is why a damaged QR code can still be scanned, and why some people can put an icon in the center of a QR code.

Error correction capacity
L level	7% of the codewords can be fixed.
M level	15% of the codewords can be fixed.
Q level	25% of the codewords can be fixed.
H level	30% of the codewords can be fixed.

So how does a QR code add error correction codewords to the data codewords? First we need to group the data codewords, that is, divide them into blocks, and then compute error correction codewords for each block. For how to group them, see the definition tables, Table 13 to Table 22, on pages 33 to 44 of the QR Code spec. Note the last two columns:

Number of Error Correction Blocks: how many blocks the data must be divided into.

Error Correction Code per Block: the number of codewords in each block, where a codeword is one 8-bit byte.

For example, Version 5 with error correction level Q requires 4 blocks (two groups of 2 blocks each). Each of the two blocks in the first group holds 15 data codewords, each of the two blocks in the second group holds 16, and every block gets 18 error correction codewords. (Note: a codeword in the table is one 8-bit byte. Note also that the (c, k, r) entry in the last column satisfies c = k + 2 * r, where c is the total number of codewords in a block, k is the number of data codewords, and r is the error correction capacity, so r = 9 means 18 error correction codewords. As the footnote there explains, the error correction capacity is no more than half the number of error correction codewords.)

The following table gives an example for 5-Q. (Binary would make the table too large, so I use decimal. We can see that each block has 18 error correction codewords, that is, 18 8-bit numbers.)

Group	Block	Data codewords	Error correction codewords for each block
1	1	67 85 70 134 87 38 85 194 119 50 6 18 6 103 38	213 199 11 45 115 247 241 223 229 248 154 117 154 111 86 161 111 39
1	2	246 246 66 7 118 134 242 7 38 86 22 198 199 146 6	87 204 96 60 202 182 124 157 200 134 27 129 209 17 163 163 120 133
2	1	182 230 247 119 50 7 118 134 87 38 82 6 134 151 50 7	148 116 177 212 76 133 75 242 238 76 195 230 189 10 108 240 192 141
2	2	70 247 118 86 194 6 151 50 16 236 17 236 17 236 17 236	235 159 5 173 24 147 59 33 106 40 255 172 82 2 131 32 178 236

Note: QR code error correction is implemented mainly with Reed-Solomon error correction. For me this algorithm is quite complicated. It involves a lot of mathematics, such as polynomial division and mapping the numbers 1-255 to powers of 2, 2^n (0 <= n <= 255), in a Galois field, plus the error correction formulas built on top of these. My math background is weak, so it is too complex for me; I have not fully understood it yet and am still studying it, so I won't go into it here. Please forgive me. (Of course, if any reader understands it well, please teach me.)
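For readers who want to experiment anyway, here is a minimal sketch of the encoding side only (computing the error correction codewords; decoding and correcting errors is a much bigger topic). It is not taken from the original article. It assumes the conventions commonly described for QR codes: the field GF(256) built with the polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D), and a generator polynomial whose roots are 2^0, 2^1, ..., 2^(n-1). All function names are my own.

```python
def _build_tables():
    """Build exponent and logarithm tables for GF(256) with polynomial 0x11D."""
    exp, log = [0] * 512, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
    for i in range(255, 512):        # duplicate so log sums need no modulo
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = _build_tables()


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def generator_poly(n_ec: int) -> list:
    """Coefficients (highest degree first) of (x - 2^0)(x - 2^1)...(x - 2^(n_ec-1))."""
    g = [1]
    for i in range(n_ec):
        nxt = [0] * (len(g) + 1)
        for j, coef in enumerate(g):
            nxt[j] ^= coef                      # coef * x
            nxt[j + 1] ^= gf_mul(coef, EXP[i])  # coef * 2^i
        g = nxt
    return g


def rs_ec_codewords(data: list, n_ec: int) -> list:
    """Remainder of data(x) * x^n_ec divided by the generator polynomial."""
    gen = generator_poly(n_ec)
    rem = list(data) + [0] * n_ec
    for i in range(len(data)):
        factor = rem[i]
        if factor:
            for j, g in enumerate(gen):
                rem[i + j] ^= gf_mul(g, factor)
    return rem[len(data):]


# Block 1 of the 5-Q example; compare the output with the table above.
block1 = [67, 85, 70, 134, 87, 38, 85, 194, 119, 50, 6, 18, 6, 103, 38]
print(rs_ec_codewords(block1, 18))
```

Addition in this field is XOR, which is why the polynomial division above uses ^= everywhere instead of subtraction.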

Final encoding

Interleaved placement

If you think we can start drawing now, you're wrong. The QR code's scrambling is not finished yet: the codewords of the data code and of the error correction code also have to be interleaved. The rules are as follows:

For the data code: take the first codeword of each block in order, then the second codeword of each block, and so on. For example, the data codewords in the example above are:

Block 1	67	85	70	134	87	38	85	194	119	50	6	18	6	103	38
Block 2	246	246	66	7	118	134	242	7	38	86	22	198	199	146	6
Block 3	182	230	247	119	50	7	118	134	87	38	82	6	134	151	50	7
Block 4	70	247	118	86	194	6	151	50	16	236	17	236	17	236	17	236

We take the first column first: 67, 246, 182, 70.

Then the second column: 67, 246, 182, 70, 85, 246, 230, 247.

And so on: 67, 246, 182, 70, 85, 246, 230, 247, ..., 38, 6, 50, 17, 7, 236

The same is done for the error correction codewords:

Block 1	213	199	11	45	115	247	241	223	229	248	154	117	154	111	86	161	111	39
Block 2	87	204	96	60	202	182	124	157	200	134	27	129	209	17	163	163	120	133
Block 3	148	116	177	212	76	133	75	242	238	76	195	230	189	10	108	240	192	141
Block 4	235	159	5	173	24	147	59	33	106	40	255	172	82	2	131	32	178	236

As with the data code, we get: 213, 87, 148, 235, 199, 204, 116, 159, ..., 39, 133, 141, 236

The two sequences are then joined (the error correction codewords follow the data codewords):

67, 246, 182, 70, 85, 246, 230, 247, 70, 66, 247, 118, 134, 7, 119, 86, 87, 118, 50, 194, 38, 134, 7, 6, 85, 242, 118, 151, 194, 7, 134, 50, 119, 38, 87, 16, 50, 86, 38, 236, 6, 22, 82, 17, 18, 198, 6, 236, 6, 199, 134, 17, 103, 146, 151, 236, 38, 6, 50, 17, 7, 236, 213, 87, 148, 235, 199, 204, 116, 159, 11, 96, 177, 5, 45, 60, 212, 173, 115, 202, 76, 24, 247, 182, 133, 147, 241, 124, 75, 59, 223, 157, 242, 33, 229, 200, 238, 106, 248, 134, 76, 40, 154, 27, 195, 255, 117, 129, 230, 172, 154, 209, 189, 82, 111, 17, 10, 2, 86, 163, 108, 131, 161, 163, 240, 32, 111, 120, 192, 178, 39, 133, 141, 236

This is our data area.
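The column-by-column interleaving can be sketched with itertools.zip_longest, which handles the shorter blocks by filling in None. The function name interleave is my own; the block contents are copied from the tables above.

```python
from itertools import zip_longest


def interleave(blocks: list) -> list:
    """Take the 1st codeword of every block, then the 2nd, and so on."""
    out = []
    for column in zip_longest(*blocks):
        out.extend(cw for cw in column if cw is not None)
    return out


data_blocks = [
    [67, 85, 70, 134, 87, 38, 85, 194, 119, 50, 6, 18, 6, 103, 38],
    [246, 246, 66, 7, 118, 134, 242, 7, 38, 86, 22, 198, 199, 146, 6],
    [182, 230, 247, 119, 50, 7, 118, 134, 87, 38, 82, 6, 134, 151, 50, 7],
    [70, 247, 118, 86, 194, 6, 151, 50, 16, 236, 17, 236, 17, 236, 17, 236],
]
ec_blocks = [
    [213, 199, 11, 45, 115, 247, 241, 223, 229, 248, 154, 117, 154, 111, 86, 161, 111, 39],
    [87, 204, 96, 60, 202, 182, 124, 157, 200, 134, 27, 129, 209, 17, 163, 163, 120, 133],
    [148, 116, 177, 212, 76, 133, 75, 242, 238, 76, 195, 230, 189, 10, 108, 240, 192, 141],
    [235, 159, 5, 173, 24, 147, 59, 33, 106, 40, 255, 172, 82, 2, 131, 32, 178, 236],
]

# Data codewords first, error correction codewords after them.
final = interleave(data_blocks) + interleave(ec_blocks)
print(len(final), final[:8])
```

The result has 62 data codewords followed by 72 error correction codewords, 134 in total.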

Remainder Bits

Finally, add the remainder bits. For some QR versions the length above is still not enough, and remainder bits must be appended. For example, a 5-Q QR code needs 7 remainder bits, all of them 0. See the definition table, Table 1, on page 15 of the QR Code spec for which versions need remainder bits.

Drawing the QR code

Position detection pattern

First, draw the position detection patterns in three corners. (Whatever the version, this pattern is always the same size.)
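As a small illustration, the pattern can be generated from the distance of each module to the center. The 7 x 7 layout used here (a dark outer ring, a light ring, and a dark 3 x 3 center) comes from the QR Code spec rather than from the text above, and the function name finder_pattern is my own.

```python
def finder_pattern() -> list:
    """Return the 7 x 7 position detection pattern (1 = dark, 0 = light)."""
    rows = []
    for r in range(7):
        row = []
        for c in range(7):
            ring = max(abs(r - 3), abs(c - 3))   # 0 at the center, 3 at the edge
            row.append(0 if ring == 2 else 1)    # only ring 2 is light
        rows.append(row)
    return rows


for row in finder_pattern():
    print("".join("#" if module else "." for module in row))
```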

Alignment pattern

Then draw the alignment patterns. (Whatever the version, this pattern is always the same size.)

For the positions of the alignment patterns, see the definition table, Table E.1, on page 81 of the QR Code spec. (The table below is incomplete.)

The following figure is an example for Version 8 based on the table above (6, 24, 42)
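The three numbers are row and column coordinates of the pattern centers; every combination is used except the three that would land on a position detection pattern. A minimal sketch, with a function name of my own choosing:

```python
from itertools import product


def alignment_centers(coords: list) -> list:
    """Return (row, column) centers, skipping the three corners with finder patterns."""
    first, last = coords[0], coords[-1]
    # Top-left, top-right and bottom-left are occupied by position detection patterns.
    occupied = {(first, first), (first, last), (last, first)}
    return [(r, c) for r, c in product(coords, repeat=2) if (r, c) not in occupied]


print(alignment_centers([6, 24, 42]))
```

For Version 8 this leaves six alignment patterns out of the nine possible combinations.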

Timing pattern

Then come the timing pattern lines (not much to say here).

Format Information

Then comes the format information, the blue part of the image below.

The format information is 15 bits long; the position of each bit is shown in the following illustration. (Note the dark module in the diagram, which is always present.)

These 15 bits include:

5 data bits: 2 bits indicate the error correction level and 3 bits indicate which mask is used.

10 error correction bits, computed mainly with a BCH code.

The 15 bits are then XORed with 101010000010010. This ensures that choosing error correction level 00 together with mask 000 does not produce an all-white result, which would make image recognition harder for scanners.

Here is an example:

The error correction level is shown in the following table:

The mask patterns are shown in Table 23 later.
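To make the 15-bit calculation concrete, here is a minimal sketch. Two things in it are not stated in the text above and come from the QR Code spec: the 2-bit codes for the error correction levels (L = 01, M = 00, Q = 11, H = 10) and the BCH generator polynomial 10100110111 (0x537). The names are my own.

```python
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}


def format_information(ec_level: str, mask: int) -> str:
    """Return the 15 format information bits as a string."""
    if not 0 <= mask <= 7:
        raise ValueError("mask must be between 0 and 7")
    data = (EC_LEVEL_BITS[ec_level] << 3) | mask    # the 5 data bits
    rem = data << 10
    for shift in range(4, -1, -1):                  # polynomial division by 0x537
        if rem & (1 << (shift + 10)):
            rem ^= 0x537 << shift
    bits = ((data << 10) | rem) ^ 0b101010000010010  # final XOR from the text
    return format(bits, "015b")


print(format_information("M", 0b101))
```

For level M with mask 101 the data bits are 00101, the BCH remainder is 0011011100, and the XOR turns 001010011011100 into 100000011001110.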

Version Information

Next is the version information (required for Version 7 and above), the blue part of the following figure.

The version information is 18 bits in total: a 6-bit version number and 12 error correction bits. Here is an example:

It is placed as follows:
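The 18 bits can be computed in the same way as the format information. This sketch assumes the generator polynomial 1111100100101 (0x1F25) from the QR Code spec, which the text above does not state; the function name is my own.

```python
def version_information(version: int) -> str:
    """Return the 18 version information bits (Version 7 to 40) as a string."""
    if not 7 <= version <= 40:
        raise ValueError("version information exists only for versions 7 to 40")
    rem = version << 12
    for shift in range(5, -1, -1):          # polynomial division by 0x1F25
        if rem & (1 << (shift + 12)):
            rem ^= 0x1F25 << shift
    return format((version << 12) | rem, "018b")


print(version_information(7))
```

Unlike the format information, no final XOR is applied here.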

Data and error correction codewords

Then we fill in our final code. It is filled in as follows: start at the bottom right and place our bits in order, with 1 as black and 0 as white. When a non-data area is encountered, go around it or skip over it.
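The fill order can be sketched as a list of coordinates. The details below come from the QR Code spec rather than from the text above: the path runs in columns two modules wide, going up and then down alternately, and the column holding the vertical timing pattern (column 6) is skipped entirely. The callback is_function is a hypothetical helper that reports whether a module belongs to a non-data area.

```python
def placement_order(size: int, is_function) -> list:
    """Return (row, column) positions in the order the data bits are placed."""
    order = []
    col = size - 1
    upward = True
    while col > 0:
        if col == 6:            # the vertical timing pattern occupies column 6
            col -= 1
        rows = range(size - 1, -1, -1) if upward else range(size)
        for row in rows:
            for c in (col, col - 1):      # right module first, then left
                if not is_function(row, c):
                    order.append((row, c))
        upward = not upward
        col -= 2
    return order


# With no function modules at all, a Version 1 symbol yields 20 usable columns.
order = placement_order(21, lambda r, c: False)
print(order[:4])
```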

Mask pattern

Our diagram is now filled in, but the modules may not be evenly distributed; large blank or black areas make scanning and recognition harder. So we still have to apply a mask (as if this weren't complicated enough). The QR spec says there are 8 masks to choose from, as follows; the formula of each mask is shown under its diagram. A mask, put plainly, is simply XORed with the diagram generated above. The mask is XORed only with the data area and does not affect the function patterns. (Note: choosing a suitable mask is also an algorithm.)

The mask pattern references are as follows (i and j correspond to x and y in the previous figure):
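For reference, here are the eight mask conditions as given in the QR Code spec, together with a sketch of applying one of them. I assume here that i is the row and j is the column, as in the spec; a module is flipped when the condition is true. The names MASKS and apply_mask and the callback is_function (which marks non-data modules) are my own.

```python
MASKS = {
    0b000: lambda i, j: (i + j) % 2 == 0,
    0b001: lambda i, j: i % 2 == 0,
    0b010: lambda i, j: j % 3 == 0,
    0b011: lambda i, j: (i + j) % 3 == 0,
    0b100: lambda i, j: (i // 2 + j // 3) % 2 == 0,
    0b101: lambda i, j: (i * j) % 2 + (i * j) % 3 == 0,
    0b110: lambda i, j: ((i * j) % 2 + (i * j) % 3) % 2 == 0,
    0b111: lambda i, j: ((i * j) % 3 + (i + j) % 2) % 2 == 0,
}


def apply_mask(matrix: list, mask_id: int, is_function) -> list:
    """XOR the mask into the data modules only; function modules are untouched."""
    cond = MASKS[mask_id]
    return [
        [bit ^ 1 if cond(i, j) and not is_function(i, j) else bit
         for j, bit in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


blank = [[0] * 4 for _ in range(4)]
for row in apply_mask(blank, 0b000, lambda i, j: False):
    print(row)
```

Because masking is an XOR, applying the same mask a second time restores the original modules, which is how a reader undoes it.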

The following shows the result of some of the masks; we can see that after XORing with certain masks the data becomes more fragmented.

The QR code after masking is the final image.

That concludes the principles of QR code generation. I hope that after reading this tutorial you can write a program to generate QR codes yourself.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
