Background: When developing in Java, you often run into garbled text, or files that are not recognized or read correctly. A common example is the message resource (properties) file used for validator validation, which has to be re-encoded with Unicode escapes. The reason is that Java works with Unicode internally, and java.util.Properties reads a properties file from a byte stream as ISO 8859-1 with all other characters written as \uXXXX escapes, while files saved on our systems are often encoded as GBK. Converting the file from the system encoding into a form that Java reads correctly solves the problem.

1. Introduction to native2ascii:

native2ascii is a tool provided with the Sun Java SDK. It converts text files (such as *.txt, *.ini, *.properties, *.java, and so on) into ASCII text with Unicode escapes. The conversion is needed for the internationalization of programs.

Definition of Unicode: Unicode (also translated as Uniform Code, Universal Code, or Single Code) is a character encoding standard used on computers. It assigns a uniform and unique code to each character in each language, to meet the requirements of cross-language, cross-platform text conversion and processing. Work on it began in the late 1980s, and version 1.0 of the standard was published in 1991. As computers became more capable, Unicode was widely adopted in the years after its debut. (Disclaimer: the definition of Unicode is taken from the Internet.)

2. Getting native2ascii:

After installing the JDK on Windows, you will find a bin directory under the JDK installation directory; native2ascii.exe is located there. (The tool ships with JDK 8 and earlier; it was removed in JDK 9.)

3. native2ascii command-line format:

    native2ascii [options] [inputfile [outputfile]]

Description:

[options]: command switches. There are two to choose from:
-reverse: performs the reverse operation, converting Unicode escapes back to the specified encoding, or to the local encoding if none is specified.
-encoding encoding_name: names the character encoding to use for the conversion; encoding_name is the name of the encoding. If omitted, the local default encoding is used.

[inputfile [outputfile]]:
inputfile: the full name of the input file.
outputfile: the name of the output file.
If this parameter is omitted, the output goes to the console.

4. Best practice:

First add the JDK's bin directory to the system PATH variable. Create a test directory on the C: drive and create a zh.txt file in it with the content "熔岩" (Chinese for "lava"). Open a command prompt and change to the C:\test directory. Then follow the steps below one at a time and observe how the encoding changes.

A: Convert zh.txt to Unicode escapes, output to the file u.txt

    native2ascii zh.txt u.txt

Open u.txt; its content is "\u7194\u5ca9".

B: Convert zh.txt to Unicode escapes, output to the console

    C:\test>native2ascii zh.txt
    \u7194\u5ca9

As you can see, the console prints "\u7194\u5ca9".

C: Convert zh.txt using the iso8859-1 encoding, output to the file i.txt

    native2ascii -encoding iso8859-1 zh.txt i.txt

Open i.txt; its content is "\u00c8\u00db\u00d1\u00d2".

D: Convert u.txt to the local encoding, output to the file u_nv.txt

    native2ascii -reverse u.txt u_nv.txt

Open u_nv.txt; its content is "熔岩".

E: Convert u.txt to the local encoding, output to the console

    C:\test>native2ascii -reverse u.txt
    熔岩

As you can see, the console prints "熔岩".

F: Convert i.txt to the local encoding, output to i_nv.txt

    native2ascii -reverse i.txt i_nv.txt

Open i_nv.txt; its content is "\u00c8\u00db\u00d1\u00d2". The content is exactly the same before and after the conversion. In other words, nothing was converted; the command was written with a muddled understanding of what the options mean.

G: Convert i.txt to the GBK encoding, output to i_gbk.txt

    native2ascii -reverse -encoding GBK i.txt i_gbk.txt

Open i_gbk.txt; its content is "\u00c8\u00db\u00d1\u00d2". Again the content is unchanged, for the same reason.

H: Convert i.txt back, specifying iso8859-1, output to the console

    C:\test>native2ascii -reverse -encoding iso8859-1 i.txt
    熔岩

This time the goal is reached: i.txt, which was produced with -encoding iso8859-1, turns back into "熔岩" on the console when the same encoding is given to -reverse.
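Steps C and H can be retraced at the byte level. The following is only a minimal sketch of what appears to happen, not the tool's actual implementation: the class name EncodingStepsSketch and the helper toEscapes are made up for illustration, and the four bytes are inferred from the output of step C (they are assumed to be the GBK bytes stored in zh.txt).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of the JDK or of native2ascii.
final class EncodingStepsSketch {

    // Keep 7-bit ASCII as it is; write every other char as a backslash-u escape with four hex digits.
    static String toEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: the bytes of zh.txt, as exposed by step C (c8 db d1 d2).
        byte[] zhBytes = {(byte) 0xC8, (byte) 0xDB, (byte) 0xD1, (byte) 0xD2};

        // Step C: each byte is read as one ISO 8859-1 character, then escaped.
        String asLatin1 = new String(zhBytes, StandardCharsets.ISO_8859_1);
        System.out.println(toEscapes(asLatin1));

        // Step H: writing those four characters out as ISO 8859-1 restores the original bytes,
        // which a GBK console then displays as the original two Chinese characters.
        byte[] restored = asLatin1.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(zhBytes, restored));
    }
}
```

The first line printed is the same escaped text that step C wrote to i.txt, and the second line is true, which is consistent with step H recovering the original content.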
What these steps show is that -encoding always describes the natively encoded file, never the escaped one. In a plain native2ascii command, -encoding is the encoding in which the source file is read (step C). In a native2ascii -reverse command, -encoding is the encoding in which the (generated) target file is written (step H). This is a very important point. Remember it!

Let us continue exploring. Create a new file 12a.txt with the content "12axyz", and look at how purely alphanumeric text is handled.

I: Convert the purely alphanumeric text file 12a.txt to Unicode escapes

    native2ascii 12a.txt 12a_nv.txt

Open 12a_nv.txt; its content is "12axyz".

Continue the test and switch to the iso8859-1 encoding:

    C:\test>native2ascii -encoding iso8859-1 12a.txt
    12axyz

The result is still unchanged. From this we can conclude that for text files containing only digits and letters, the content is the same before and after conversion.

5. Summary:

native2ascii is a very handy tool for converting encodings, and the conversion is reversible. Its use is not limited to turning locally encoded text into ASCII: by converting with one encoding and reversing with another, it can serve as a general encoding conversion tool for text files. Depending on the direction of the conversion, the specified encoding is either the input file's encoding or the output file's encoding; see the Best practice section for details.
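The reversibility noted in the summary, and the observation from step I that plain letters and digits pass through unchanged, can be illustrated on strings. This is a simplified sketch, not the tool's source: the class Native2AsciiSketch and the methods toAscii and fromAscii are hypothetical names, and edge cases such as a literal backslash in front of the u are ignored.

```java
// Hypothetical illustration class; not part of the JDK or of native2ascii.
final class Native2AsciiSketch {

    // Forward direction: keep 7-bit ASCII, escape every other char with four hex digits.
    static String toAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // True if the four chars starting at index from are all hexadecimal digits.
    static boolean isHex4(String s, int from) {
        if (from + 4 > s.length()) {
            return false;
        }
        for (int i = from; i < from + 4; i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    // Reverse direction: turn each escape back into the single char it stands for.
    static String fromAscii(String escaped) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < escaped.length()) {
            if (escaped.startsWith("\\u", i) && isHex4(escaped, i + 2)) {
                out.append((char) Integer.parseInt(escaped.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(escaped.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The two characters stored in zh.txt, written as source-level escapes
        // so the example does not depend on the encoding of this source file.
        String original = "\u7194\u5ca9";

        String escaped = toAscii(original);
        System.out.println(escaped);                             // the escaped form seen in step B
        System.out.println(fromAscii(escaped).equals(original)); // round trip restores the text
        System.out.println(toAscii("12axyz"));                   // pure ASCII is left unchanged
    }
}
```

The sketch works on Java strings only. The real tool additionally decodes the input file and encodes the output file, which is where the -encoding option comes in.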