Java Puzzlers: Character Puzzles (Puzzle 18)

Source: Internet
Author: User

Puzzle 18: String Cheese

The following program creates a string from a sequence of bytes, then iterates over the characters in the string and prints them as numbers. Describe the sequence of numbers that the program prints:

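public class StringCheese {
    public static void main(String[] args) {
        // Fill the array with every possible byte value, 0 through 255
        byte[] bytes = new byte[256];
        for (int i = 0; i < 256; i++) {
            bytes[i] = (byte) i;
        }
        // Decode the bytes using the platform default charset
        String str = new String(bytes);
        // Print each char of the resulting string as an int
        for (int i = 0, n = str.length(); i < n; i++) {
            System.out.print((int) str.charAt(i) + " ");
        }
    }
}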

Let's analyze the program. The byte array is initialized with every possible byte value from 0 to 255, and the String constructor converts these byte values to char values. Each char value is then cast to an int and printed. The printed values are guaranteed to be non-negative because char is an unsigned type, so we might expect the program to print the integers 0 through 255 in order. Is that description correct?

It is not. If you run the program, you may well see the integers 0 through 255 printed in order, but on another machine or configuration you may see a different sequence. The authors of the book tested the program on four different machines and got four different sequences, including the one described above. The program is not even guaranteed to terminate normally, much less to print any particular sequence: its behavior is completely unspecified. The cause is the subject of this puzzle:

The String(byte[]) constructor is the culprit. Its API specification says: "Constructs a new String by decoding the specified array of bytes using the platform's default charset. The length of the new String is a function of the charset, and hence may not be equal to the length of the byte array. The behavior of this constructor when the given bytes are not valid in the default charset is unspecified."

So what exactly is a charset? Technically, the API describes it as "the combination of a coded character set and a character-encoding scheme". In plain terms, a charset is a package containing a set of characters, the numeric codes that represent them, and a way to translate back and forth between a sequence of character codes and a sequence of bytes. The translation scheme differs greatly between charsets: some have a one-to-one mapping between characters and bytes, but most do not. ISO-8859-1, more commonly known as Latin-1, is the only default charset that makes the program print the integers from 0 to 255 in order.
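To make the dependence on the charset concrete, here is a minimal sketch that decodes the puzzle's 256 bytes with three explicitly named charsets. The class name CharsetComparison and its helper methods are made up for this illustration. Note that the String(byte[], Charset) constructor used here replaces invalid bytes with a replacement character, so the sketch shows how the mappings differ, not the unspecified behavior of String(byte[]) itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetComparison {
    // Builds the same array as the puzzle: every byte value exactly once.
    static byte[] allBytes() {
        byte[] bytes = new byte[256];
        for (int i = 0; i < 256; i++) {
            bytes[i] = (byte) i;
        }
        return bytes;
    }

    // Decodes the bytes with an explicit charset and returns the char values.
    static int[] decode(byte[] bytes, Charset charset) {
        String str = new String(bytes, charset);
        int[] values = new int[str.length()];
        for (int i = 0; i < values.length; i++) {
            values[i] = str.charAt(i);
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] bytes = allBytes();
        Charset[] charsets = {
            StandardCharsets.ISO_8859_1,
            StandardCharsets.US_ASCII,
            StandardCharsets.UTF_8
        };
        for (Charset cs : charsets) {
            int[] values = decode(bytes, cs);
            // Report how many chars came out and what the last one is.
            System.out.println(cs + ": " + values.length
                    + " chars, last value = " + values[values.length - 1]);
        }
    }
}
```

Only the ISO-8859-1 line should report 256 chars ending in 255; the other two charsets cannot represent bytes above 127 as single characters, so their output differs.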

As an aside, here is a brief introduction to ISO-8859-1, which the book does not describe in detail:

ISO-8859-1, formally ISO/IEC 8859-1:1998 and also known as Latin-1 or "Western European", is the first 8-bit character set in the ISO/IEC 8859 series from the International Organization for Standardization. It is based on ASCII and uses the otherwise vacant range 0xA0 to 0xFF to add 96 letters and symbols for languages written in the Latin alphabet with diacritics. This is the basic introduction given on Wikipedia; see the Wikipedia article to learn more. Now back to the puzzle.

The default charset of the J2SE runtime environment (JRE) depends on the underlying operating system and locale. (Since JDK 18, the default is UTF-8 unless it is overridden.) If you want to know the default charset of your JRE and you are using release 5.0 or later, you can find out by calling java.nio.charset.Charset.defaultCharset(). If you are using an earlier release, you can find out by reading the system property "file.encoding". Fortunately, you are not forced to put up with the vagaries of default charsets. When converting between char sequences and byte sequences, you can and usually should specify a charset explicitly. A String constructor that takes a charset name in addition to the byte array exists for this purpose. If you replace the String constructor in the original program with the following one, the program is guaranteed to print the integers from 0 to 255 in order, regardless of the default charset:
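The two lookups mentioned above can be tried with a short sketch like the following. The class name DefaultCharsetCheck and the helper method are invented for this example, and the output depends entirely on your platform and JDK release.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Returns the canonical name of the platform default charset (Java 5.0+).
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Charset.defaultCharset(): " + defaultCharsetName());
        // Older approach: read the system property directly.
        System.out.println("file.encoding property:   "
                + System.getProperty("file.encoding"));
    }
}
```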

String str = new String(bytes, "ISO-8859-1");

This constructor is declared to throw UnsupportedEncodingException, so you must either catch the exception or, preferably, declare that the main method throws it; otherwise the program will not compile. In practice, the program never throws the exception. The specification for Charset requires every implementation of the Java platform to support certain charsets, and ISO-8859-1 is among them. The lesson is that whenever you convert a byte sequence to a String, you are using a charset, whether or not you specify it explicitly. If you want your program to behave predictably, specify the charset explicitly each time you use one.
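As a variation on the fix, and not the form the book uses, the sketch below passes a Charset constant from java.nio.charset.StandardCharsets (available since Java 7) to the String(byte[], Charset) constructor. That overload declares no checked exception, so nothing needs to be caught. The class name StringCheeseFixed and the helper decodeLatin1 are hypothetical names for this example.

```java
import java.nio.charset.StandardCharsets;

public class StringCheeseFixed {
    // Decodes bytes as ISO-8859-1, where byte value n maps to char value n.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[256];
        for (int i = 0; i < 256; i++) {
            bytes[i] = (byte) i;
        }
        String str = decodeLatin1(bytes);
        // Prints the char values separated by spaces, whatever the default charset is.
        for (int i = 0, n = str.length(); i < n; i++) {
            System.out.print((int) str.charAt(i) + " ");
        }
        System.out.println();
    }
}
```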
