A deep understanding of string types in Java

Source: Internet
Author: User

1. Java has built-in support for strings
Built-in support means that you do not have to implement strings through a char pointer as in C. Java strings also conform to the Unicode standard, so there is no need for two separate classes, as with string and wstring in C++, to cover both C compatibility and Unicode. In Java, strings are supported through the String class.
This means that we can call methods directly on a string literal, just as we would on a String object:


You can call all methods of a String object directly on "ABC":
int length = "ABC".length();
and in the same way on a String object:
String abc = new String("abc");
int length = abc.length();
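
As a small illustration of the point above, the sketch below calls a few String methods directly on a literal. The class name LiteralMethods and the helper isAbc are made up for this example; calling equals on the literal is a common idiom because it still works when the other value is null.

```java
public class LiteralMethods {
    // Hypothetical helper: calling equals on the literal avoids a
    // NullPointerException when input is null.
    static boolean isAbc(String input) {
        return "ABC".equals(input);
    }

    public static void main(String[] args) {
        System.out.println("ABC".length());      // 3
        System.out.println("ABC".toLowerCase()); // abc
        System.out.println("ABC".charAt(1));     // B
        System.out.println(isAbc(null));         // false
    }
}
```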

2. String values in Java are constant (immutable)

This means that a String cannot change its value once it has been created. The member methods of String confirm this: none of them can modify the value. String literals such as "abc", as well as the "def" in new String("def"), are stored in a constant pool inside the Java virtual machine.
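
A minimal sketch of what immutability means in practice, assuming nothing beyond the standard String API: methods such as toUpperCase and concat return a new String and leave the original untouched. The class name ImmutableDemo and the helper shout are invented for illustration.

```java
public class ImmutableDemo {
    // Hypothetical helper: returns an upper-cased copy with "!" appended;
    // the argument itself is never modified.
    static String shout(String s) {
        return s.toUpperCase() + "!";
    }

    public static void main(String[] args) {
        String original = "abc";
        String result = shout(original);
        System.out.println(original); // abc
        System.out.println(result);   // ABC!

        // "Changing" a variable only rebinds it to a different String object.
        String before = original;
        original = original.concat("def");
        System.out.println(before);   // abc
        System.out.println(original); // abcdef
    }
}
```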

In the following code, "ABC" is stored in the constant pool, so the variables a and ab both refer to the same "ABC" object in the pool. new String(...), in contrast, always creates a separate object, so comparing it with == yields false.


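public class StringTest {
    public static void main(String[] args) {
        String a = "ABC";
        String ab = "ABC";
        // Same characters, but a distinct object outside the constant pool
        String abc = new String("ABC");
        // Both literals refer to the same pooled object
        System.out.println(ab == a);
        // == compares references, not contents
        System.out.println(a == abc);
    }
}
/* Program output:
 * true
 * false
 */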
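
To complement the program above, here is a short sketch of the two usual ways to deal with this: equals compares contents, and intern returns the pooled instance for a string's contents. The class name InternDemo and the helper sameObject are assumptions made for this example.

```java
public class InternDemo {
    // Hypothetical helper: true only when both references point to the same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String literal = "ABC";
        String copy = new String("ABC");
        System.out.println(sameObject(literal, copy));          // false
        System.out.println(literal.equals(copy));               // true
        System.out.println(sameObject(literal, copy.intern())); // true
    }
}
```

In everyday code, equals is the right way to compare string contents; == is only meaningful when object identity is what you want to test.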

So how is a dynamically built, mutable string implemented? Java provides the StringBuffer and StringBuilder classes for this purpose. Strings can also be concatenated with the "+" operator, and internally the compiler can implement such a concatenation with the StringBuilder or StringBuffer class (a concatenation of two literals such as "ABC" + "Def" is simply folded into a single constant at compile time).

How, then, are StringBuilder and StringBuffer implemented internally? They store the string in an array. The source code that ships with the JDK shows that StringBuffer keeps its characters in a char array declared in its parent class, AbstractStringBuilder (since JDK 9 the backing array is a byte[], but the principle is the same).
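
The idea of an array-backed, growable string buffer can be sketched as follows. This is not the JDK source: SimpleCharBuffer is a hypothetical, heavily simplified class written only to illustrate the technique, and the initial capacity of 16 and the doubling growth policy are assumptions of this sketch.

```java
import java.util.Arrays;

// Hypothetical class for illustration only; this is not the JDK implementation.
public class SimpleCharBuffer {
    private char[] value = new char[16]; // backing storage
    private int count = 0;               // number of chars currently in use

    public SimpleCharBuffer append(String s) {
        int needed = count + s.length();
        if (needed > value.length) {
            // Grow to at least the needed size; doubling amortizes the copies.
            value = Arrays.copyOf(value, Math.max(needed, value.length * 2));
        }
        s.getChars(0, s.length(), value, count); // copy the chars into the array
        count = needed;
        return this;
    }

    public int length() {
        return count;
    }

    @Override
    public String toString() {
        // Only here is an immutable String created from the mutable array.
        return new String(value, 0, count);
    }

    public static void main(String[] args) {
        SimpleCharBuffer buf = new SimpleCharBuffer();
        buf.append("ABC").append("Def");
        System.out.println(buf);          // ABCDef
        System.out.println(buf.length()); // 6
    }
}
```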

3. Encoding issues with strings
Two questions need to be understood here: how is the encoding of strings in a source file handled, and which encoding is used for strings once they are compiled into a class file or held in the Java virtual machine at run time?
For the first question, the encoding of strings in the source code depends on your IDE or text editor. Consider source code that is saved in the GBK encoding and then opened by decoding it as GBK and as UTF-8:
Saved as GBK and opened as GBK: the text displays correctly.

Saved as GBK and opened as UTF-8: the text is garbled. If the system default encoding is not GBK in this situation, you need to pass the option -encoding GBK to javac at compile time.

So how do you deal with this source encoding problem? The answer is the -encoding option of the javac compiler, whose default value is the system default encoding. On Chinese-language Windows the default encoding is generally GBK (the value can be obtained with System.getProperty("file.encoding")). If the system default is GBK but the source code is saved as UTF-8, you should compile with javac -encoding UTF-8.
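
The garbling described above can be reproduced in memory. The following sketch, with the made-up class name EncodingMismatch and helper reinterpret, encodes a string with one charset and decodes the bytes with another. It assumes the GBK charset is available in the running JDK, which is why the demo checks Charset.isSupported first; the Chinese characters are written as Unicode escapes so that the example itself does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    // Hypothetical helper: encode text with one charset, decode the bytes with another.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters, escaped to keep the source ASCII
        System.out.println("default charset: " + Charset.defaultCharset());

        // GBK is an extended charset and may be missing from a trimmed-down runtime.
        if (Charset.isSupported("GBK")) {
            Charset gbk = Charset.forName("GBK");
            // Written as GBK, read as GBK: the text survives.
            System.out.println(text.equals(reinterpret(text, gbk, gbk)));                    // true
            // Written as GBK, read as UTF-8: the text is garbled.
            System.out.println(text.equals(reinterpret(text, gbk, StandardCharsets.UTF_8))); // false
        }
    }
}
```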

Now for the second question: which encoding is used for strings once they are compiled into a class file or held in the Java virtual machine at run time? First, the String type in Java is implemented on top of UTF-16, so strings inside the Java virtual machine are UTF-16 regardless of the encoding of the source file. As long as the javac compiler interprets the encoding of the source file correctly, the strings at run time and in the class file are independent of the encoding used in the source (in the class file itself, string constants are stored in a modified UTF-8 form). The same holds for the char primitive type and the Character class: both use the same UTF-16 based encoding as String, so a char is 16 bits long whether it holds 'a', '1' or a Chinese character. Characters outside the Basic Multilingual Plane are represented by two char values, a surrogate pair.
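
The 16-bit unit size can be checked directly. In the sketch below (class name CharWidthDemo and helpers units and codePoints are invented for illustration), length() counts 16-bit char units while codePointCount counts Unicode code points, so the two differ only for characters that need a surrogate pair.

```java
public class CharWidthDemo {
    // Number of 16-bit char units, which is what String.length() reports.
    static int units(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        System.out.println(Character.SIZE);    // 16
        System.out.println(units("a"));        // 1
        System.out.println(units("\u4e2d"));   // 1 (a Chinese character in the BMP)

        String emoji = "\uD83D\uDE00";         // U+1F600, stored as a surrogate pair
        System.out.println(units(emoji));      // 2
        System.out.println(codePoints(emoji)); // 1
    }
}
```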

In addition, the String type can convert between the underlying binary representation and a string when a character encoding is specified. This means that we can correctly read GBK-encoded, UTF-8-encoded, or otherwise encoded text files and other input streams and convert them into correct strings in memory.

For example, the String class has the following methods:
public String(byte[] bytes, Charset charset); Constructs a string from a byte array (each byte is 8 bits long) using the specified character set.
public byte[] getBytes(Charset charset); Converts the string to a byte array, that is, the binary representation of the string, using the specified character set.
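
A minimal round-trip sketch using these two methods. The class name BytesRoundTrip and the helper byteLength are assumptions for this example; only charsets from StandardCharsets are used because every Java platform is required to support them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BytesRoundTrip {
    // Hypothetical helper: how many bytes the string occupies in the given charset.
    static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "\u4e2d"; // one Chinese character, escaped to keep the source ASCII

        // The same string has different binary representations per charset.
        System.out.println(byteLength(s, StandardCharsets.UTF_8));    // 3
        System.out.println(byteLength(s, StandardCharsets.UTF_16BE)); // 2

        // Decoding with the charset used for encoding restores the string.
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(s.equals(new String(utf8, StandardCharsets.UTF_8))); // true
    }
}
```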

You also need to be aware of another member method of String:

public byte[] getBytes(); The byte array returned by this method is encoded with the platform default character set, which is not necessarily UTF-16.
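
The sketch below illustrates that behavior: the no-argument getBytes() gives the same bytes as passing Charset.defaultCharset() explicitly, so its result varies from machine to machine. The class name DefaultCharsetBytes and the helper matchesDefault are made up for this example, and no byte count is shown because it depends on the platform.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class DefaultCharsetBytes {
    // Hypothetical helper: checks that getBytes() matches an explicit
    // encoding with the platform default charset.
    static boolean matchesDefault(String s) {
        return Arrays.equals(s.getBytes(), s.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        String s = "abc\u4e2d";
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println(matchesDefault(s)); // true
        // The length printed here depends on the default charset of the machine.
        System.out.println(s.getBytes().length);
    }
}
```

For data that is written to a file or sent over the network, passing an explicit charset is usually the safer choice, since it does not depend on the machine the code runs on.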
