The previous section described Character, the wrapper class for a single character. This section covers the String class. String manipulation is probably the most common operation in a computer program, and Java represents strings with the String class, which we look at in detail here.
Basic use of strings is simple and straightforward, so let's start there.
Basic usage
You can initialize a String variable from a string constant:
String name = "Lao Ma says programming";
You can also create a String with new:
String name = new String("Lao Ma says programming");
String supports the + and += operators directly, for example:
String name = "Lao Ma";
name += " says programming";
String description = ", exploring the nature of programming";
System.out.println(name + description);
The output is: Lao Ma says programming, exploring the nature of programming
The String class includes many methods that make strings easier to work with.
Check whether a string is empty
public boolean isEmpty()
Get the length of a string
public int length()
Take a substring
public String substring(int beginIndex)
public String substring(int beginIndex, int endIndex)
Find a character or substring in a string; returns the index of the first match, or -1 if not found
public int indexOf(int ch)
public int indexOf(String str)
Search for a character or substring from the end; returns the index of the last match, or -1 if not found
public int lastIndexOf(int ch)
public int lastIndexOf(String str)
Check whether the string contains the specified sequence of characters. Recall that CharSequence is an interface, and String implements it
public boolean contains(CharSequence s)
Check whether a string begins with a given substring
public boolean startsWith(String prefix)
Check whether a string ends with a given substring
public boolean endsWith(String suffix)
Compare with another string to see whether the content is the same
public boolean equals(Object anObject)
Compare with another string, ignoring case, to see whether the content is the same
public boolean equalsIgnoreCase(String anotherString)
String also implements the Comparable interface, so strings can be compared for order
public int compareTo(String anotherString)
Strings can also be compared for order while ignoring case
public int compareToIgnoreCase(String str)
Convert all characters to uppercase; returns a new string, the original string is unchanged
public String toUpperCase()
Convert all characters to lowercase; returns a new string, the original string is unchanged
public String toLowerCase()
Concatenate strings; returns the current string with the argument string appended, the original string is unchanged
public String concat(String str)
Replace a single character; returns a new string, the original string is unchanged
public String replace(char oldChar, char newChar)
Replace a sequence of characters; returns a new string, the original string is unchanged
public String replace(CharSequence target, CharSequence replacement)
Remove leading and trailing whitespace; returns a new string, the original string is unchanged
public String trim()
Split a string; returns an array of the delimited substrings, the original string is unchanged
public String[] split(String regex)
For example, to split "Hello,world" on the comma:
String str = "Hello,world";
String[] arr = str.split(",");
arr[0] is "Hello" and arr[1] is "world".
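To make the list above concrete, here is a small sketch that exercises several of these methods together. The class name StringBasicsDemo and the helper afterLast are made up for illustration and are not part of the JDK.

```java
import java.util.Arrays;

class StringBasicsDemo {
    // Hypothetical helper: returns the text after the last occurrence of sep,
    // or the whole string when sep does not occur.
    static String afterLast(String s, char sep) {
        int idx = s.lastIndexOf(sep);            // -1 when not found
        return idx < 0 ? s : s.substring(idx + 1);
    }

    public static void main(String[] args) {
        String s = "  Hello,world  ";
        String t = s.trim();                                     // "Hello,world"
        System.out.println(t.length());                          // 11
        System.out.println(t.indexOf(','));                      // 5
        System.out.println(t.substring(6));                      // world
        System.out.println(t.substring(0, 5));                   // Hello
        System.out.println(t.startsWith("Hello"));               // true
        System.out.println(t.equalsIgnoreCase("HELLO,WORLD"));   // true
        System.out.println(t.replace('o', '0'));                 // Hell0,w0rld
        System.out.println(Arrays.toString(t.split(",")));       // [Hello, world]
        System.out.println(afterLast("a/b/c.txt", '/'));         // c.txt
        System.out.println(s.length());                          // 15, s itself is unchanged
    }
}
```

Note that every call returns a new value and leaves s and t as they were.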
Now that we have seen the basic usage of String from the caller's point of view, let's take a closer look at what is inside.
Inside String
Encapsulating a character array
Inside the String class, a character array represents the string. The instance variable is defined as:
private final char value[];
String has two constructors that create a string from a char array:
public String(char value[])
public String(char value[], int offset, int count)
Note that String creates a new array from the argument and copies the contents; it does not use the argument's character array directly.
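The copying behavior is easy to check. The following sketch (the class and method names are ours, chosen for illustration) builds a string from an array and then modifies the array:

```java
class CharArrayCopyDemo {
    // Builds a String from a char array, then mutates the array.
    static String buildThenMutate() {
        char[] chars = {'j', 'a', 'v', 'a'};
        String s = new String(chars);   // the constructor copies the contents
        chars[0] = 'J';                 // changing the source array afterwards...
        return s;                       // ...does not affect the string
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate());         // java

        char[] chars = {'h', 'e', 'l', 'l', 'o'};
        System.out.println(new String(chars, 1, 3));   // ell (offset 1, count 3)
    }
}
```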
Most methods in String also operate on this character array internally. For example:
- The length() method returns the length of the array
- The substring() method calls the constructor String(char value[], int offset, int count) to create a new string from the arguments
- indexOf() searches this array for the character or substring
These implementations are mostly straightforward, so we will not go through them.
String also has methods that relate directly to this char array:
Return the char at the specified index
public char charAt(int index)
Return a char array corresponding to the string
public char[] toCharArray()
Note that a copy of the array is returned, not the original array.
Copy the characters in a specified range of the string into the destination array, starting at the specified position
public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin)
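As a quick illustration of these three methods, here is a sketch with names of our own choosing. The reversed helper modifies the copy returned by toCharArray(), which leaves the original string alone:

```java
import java.util.Arrays;

class CharAccessDemo {
    // Returns the string reversed, working on the copy from toCharArray().
    static String reversed(String s) {
        char[] chars = s.toCharArray();
        for (int i = 0, j = chars.length - 1; i < j; i++, j--) {
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        String s = "hello";
        System.out.println(s.charAt(1));       // e
        System.out.println(reversed(s));       // olleh
        System.out.println(s);                 // hello (unchanged)

        char[] dst = new char[5];
        Arrays.fill(dst, '-');
        s.getChars(1, 4, dst, 2);              // copies "ell" into dst[2..4]
        System.out.println(new String(dst));   // --ell
    }
}
```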
Handling characters by code point
Like Character, the String class provides methods for handling a string by code point.
public int codePointAt(int index)
public int codePointBefore(int index)
public int codePointCount(int beginIndex, int endIndex)
public int offsetByCodePoints(int index, int codePointOffset)
These methods are very similar to the ones described in the section on Character, so we will not dwell on them here.
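Still, a short sketch may help show the difference between counting chars and counting code points. We assume U+1F600 as the example supplementary character; the class and helper names are ours.

```java
class CodePointDemo {
    // Counts Unicode code points rather than char units.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F600 is a supplementary character, stored as two chars (a surrogate pair).
        String s = "a" + new String(Character.toChars(0x1F600)) + "b";
        System.out.println(s.length());                              // 4
        System.out.println(codePoints(s));                           // 3
        System.out.println(Integer.toHexString(s.codePointAt(1)));   // 1f600
        System.out.println(s.offsetByCodePoints(0, 2));              // 3
    }
}
```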
Encoding conversion
Internally, String handles characters as UTF-16BE: a BMP character takes one char (two bytes), and a supplementary character takes two chars (four bytes). Section 6 introduced various encodings, which may cover different character sets, use different numbers of bytes, and have different binary representations. How do we deal with these encodings? How are they converted to and from Java's internal representation?
Java uses the Charset class to represent the various encodings. It has two commonly used static methods:
public static Charset defaultCharset()
public static Charset forName(String charsetName)
The first method returns the system default encoding. For example, on my computer, executing the following statement:
System.out.println(Charset.defaultCharset().name());
prints UTF-8.
The second method returns the Charset object for a given encoding name. Corresponding to the encodings introduced in Section 6, the charset name can be US-ASCII, ISO-8859-1, windows-1252, GB2312, GBK, GB18030, Big5, or UTF-8, for example:
Charset charset = Charset.forName("GB18030");
The String class provides the following methods, which return the bytes that represent the string in a given encoding:
public byte[] getBytes()
public byte[] getBytes(String charsetName)
public byte[] getBytes(Charset charset)
The first method takes no encoding argument and uses the system default encoding, the second takes an encoding name, and the third takes a Charset.
The String class also has the following constructors, which create a string from bytes and an encoding, that is, they build Java's internal representation from the byte representation in a given encoding.
public String(byte bytes[])
public String(byte bytes[], int offset, int length)
public String(byte bytes[], int offset, int length, String charsetName)
public String(byte bytes[], int offset, int length, Charset charset)
public String(byte bytes[], String charsetName)
public String(byte bytes[], Charset charset)
Besides converting through the methods in String, the Charset class has its own encoding and decoding methods, which this section does not cover. The important point is that Java's internal representation differs from the various encodings, but the two can be converted to each other.
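Here is a minimal sketch of a round trip between a string and its bytes. It assumes Java 7 or later for the StandardCharsets constants; the class name and the roundTrip helper are ours, for illustration only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Encodes s with the given charset and decodes the bytes back.
    static String roundTrip(String s, Charset charset) {
        byte[] bytes = s.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String s = "\u9a6c"; // the character 马 (U+9A6C)
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);          // 3
        System.out.println(s.getBytes(StandardCharsets.UTF_16BE).length);       // 2
        System.out.println(roundTrip(s, Charset.forName("UTF-8")).equals(s));   // true

        // Decoding with a different charset than the one used to encode garbles the text:
        // each of the three UTF-8 bytes becomes its own ISO-8859-1 character.
        String wrong = new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(wrong.length());                                     // 3
    }
}
```

Passing a Charset explicitly, rather than relying on the system default, keeps the result the same on every machine.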
Immutability
Like the wrapper classes, the String class is immutable: once an object is created, there is no way to modify it. The String class is declared final and cannot be subclassed, and the internal char array value is also final, so it cannot be reassigned after initialization.
String provides many methods that look like they modify the string, but they actually work by creating a new String object, and the original object is not modified. For example, let's look at the code of the concat() method:
public String concat(String str) {
    int otherLen = str.length();
    if (otherLen == 0) {
        return this;
    }
    int len = value.length;
    char buf[] = Arrays.copyOf(value, len + otherLen);
    str.getChars(buf, len);
    return new String(buf, true);
}
The Arrays.copyOf method creates a new character array and copies the original content into it, and then a new String is created with new. We cover the Arrays class in more detail in later chapters.
As with the wrapper classes, being immutable makes programs simpler, safer, and easier to understand. However, if you modify a string frequently and every modification creates a new string, performance suffers. In that case, consider the other two classes Java provides, StringBuilder and StringBuffer, which we cover in the next section.
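A short sketch shows immutability from the caller's side. The class name and the shout helper are hypothetical.

```java
class ImmutabilityDemo {
    // Chains two "modifying" calls; each one returns a new String.
    static String shout(String s) {
        return s.toUpperCase().concat("!");
    }

    public static void main(String[] args) {
        String s = "hello";
        String t = shout(s);
        System.out.println(s);                   // hello (the original is untouched)
        System.out.println(t);                   // HELLO!
        System.out.println(s.concat("") == s);   // true: an empty argument returns this
    }
}
```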
String constants
String constants in Java are special. Besides being directly assignable to a String variable, a constant is itself a String object and can call String's methods directly. Let's look at the code:
System.out.println("Lao Ma says programming".length());
System.out.println("Lao Ma says programming".contains("Lao Ma"));
System.out.println("Lao Ma says programming".indexOf("programming"));
In fact, these constants are objects of type String. In memory they are kept in a shared place called the string constant pool, which holds all constant strings. Each constant is stored only once and is shared by all users. When a string is used in constant form, the corresponding String object in the constant pool is used.
For example, let's look at this code:
String name1 = "Lao Ma says programming";
String name2 = "Lao Ma says programming";
System.out.println(name1 == name2);
The output is true. Why? We can think of "Lao Ma says programming" as having a corresponding String object in the constant pool. If we assume that object is named laoma, the code above is roughly equivalent to:
String laoma = new String(new char[]{
    'L', 'a', 'o', ' ', 'M', 'a', ' ', 's', 'a', 'y', 's', ' ',
    'p', 'r', 'o', 'g', 'r', 'a', 'm', 'm', 'i', 'n', 'g'});
String name1 = laoma;
String name2 = laoma;
System.out.println(name1 == name2);
There is actually only one String object, and three variables point to it, so name1 == name2 is naturally true.
Note that if you do not assign the constant directly but instead create the string with new, == does not return true. See the following code:
String name1 = new String("Lao Ma says programming");
String name2 = new String("Lao Ma says programming");
System.out.println(name1 == name2);
The output is false. Why? The code above is roughly equivalent to:
String laoma = new String(new char[]{
    'L', 'a', 'o', ' ', 'M', 'a', ' ', 's', 'a', 'y', 's', ' ',
    'p', 'r', 'o', 'g', 'r', 'a', 'm', 'm', 'i', 'n', 'g'});
String name1 = new String(laoma);
String name2 = new String(laoma);
System.out.println(name1 == name2);
The code of the String constructor that takes a String argument is as follows:
public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}
hash is another instance variable in the String class that caches the hashCode value; we cover it below.
As you can see, name1 and name2 point to two different String objects, but the value fields inside the two objects point to the same char array. In memory, that is two String objects sharing one char array.
So name1 == name2 is false, but name1.equals(name2) is true.
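The difference between == and equals can be checked directly. In this sketch the class name and the sameObject helper are ours, used only to label the comparison:

```java
class StringIdentityDemo {
    // Reference comparison: true only when both variables point to the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String a = "Lao Ma";
        String b = "Lao Ma";                 // same constant, same pooled object
        String c = new String("Lao Ma");     // a separate object with equal content
        System.out.println(sameObject(a, b));   // true
        System.out.println(sameObject(a, c));   // false
        System.out.println(a.equals(c));        // true
    }
}
```

In everyday code, compare string content with equals rather than ==.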
hashCode
We just mentioned the instance variable hash, which is defined as follows:
private int hash; // Default to 0
It caches the value of the hashCode() method: the first call to hashCode() stores the result in hash, and later calls return the saved value directly.
Let's look at the hashCode method of the String class:
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}
If the cached hash is not 0, it is returned directly. Otherwise the hash is computed from the contents of the character array using this formula:
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
Here s is the string, s[0] is the first character, n is the length of the string, and s[0]*31^(n-1) is 31 raised to the power n-1, multiplied by the value of the first character.
Why compute it this way? In this formula the hash depends on the value of every character, and each position is multiplied by a different factor, so the hash also depends on the position of each character. The number 31 is probably used for two reasons. First, it produces well-spread hashes, that is, different strings generally have different hash values. Second, it is efficient to compute: 31*h is equivalent to 32*h-h, that is, (h<<5)-h, so a cheaper shift and subtraction can replace the multiplication.
In Java, hashCode is commonly implemented along these lines.
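To check the formula and the shift trick, the following sketch recomputes the hash by hand and compares it with String.hashCode(). The class and method names are ours, and the loop is a simplified version without the caching.

```java
class HashDemo {
    // Recomputes the polynomial hash h = 31 * h + c over all chars,
    // written with the shift-and-subtract form of 31 * h.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = (h << 5) - h + s.charAt(i);   // same as 31 * h + s.charAt(i)
        }
        return h;
    }

    public static void main(String[] args) {
        // 'a' is 97 and 'b' is 98, so the hash is 97 * 31 + 98 = 3105.
        System.out.println(polynomialHash("ab"));   // 3105
        System.out.println("ab".hashCode());        // 3105
    }
}
```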
Regular expressions
Some methods in the String class accept a regular expression rather than an ordinary string argument. What is a regular expression? You can think of it as a string that expresses a rule, generally used for text matching, searching, and replacing. Regular expressions are rich and powerful, and a fairly large topic, which we introduce separately in later chapters.
Java has dedicated classes for regular expressions, such as Pattern and Matcher, but for simple cases the String class offers a more concise way. The String methods that accept a regular expression are:
Split a string
public String[] split(String regex)
Check whether the string matches
public boolean matches(String regex)
Replace within a string
public String replaceFirst(String regex, String replacement)
public String replaceAll(String regex, String replacement)
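As a brief taste before the dedicated chapters, here is a sketch of these methods with a few simple patterns. The class name and the normalizeSpaces helper are made up for illustration.

```java
import java.util.Arrays;

class RegexMethodsDemo {
    // Collapses each run of whitespace into a single space.
    static String normalizeSpaces(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString("a1b22c333d".split("\\d+")));   // [a, b, c, d]
        System.out.println("2024".matches("\\d{4}"));                      // true
        System.out.println("2024-01".matches("\\d{4}"));                   // false: the whole string must match
        System.out.println("a-b-c".replaceFirst("-", "+"));                // a+b-c
        System.out.println(normalizeSpaces("  Lao   Ma  says "));          // Lao Ma says
    }
}
```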
Summary
In this section we introduced the String class: its basic usage, internal implementation, and encoding conversion, along with its immutability, string constants, and the hashCode implementation.
We also noted that String is inefficient when a string is modified frequently, and mentioned the StringBuilder and StringBuffer classes. Strings can be manipulated directly with + and +=, and behind those operators is the StringBuilder class.
Let's take a look at these two classes in the next section.