Java escape characters:
1. Octal escape sequences: \ followed by 1 to 3 octal digits; range: '\000' ~ '\377'
\0: NULL character
2. Unicode escape characters: \u followed by four hexadecimal digits; range: 0 ~ 65535
\u0000: NULL character
3. Special characters: three
\": double quotation mark
\': single quote
\\: backslash
4. Control characters: five
\r: carriage return
\n: line feed
\f: form feed
\t: horizontal tab
\b: backspace
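The lists above can be checked with a small program. This is only a sketch: the class name EscapeValues and the helper code() are made up for illustration, and the program prints the numeric code behind each escape.

```java
class EscapeValues {
    // Widens a char to its numeric code so it can be printed or compared.
    // (Hypothetical helper, not part of any library.)
    static int code(char c) {
        return c;
    }

    public static void main(String[] args) {
        // Special characters
        System.out.println(code('\"'));      // 34
        System.out.println(code('\''));      // 39
        System.out.println(code('\\'));      // 92
        // Control characters
        System.out.println(code('\r'));      // 13
        System.out.println(code('\n'));      // 10
        System.out.println(code('\f'));      // 12
        System.out.println(code('\t'));      // 9
        System.out.println(code('\b'));      // 8
        // The NULL character, written as an octal and as a Unicode escape
        System.out.println(code('\0'));      // 0
        System.out.println(code('\u0000'));  // 0
    }
}
```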
Escape of the period: . ==> \u002e
Escape of the dollar sign: $ ==> \u0024
Escape of the caret: ^ ==> \u005e
Escape of the left brace: { ==> \u007b
Escape of the left square bracket: [ ==> \u005b
Escape of the left parenthesis: ( ==> \u0028
Escape of the vertical bar: | ==> \u007c
Escape of the right parenthesis: ) ==> \u0029
Escape of the asterisk: * ==> \u002a
Escape of the plus sign: + ==> \u002b
Escape of the question mark: ? ==> \u003f
Escape of the backslash: \ ==> \u005c
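All of the characters in this list have a special meaning in regular expressions, so one plausible use of the codes (an assumption; the list itself does not say so) is to pass them to String.split with a doubled backslash. The regex engine then reads the escape as a literal character. The class and method names below are hypothetical.

```java
import java.util.Arrays;

class MetaCharSplit {
    // Doubled backslash: the regex engine receives the six-character escape
    // and treats it as a literal period.
    static String[] splitOnPeriod(String s) {
        return s.split("\\u002e");
    }

    // Same idea for the vertical bar.
    static String[] splitOnBar(String s) {
        return s.split("\\u007c");
    }

    // Single backslash: the compiler turns the escape into a bare period
    // before the string exists, so the regex engine sees "any character".
    static String[] splitOnAnyChar(String s) {
        return s.split("\u002e");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnPeriod("a.b.c")));   // [a, b, c]
        System.out.println(Arrays.toString(splitOnBar("x|y|z")));      // [x, y, z]
        System.out.println(Arrays.toString(splitOnAnyChar("a.b.c")));  // []
    }
}
```

The last call returns an empty array because every character matches the separator and split discards trailing empty strings.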
========================================================== ====================================
The following program uses two Unicode escapes, which represent Unicode characters by their hexadecimal codes. So what does the program print?
Java code
public class EscapeRout {
    public static void main(String[] args) {
        // \u0022 is the Unicode escape for the double quotation mark (")
        System.out.println("a\u0022.length() + \u0022b".length());
    }
}
A superficial analysis suggests that the program should print 26, because there are 26 characters between the two double quotation marks in "a\u0022.length() + \u0022b".
A slightly deeper analysis suggests that the program should print 16, because each of the two Unicode escapes takes six characters in the source file but represents only one character in the string. The string would therefore be 10 characters shorter than it appears. If you run the program, you will find that this is far from the case: it prints neither 26 nor 16, but 2.
The key to understanding this puzzle is that Java gives Unicode escapes no special treatment inside string literals. The compiler translates Unicode escapes into the characters they represent before it parses the program into tokens [JLS 3.2]. Therefore, the first Unicode escape in the program closes a one-character string literal ("a"), and the second opens another one-character string literal ("b"). The program prints the value of the expression "a".length() + "b".length(), that is, 2.
If the author of the program really wanted this behavior, the following statement would be much clearer:
Java code
System.out.println("a".length() + "b".length());
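To confirm that the two forms really are the same expression, they can be placed side by side. This is a minimal sketch; the class name EscapeRoutCheck and its method names are invented for illustration.

```java
class EscapeRoutCheck {
    // The two Unicode escapes become real quotation marks before the compiler
    // forms tokens, so this expression contains two string literals.
    static int withUnicodeEscapes() {
        return "a\u0022.length() + \u0022b".length();
    }

    // The same expression written out directly.
    static int writtenOut() {
        return "a".length() + "b".length();
    }

    public static void main(String[] args) {
        System.out.println(withUnicodeEscapes()); // 2
        System.out.println(writtenOut());         // 2
    }
}
```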
More likely, the author wanted to put two double quotation marks inside the string literal. You cannot do that with Unicode escapes, but you can with an escape sequence [JLS 3.10.6]. The escape sequence for a double quotation mark is a backslash followed by a double quotation mark (\"). If you replace the Unicode escapes in the original program with this escape sequence, it prints 16 as expected. (The count is 16 when the spaces around the plus sign are kept inside the string; without those two spaces it would be 14.):
Java code
System.out.println("a\".length() + \"b".length());
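The character count is easy to get wrong by hand, so here is a small check of it. The class name EscapedQuotes and both method names are hypothetical; the second method only shows how the count changes when the spaces around the plus sign are left out of the string.

```java
class EscapedQuotes {
    // The string from the statement above: each \" contributes one character.
    static String withSpaces() {
        return "a\".length() + \"b";
    }

    // The same string typed without the spaces around the plus sign.
    static String withoutSpaces() {
        return "a\".length()+\"b";
    }

    public static void main(String[] args) {
        System.out.println(withSpaces());             // a".length() + "b
        System.out.println(withSpaces().length());    // 16
        System.out.println(withoutSpaces().length()); // 14
    }
}
```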
Many characters have escape sequences of their own, including the single quote (\'), the line feed (\n), the tab (\t), and the backslash (\\). You can use escape sequences in both character literals and string literals.
In fact, you can put any ASCII character into a string literal or a character literal by using a special kind of escape sequence called an octal escape. However, it is best to use the ordinary escape sequences wherever possible.
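A short sketch of how octal escapes behave, including one easy mistake: a three-digit octal escape may only start with a digit from 0 to 3, so "\400" is read as the escape \40 (a space) followed by the digit 0. The class and method names are made up for this example.

```java
class OctalEscapes {
    // Octal 101 is decimal 65, the letter A.
    static char letterA() {
        return '\101';
    }

    // The largest value an octal escape can express.
    static int largest() {
        return '\377';
    }

    // Two characters, not one: the escape for a space, then the digit 0.
    static String notFourHundred() {
        return "\400";
    }

    public static void main(String[] args) {
        System.out.println(letterA());                       // A
        System.out.println(largest());                       // 255
        System.out.println(notFourHundred().length());       // 2
        System.out.println((int) notFourHundred().charAt(0)); // 32
    }
}
```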
Both ordinary escape sequences and octal escapes are far preferable to Unicode escapes because, unlike Unicode escapes, escape sequences are processed after the program has been parsed into tokens.
ASCII is the lowest common denominator of character sets. It contains only 128 characters, but Unicode contains more than 65,000 characters. A Unicode escape can be used to insert a Unicode character into a program that uses only ASCII characters. A Unicode escape is exactly equivalent to the character it represents.
Unicode escapes are designed for cases where a programmer needs to insert a character that cannot be represented in the character set of the source file. They are mainly used to put non-ASCII characters into identifiers, string literals, character literals, and comments. Occasionally, a Unicode escape is also used to identify one of several similar-looking characters unambiguously, which makes the program clearer.
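As an illustration of this intended use, the sketch below keeps the source file pure ASCII while using the letter e with an acute accent (code 233, hexadecimal e9) in both an identifier and a string literal. The class, field, and method names are hypothetical.

```java
class UnicodeEscapeUses {
    // The field name is "caf" followed by the accented letter.
    static int caf\u00e9 = 3;

    // Reads the field back through the same escaped identifier.
    static int readField() {
        return caf\u00e9;
    }

    // A four-character string whose last character is non-ASCII.
    static String accented() {
        return "caf\u00e9";
    }

    public static void main(String[] args) {
        System.out.println(readField());                 // 3
        System.out.println(accented().length());         // 4
        System.out.println((int) accented().charAt(3));  // 233
    }
}
```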
In short, prefer escape sequences to Unicode escapes in string and character literals. Unicode escapes can be confusing because they are processed so early in the compilation sequence. Do not use Unicode escapes to represent ASCII characters. Inside string and character literals, use escape sequences; outside these literals, insert ASCII characters directly into the source file.