For Windows Notepad (on a Chinese system):
ANSI: GB2312; in Java, decode with GBK
Unicode: UTF-16LE with a signature; in Java, decode with UTF-16
Unicode big endian: UTF-16BE with a signature; in Java, decode with UTF-16
UTF-8: UTF-8 with a signature; in Java, you have to remove the signature manually and then decode with UTF-8
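The manual signature removal for Notepad's UTF-8 files could look roughly like the sketch below. The class and method names (Utf8BomStripper, decodeUtf8) are made up for illustration; the only assumption is that the whole file has already been read into a byte array.

```java
import java.nio.charset.StandardCharsets;

final class Utf8BomStripper {
    /** Decodes bytes as UTF-8, skipping a leading EF BB BF signature if one is present. */
    static String decodeUtf8(byte[] data) {
        int offset = 0;
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            offset = 3; // skip the three signature bytes
        }
        return new String(data, offset, data.length - offset, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "hi" as Notepad would save it with the UTF-8 option: signature + text
        byte[] signed = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(decodeUtf8(signed).length()); // prints 2
    }
}
```

Without the check, the decoded string would start with the character U+FEFF, because Java's UTF-8 decoder does not drop the signature by itself.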
For decoding in Java programs:
GBK: GBK is a superset of GB2312, so use GBK to decode both GBK and GB2312.
UTF-8: UTF-8 without a signature
UTF-16: UTF-16LE or UTF-16BE with a signature; the byte order is identified automatically from the signature
UTF-16BE: UTF-16BE without a signature
UTF-16LE: UTF-16LE without a signature
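To see the difference between UTF-16 and UTF-16LE in Java, here is a small sketch. The class and method names (Utf16Demo, decodeAuto, decodeLittleEndian) are hypothetical; the charsets come from java.nio.charset.StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

final class Utf16Demo {
    /** Decodes with the UTF-16 charset, which reads the signature to pick the byte order. */
    static String decodeAuto(byte[] data) {
        return new String(data, StandardCharsets.UTF_16);
    }

    /** Decodes with UTF-16LE, which expects no signature and keeps one as U+FEFF. */
    static String decodeLittleEndian(byte[] data) {
        return new String(data, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // "A" as Notepad would save it with the Unicode option: FF FE + 41 00
        byte[] signedLe = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        System.out.println(decodeAuto(signedLe).length());         // prints 1
        System.out.println(decodeLittleEndian(signedLe).length()); // prints 2
    }
}
```

The second length is 2 because the signature survives as a leading U+FEFF character when the data is decoded as UTF-16LE.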
Unicode is actually a character set rather than an encoding, so Notepad's use of the name "Unicode" for UTF-16LE with a signature is imprecise.
The name "ANSI" is imprecise as well: in Notepad it maps to different encodings on different systems, GB2312 on Chinese systems and
ASCII on English systems (strictly, an ASCII-compatible code page such as Windows-1252).
The signature is also called the BOM (byte order mark). It is a special marker inserted at the beginning of a Unicode file encoded as
UTF-8, UTF-16 or UTF-32 to identify the file's encoding. For UTF-8 the BOM is not necessary, because the BOM exists to mark
the encoding and byte order (big-endian or little-endian) of files whose code units span several bytes, and UTF-8 has only one possible byte order.
BOM file headers:
00 00 FE FF = UTF-32, big-endian
FF FE 00 00 = UTF-32, little-endian
EF BB BF = UTF-8
FE FF = UTF-16, big-endian
FF FE = UTF-16, little-endian
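The table above can be turned into a simple detection routine, sketched below. The class name BomSniffer, the method names and the returned labels are all hypothetical. The UTF-32 checks come first because the UTF-32 little-endian header begins with the same two bytes as the UTF-16 little-endian one.

```java
final class BomSniffer {
    /** Returns a label for the signature at the start of data, or null if none is found. */
    static String detect(byte[] data) {
        // The four-byte headers must be tested before the two-byte ones.
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        return null; // no signature: the encoding has to be known some other way
    }

    /** Compares the leading bytes of data, read as unsigned values, with prefix. */
    static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] notepadUnicode = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        System.out.println(detect(notepadUnicode)); // prints UTF-16LE
    }
}
```

This is only a heuristic: a UTF-16LE file whose first character is U+0000 also starts with FF FE 00 00 and would be reported as UTF-32 here.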
Signatures, BOM headers, encodings, Windows Notepad encodings, and Java encoding and decoding