When Java reads a string from a byte stream, it converts the platform-specific bytes into a platform-independent Unicode string. On output, Java converts the Unicode string back into a byte stream in the platform's encoding. If a Unicode character does not exist in the platform's encoding, it is output as '?'. For example, on Japanese Windows, when Java reads a "shift_jis" encoded file (or any other stream) into memory to construct a String object, the "shift_jis" encoded text is converted to a Unicode string. When that string is output, it is converted back to a "shift_jis" byte stream or byte array: "新規作成" -----> "\u65b0\u898f\u4f5c\u6210" -----> "新規作成".
Because Java 2 can only process ASCII-based properties files, any non-ASCII text in such a file must be converted to ASCII escape codes. For example, if you open the properties file with Notepad, EmEditor, or EditPlus, you will see a series of ASCII escape strings:
# -- Application --
msg.common.complete.process={0}\u304c\u5b8c\u4e86\u3057\u307e\u3057\u305f\u3002
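For contrast, here is a minimal Java sketch of the behavior described above: the byte/Unicode round trip, the '?' substitution, and the way java.util.Properties decodes the escapes when it loads the file. The class and method names (JavaSideDemo, roundTrip, toEscapes, loadValue) are hypothetical, the lowercase key is my reading of the listing, and the sketch assumes the runtime ships the Shift_JIS charset, as standard JDK builds do.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Properties;

class JavaSideDemo {
    // Encode text to platform bytes, then decode those bytes back to a String.
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        byte[] platformBytes = text.getBytes(cs);
        return new String(platformBytes, cs);
    }

    // Show each UTF-16 code unit as backslash + 'u' + four hex digits.
    static String toEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            sb.append(String.format("\\u%04x", (int) c));
        }
        return sb.toString();
    }

    // Load properties text and return the value stored under key.
    static String loadValue(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText)); // escapes are decoded here
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Written with escapes so this source file itself stays ASCII.
        String text = "\u65b0\u898f\u4f5c\u6210";
        System.out.println(toEscapes(roundTrip(text, "Shift_JIS")));
        // ISO-8859-1 has no Japanese characters, so each one becomes '?'.
        System.out.println(roundTrip(text, "ISO-8859-1"));

        // Doubled backslashes put a literal backslash into the text,
        // just as it appears in the file on disk.
        String fileText = "# -- Application --\n"
            + "msg.common.complete.process={0}\\u304c\\u5b8c\\u4e86\\u3057\\u307e\\u3057\\u305f\\u3002\n";
        String value = loadValue(fileText, "msg.common.complete.process");
        System.out.println(value.length());
    }
}
```

The Shift_JIS round trip returns the original four characters, the ISO-8859-1 round trip returns "????", and the loaded value is 11 characters long ("{0}" plus eight decoded Japanese characters). Java does this decoding for free; a reader on another platform has to do it by hand.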
With StreamReader, no matter which encoding is used to read the file, what you get still looks like ciphertext.
Compare the following two statements:
Console.WriteLine("1>>{0}", @"errors.token=\u753b\u9762\u8868\u793a\u9806\u304c\u4e0d\u6b63\u3067\u3059\u3002");
Console.WriteLine("2>>{0}", "errors.token=\u753b\u9762\u8868\u793a\u9806\u304c\u4e0d\u6b63\u3067\u3059\u3002");
Output result:
1>>errors.token=\u753b\u9762\u8868\u793a\u9806\u304c\u4e0d\u6b63\u3067\u3059\u3002
2>>errors.token=画面表示順が不正です。
That is, text read from a file behaves like output 1, no matter which encoding you use: the escapes arrive as literal characters, whereas in statement 2 the compiler has already decoded them.
So what we need to do is convert an escape such as "\u753b" into the character "画", which is a bit like deciphering a code. Japanese characters are double-byte encoded here: the four hex digits of each escape combine a high byte (the first two digits) and a low byte (the last two).
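To make the high/low byte layout concrete, here is a small sketch in Java (the class and method names are hypothetical). It decodes one escape two ways: by splitting the hex digits into two bytes and reading them as big-endian UTF-16, and by parsing all four digits as a single 16-bit value. Both should give the same character.

```java
import java.nio.charset.StandardCharsets;

class EscapeBytes {
    // hex4 holds the four hex digits of one escape, for example "753b".
    static String decodeViaBytes(String hex4) {
        byte high = (byte) Integer.parseInt(hex4.substring(0, 2), 16);
        byte low = (byte) Integer.parseInt(hex4.substring(2, 4), 16);
        // High byte first, so the pair is big-endian UTF-16.
        return new String(new byte[] { high, low }, StandardCharsets.UTF_16BE);
    }

    // Same result without the byte array: parse the digits as one code unit.
    static String decodeViaInt(String hex4) {
        return String.valueOf((char) Integer.parseInt(hex4, 16));
    }

    public static void main(String[] args) {
        String a = decodeViaBytes("753b");
        String b = decodeViaInt("753b");
        System.out.println(a.equals(b)); // prints true
    }
}
```

The C# code that follows takes the byte-splitting route.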
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

class Program
{
    [STAThread]
    static void Main(string[] args)
    {
        string strFileName = @"E:\Visual Studio project\messageresources.properties";
        Console.WriteLine(GetPropertiesText(strFileName));
        Console.Read();
    }

    // Reads the properties file and decodes every \uXXXX escape in each line.
    public static string GetPropertiesText(string strFileName)
    {
        StringBuilder sb = new StringBuilder();
        // Build the pattern once: a backslash, 'u', then four hex digits.
        Regex regex = new Regex(@"(\\u[A-Fa-f0-9]{4})");
        using (StreamReader sr = new StreamReader(strFileName))
        {
            string str = null;
            while ((str = sr.ReadLine()) != null)
            {
                if (regex.IsMatch(str))
                {
                    foreach (Match m in regex.Matches(str))
                    {
                        // Replace the escape with the character it encodes.
                        str = str.Replace(m.Result("$1"), GetUnicodeString(m.Result("$1").Replace(@"\u", "")));
                    }
                }
                sb.Append(str);
                sb.Append("\n");
            }
        }
        return sb.ToString();
    }

    // Converts four hex digits (for example "753b") into the character they encode.
    public static string GetUnicodeString(string strInput)
    {
        try
        {
            byte[] array = new byte[2];
            string s1 = strInput.Substring(0, 2); // high byte
            string s2 = strInput.Substring(2);    // low byte
            array[0] = Convert.ToByte(s1, 16);
            array[1] = Convert.ToByte(s2, 16);
            // High byte first, so decode the pair as big-endian UTF-16.
            return Encoding.BigEndianUnicode.GetString(array);
        }
        catch (Exception)
        {
            // Leave the input unchanged if it cannot be parsed.
            return strInput;
        }
    }
}
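As a variation on the same idea, the whole line can be decoded in a single regex pass instead of one string replacement per match. The following is only a sketch in Java, with hypothetical names (UnicodeUnescaper, unescape); it handles just the \uXXXX form, not the other escapes a properties file may contain.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeUnescaper {
    // A literal backslash, 'u', then four hex digits captured as group 1.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    static String unescape(String line) {
        Matcher m = ESCAPE.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Parse the four hex digits as one UTF-16 code unit.
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps '$' and '\' in the decoded text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "errors.token=\\u753b\\u9762\\u8868\\u793a";
        System.out.println(unescape(line).length());
    }
}
```

Here the input line is 37 characters long and the result is 17: the 13-character prefix "errors.token=" plus four decoded characters. Because the text is rebuilt from left to right, each escape is visited exactly once, and a decoded character such as '$' cannot be misread as a replacement reference.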