Handling Chinese character encoding conversion in JNI

Source: Internet
Author: User

Transferred from: HTTP://BLOG.SINA.COM.CN/FANGAOSJTU

For the past two days I have been learning JNI in order to call a DLL for a large dictionary from a Java program. Using the JNI functions GetStringChars and NewString, I ran into garbled Chinese text and struggled with it for a whole evening. After consulting some references, I summarized what I found as follows:

I. Related Concepts

    • Java represents strings internally in 16-bit Unicode (UTF-16); a Chinese or an English character each takes 2 bytes (characters outside the Basic Multilingual Plane take 4);
    • JNI's UTF functions represent strings in UTF-8 (strictly, the JVM's "modified UTF-8"), a variable-length Unicode encoding in which an ASCII character takes 1 byte and a Chinese character generally takes 3 bytes;
    • C++ works with raw bytes: an ASCII character is 1 byte, and Chinese text is generally GB2312-encoded, with 2 bytes per Chinese character.
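To make the byte counts above concrete, here is a minimal sketch that computes how many bytes one Unicode code point occupies in UTF-8 and in UTF-16. The function names utf8_bytes and utf16_bytes are made up for this illustration and are not part of JNI.

```cpp
#include <cstddef>

// Bytes needed to encode one Unicode code point in standard UTF-8.
std::size_t utf8_bytes(char32_t cp) {
    if (cp < 0x80) return 1;      // ASCII
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;   // rest of the BMP, including common Chinese characters
    return 4;                     // supplementary planes
}

// Bytes needed in UTF-16: one 16-bit unit inside the BMP, a surrogate pair beyond it.
std::size_t utf16_bytes(char32_t cp) {
    return cp < 0x10000 ? 2 : 4;
}
```

For example, the character 中 (U+4E2D) takes 3 bytes in UTF-8 (E4 B8 AD), 2 bytes in UTF-16, and 2 bytes in GB2312 (D6 D0), while the letter A takes 1 byte in UTF-8 and 2 bytes in UTF-16.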

With these concepts clear, the operations become easier to follow. Each direction of the character flow is explained separately below.

1. Java -> C++

In this direction, the Java caller holds a UTF-16 encoded string and the JVM passes it to JNI; the C/C++ side receives it as a jstring. At this point two JNI functions are available: GetStringUTFChars, which yields a UTF-8 encoded string, and GetStringChars, which yields a UTF-16 encoded string. Whichever function is used, if the resulting string contains Chinese characters, it still has to be converted to GB2312 for C/C++ code that expects that encoding. As follows:

             String
            (UTF-16)
               |
[Java]         |
---------------+--------------- JNI call
[C++]          |
               V
            jstring
            (UTF-16)
               |
      +--------+---------+
      |                  |
      | GetStringChars   | GetStringUTFChars
      |                  |
      V                  V
   wchar_t*            char*
   (UTF-16)           (UTF-8)
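One detail worth knowing on the UTF-8 branch: GetStringUTFChars returns the JVM's modified UTF-8, which encodes U+0000 and supplementary characters differently from standard UTF-8. If standard UTF-8 is needed, one option is to take the UTF-16 branch and convert by hand. The following is a minimal sketch of such a conversion; utf16_to_utf8 is a hypothetical helper, and it assumes the code units from GetStringChars have already been copied into a std::u16string.

```cpp
#include <cstddef>
#include <string>

// Convert UTF-16 code units to standard UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate if present.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // stray low surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

In a native method, the input could be built with something like std::u16string(reinterpret_cast<const char16_t*>(chars), length), where chars and length come from GetStringChars and GetStringLength; that wiring is left out here so the sketch compiles without jni.h.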

2. C++ -> Java

When JNI returns a string to Java, the C/C++ side is responsible for first converting the string to UTF-8 or UTF-16, then wrapping it as a jstring with NewStringUTF or NewString and returning it to Java.

             String
            (UTF-16)
               ^
               |
[Java]         |
---------------+--------------- JNI return
[C++]          |
            jstring
            (UTF-16)
               ^
               |
      +--------+---------+
      ^                  ^
      |                  |
      | NewString        | NewStringUTF
   wchar_t*            char*
   (UTF-16)           (UTF-8)
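If the C/C++ side already holds standard UTF-8, one way to return it safely is to decode it into UTF-16 code units and hand those to NewString, which avoids the modified UTF-8 rules that NewStringUTF expects. The sketch below shows only the decoding step; utf8_to_utf16 is a hypothetical helper, and it is deliberately minimal (for instance, it does not reject overlong encodings).

```cpp
#include <cstddef>
#include <string>

// Decode standard UTF-8 into UTF-16 code units.
// Malformed sequences are replaced with U+FFFD.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out += char16_t(0xFFFD); ++i; continue; }  // invalid lead byte
        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;  // truncated or bad continuation byte
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            out += char16_t(0xFFFD);
        } else if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}
```

The result could then be passed to NewString as a jchar pointer plus its length (size()), since jchar is a 16-bit unsigned type; that call is omitted so the sketch compiles without jni.h.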

If the string contains no Chinese characters, only standard ASCII, then GetStringUTFChars/NewStringUTF is sufficient, because in that case the UTF-8 and ASCII encodings are identical and no conversion is needed.
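Whether the shortcut applies can be checked cheaply before deciding which path to take. A minimal sketch, with is_pure_ascii as a hypothetical helper name:

```cpp
#include <string>

// True when every byte is below 0x80, i.e. the string is plain ASCII and is
// byte-for-byte identical in ASCII, UTF-8 and GB2312.
bool is_pure_ascii(const std::string& s) {
    for (unsigned char c : s) {
        if (c >= 0x80) return false;
    }
    return true;
}
```

A GB2312 string such as the two bytes D6 D0 (the character 中) fails this check, which signals that it needs a real conversion before being wrapped as a jstring.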

However, if the string contains Chinese characters, encoding conversion on the C/C++ side is required. Two conversion functions are needed: one to convert UTF-8/UTF-16 to GB2312, and one to convert GB2312 to UTF-8/UTF-16.

A note here: both Linux and Win32 support wchar_t. On Win32 it is 16 bits wide and holds UTF-16, the same layout as jchar, so if a C/C++ program used wchar_t throughout, this conversion would in theory be unnecessary. (On Linux wchar_t is typically 32 bits wide, so this shortcut does not carry over there.) In practice, however, char cannot be replaced entirely with wchar_t, so the conversion is still necessary for most applications today.
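Because the width of wchar_t depends on the platform (16 bits with MSVC on Windows, typically 32 bits with GCC on Linux), it can be worth checking before casting between wchar_t buffers and jchar buffers. A small sketch, with both function names invented for this example:

```cpp
#include <climits>
#include <cstddef>

// Width of wchar_t in bits on the platform this is compiled for.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// True when a wchar_t buffer has the same element width as JNI's 16-bit jchar,
// so casting between the two pointer types does not change the element size.
constexpr bool wchar_matches_jchar() {
    return wchar_bits() == 16;
}
```

A line such as static_assert(wchar_matches_jchar(), "wchar_t is not 16 bits"); placed next to code that casts jchar* to wchar_t* would turn a silent mismatch into a compile error.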

II. A conversion method

Convert by going through the wide-character type, using the Win32 functions WideCharToMultiByte and MultiByteToWideChar.

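#include <windows.h>
#include <jni.h>
#include <stdlib.h>
#include <string.h>

char* jstringToWindows(JNIEnv* env, jstring jstr)
{   // Convert a Java string (UTF-16) to the ANSI code page (GB2312/GBK on Simplified Chinese Windows).
    // The caller must free() the returned buffer.
    int length = env->GetStringLength(jstr);
    const jchar* jcstr = env->GetStringChars(jstr, 0);
    if (jcstr == NULL)
        return NULL;
    // Assumes a double-byte code page: at most 2 bytes per UTF-16 unit, plus the terminator.
    char* rtn = (char*)malloc(length * 2 + 1);
    int size = 0;
    if (rtn != NULL)
        size = WideCharToMultiByte(CP_ACP, 0, (LPCWSTR)jcstr, length, rtn, length * 2, NULL, NULL);
    env->ReleaseStringChars(jstr, jcstr);   // release on every path, including failure
    if (size <= 0)
    {
        free(rtn);   // avoid leaking the buffer when the conversion fails
        return NULL;
    }
    rtn[size] = 0;
    return rtn;
}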

jstring windowsToJstring(JNIEnv* env, const char* str)
{   // Convert an ANSI code page string (GB2312/GBK) to a Java string (UTF-16).
    jstring rtn = 0;
    int slen = (int)strlen(str);
    unsigned short* buffer = 0;
    if (slen == 0)
        rtn = env->NewStringUTF(str);   // empty string: nothing to convert
    else
    {
        // With a NULL output buffer, the call returns the required number of wide characters.
        int length = MultiByteToWideChar(CP_ACP, 0, (LPCSTR)str, slen, NULL, 0);
        if (length > 0)
            buffer = (unsigned short*)malloc(length * sizeof(unsigned short));
        if (buffer && MultiByteToWideChar(CP_ACP, 0, (LPCSTR)str, slen, (LPWSTR)buffer, length) > 0)
            rtn = env->NewString((const jchar*)buffer, length);
    }
    if (buffer)
        free(buffer);
    return rtn;
}
