Although Java's Chinese-language problems have been discussed at length, Java covers a wide range of technologies, and there is no official standard for the Java-oriented Web servers, application servers, and JDBC database drivers that support them. As a result, the problems Java applications encounter in handling Chinese have not disappeared; they also vary with factors such as the server, driver, and operating environment selected. How, then, do we locate the real problem among so many symptoms, analyze it, and solve it? Unlike most discussions, this article focuses on how to predict, discover, and examine problems, offering suggestions that help developers identify likely sources of trouble and so solve Java's Chinese problems more effectively.
Introduction
Although Java Chinese processing has been discussed many times, Java technology covers a wide range of content (more than 10 related technologies) and is supplied by many vendors, and there is no official standard for Java-oriented Web servers, application servers, and JDBC database drivers. Consequently, the problems that Java applications encounter in processing Chinese vary with the choice of server and driver, and this variability adds to their complexity. How, then, can we find the crux of the problem among so many symptoms?
General solutions to Java Chinese problems
In essence, Java's Chinese problems arise when the default encoding used by a Java application differs from the encoding of the target, or from the encoding of the characters the application reads (see reference 1). There are usually four ways to solve Java's Chinese problems:
1) Select the Chinese localized version of the JDK. Although the Chinese localized version of the Java 2 JDK (http://java.sun.com/products/jdk/1.2/chinesejdk.html) is not an official release, and Sun does not promise to upgrade it, it is still one solution to the Java Chinese problem.
2) Select the appropriate compilation parameters. With the international version of Java, we can compile an application with an explicitly specified encoding so that the compiled result supports Chinese. For example, compiling the source with javac -encoding Big5 Sourcefile.java or javac -encoding gb2312 Sourcefile.java supports Traditional Chinese and Simplified Chinese applications, respectively.
3) Convert the character encoding programmatically. Solving Java's Chinese problems in code has become common practice. The following is one of the most common conversion functions; it takes a string whose bytes were decoded as ISO8859_1 and re-decodes those bytes as GBK, the encoding used by Chinese Windows systems.
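// Re-decodes as GBK a string whose bytes were decoded as ISO8859_1.
// Returns null for null input or when an encoding is not supported.
public static String toChinese(String strvalue)
{
    if (strvalue == null)
        return null;
    try {
        // Recover the original bytes, then decode them as GBK.
        return new String(strvalue.getBytes("ISO8859_1"), "GBK");
    } catch (java.io.UnsupportedEncodingException e) {
        // The requested encoding is not available in this Java runtime.
        return null;
    }
}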
4) Define the output character set. For JSP applications, we can declare the output character set of a JSP page in the page itself. We can also declare the output character set through HTML tags.
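As a rough plain-Java analogue of method 4), the following sketch writes text through a writer whose character set is named explicitly rather than taken from the platform default, then reads the bytes back with a matching and a mismatching character set. The class OutputCharsetDemo and its methods encode and decode are hypothetical names introduced only for illustration, and the sketch assumes the GBK charset is available in the Java runtime in use.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical demo class; not part of the original article.
class OutputCharsetDemo {

    // Writes text through a writer whose charset is named explicitly,
    // instead of relying on the platform default encoding.
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, Charset.forName(charsetName))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads bytes back using the named charset.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters, written as escapes
        byte[] gbk = encode(text, "GBK");
        // Number of bytes produced by the GBK encoder.
        System.out.println(gbk.length);
        // Reading with the declared charset restores the text.
        System.out.println(decode(gbk, "GBK").equals(text));
        // Reading with a different charset does not.
        System.out.println(decode(gbk, "ISO-8859-1").equals(text));
    }
}
```

The point of the sketch is that the writer and the reader must agree on one character set; declaring it explicitly on the output side removes the dependence on whatever default the server or operating system happens to use.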
Existing problems
Depending on how they are implemented, the four methods above fall into two categories. Methods 1), 2), and 4) solve the problem by applying a standard or rule; method 3) solves it through specific programming.
Because methods 1), 2), and 4) are rule-based, they are relatively simple to apply, and the solutions are general rather than targeted. For example, with method 2) we can compile Java source files with a preset internal encoding, regardless of which part of the source code actually has a Chinese processing problem such as garbled output.
However, because these methods are not targeted and treat every case uniformly, in some situations they cannot completely solve the Java Chinese problem. Take a very common example. A Java application often needs to interact with other Java interfaces, such as accessing a database through some version of JDBC. The encodings a JDBC driver supports vary with the vendor and even the version, so if Chinese text is not handled correctly during database input and output, we must apply two exactly opposite encoding conversions, one on input and one on output. Methods 1), 2), and 4) usually cannot do this. For method 2) there are some workarounds; one of the most effective is to split the Java application into components. For example, we can place the database input code and output code in different source files and compile each with the encoding it requires. In practice, though, ordinary programs rarely meet this requirement, because the resulting structure is likely to be unreasonable. Encapsulating the read and write methods of a database in one class is a sound design, whereas implementing the two methods of that class in two separate files would be very unreasonable. So although methods 1), 2), and 4) are relatively simple to apply, they have shortcomings that cannot be overcome. This is why the relatively complex programmatic approach is popular.
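The pair of opposite conversions described above can be sketched as one small class, which keeps the read and write paths together. This is only an illustration under stated assumptions: the class name DriverTextCodec and its methods are hypothetical, the application is assumed to work in GBK, and the driver is assumed to treat text as ISO-8859-1. A real driver may use other encodings, or may need no conversion at all.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the two charsets below are assumptions, not facts
// about any particular JDBC driver.
final class DriverTextCodec {
    // Encoding the application works in (assumed).
    private static final Charset APP = Charset.forName("GBK");
    // Encoding the driver is assumed to apply to text.
    private static final Charset DRIVER = StandardCharsets.ISO_8859_1;

    // Conversion applied before writing to the database.
    static String toDriver(String s) {
        return s == null ? null : new String(s.getBytes(APP), DRIVER);
    }

    // The exact opposite conversion, applied after reading from the database.
    static String fromDriver(String s) {
        return s == null ? null : new String(s.getBytes(DRIVER), APP);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        String wire = toDriver(original);
        // One char per GBK byte while the text is in its "driver" form.
        System.out.println(wire.length());
        // The opposite conversion restores the original text.
        System.out.println(fromDriver(wire).equals(original));
    }
}
```

The round trip works because ISO-8859-1 maps every byte value to exactly one character and back. Characters that GBK cannot represent are replaced during getBytes, so they would not survive the trip.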
Compared with methods 1), 2), and 4), method 3) is more targeted and flexible. A program can handle different situations as needed and convert the character encoding at any point. The price is a higher demand on the software developer, who must be able to identify accurately where a Chinese problem is likely to occur and then make the right judgment and apply the right treatment.
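One way to narrow down where a problem occurs is to inspect the characters a string actually holds at each stage (after reading a request, after a database read, before output). The sketch below is a hypothetical diagnostic aid, not a method from the original article: EncodingProbe, dump, and looksLikeLatin1Mojibake are names invented for illustration, and the check is only a heuristic that assumes the expected text is Chinese.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper for locating where text becomes garbled.
final class EncodingProbe {

    // Renders each UTF-16 code unit as U+XXXX so the content can be logged
    // without depending on the console's own encoding.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Heuristic: every char fits in one byte and at least one is above ASCII.
    // Text that should be Chinese but matches this pattern was probably
    // decoded as ISO-8859-1 somewhere upstream.
    static boolean looksLikeLatin1Mojibake(String s) {
        boolean sawHighByte = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0xFF) {
                return false;
            }
            if (c >= 0x80) {
                sawHighByte = true;
            }
        }
        return sawHighByte;
    }

    public static void main(String[] args) {
        String good = "\u4e2d\u6587"; // two Chinese characters
        // Simulate a component that decoded GBK bytes as ISO-8859-1.
        String garbled = new String(good.getBytes(Charset.forName("GBK")),
                StandardCharsets.ISO_8859_1);
        System.out.println(dump(good));
        System.out.println(dump(garbled));
        System.out.println(looksLikeLatin1Mojibake(garbled));
    }
}
```

Genuine Western European text, such as accented Latin letters, also matches this pattern, so the result is a hint about where to look and not proof of a fault.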