How to properly set up Informix GLS and CSDK locales

Source: Internet
Author: User
Tags informix locale



This article introduces GLS concepts and explains how to set the Informix GLS locale variables (DB_LOCALE and CLIENT_LOCALE) correctly, so that the Informix database server and its clients handle Chinese characters properly and support Chinese object names. It also describes the locale requirements of CSDK releases later than 2.7 (the latest release at the time of writing is CSDK 3.5) and gives examples of common locale errors with their workarounds.





Overview


IBM Informix products support many languages, cultures, and code sets. All culture-specific information is gathered in a single environment, called a Global Language Support (GLS) locale. Besides ASCII and U.S. English, GLS lets you work in other locales and use non-ASCII characters in SQL data and identifiers. With GLS you can follow the conventions of a specific locale. Locale files contain culture-specific information such as currency and date formats and collation order.



This article introduces GLS concepts, explains how to set DB_LOCALE and CLIENT_LOCALE correctly so that the database server and its clients handle Chinese characters and Chinese object names, and describes the locale requirements of CSDK releases later than 2.7 (the latest release at the time of writing is CSDK 3.5).





GLS Basic Concepts


A character is any written symbol, including the letters of national scripts, punctuation marks, graphic symbols, and digits. A character set is a collection of characters. There are many character sets, each containing a different number of characters. Common ones include ASCII, GB2312 (Simplified Chinese), BIG5 (Traditional Chinese), GB18030 (a large character set covering Asian scripts), and Unicode (commonly encoded as UTF-8).



The Informix GLS environment assigns each commonly used character set a name and an internal code, which is also written in hexadecimal. The server-side file $INFORMIXDIR/gls/cm3/registry lists the GLS character set names and their codes. For example:


    Character set name    Code     Hexadecimal
    8859-1                819      # 0x0333
    gb                    57357    # 0xe00d
    GB2312-80             57357    # 0xe00d
    utf8                  57372    # 0xe01c
    big5                  57352    # 0xe008
    GB18030-2000          5488     # 0x1570


Different character set names in the GLS environment can map to the same code, but each character set has exactly one code, so a code identifies a character set uniquely.
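The relationship between names, decimal codes, and hexadecimal codes can be checked with a few lines of Python. This is only a sketch: it assumes the "name, decimal code, # hexadecimal" line layout shown in the excerpt, uses a handful of sample entries taken from values mentioned in this article rather than the real registry file, and the helper names `parse_registry` and `hex_code` are made up for illustration.

```python
# Sample entries in the "name  code  # hex" layout (not the full registry file).
SAMPLE = """\
8859-1        819    # 0x0333
gb            57357  # 0xe00d
GB2312-80     57357  # 0xe00d
utf8          57372  # 0xe01c
GB18030-2000  5488   # 0x1570
"""


def parse_registry(text):
    """Map each lower-cased character set name to its decimal code."""
    codes = {}
    for line in text.splitlines():
        # Drop the trailing "# 0x...." comment, then split name and code.
        entry = line.split("#", 1)[0].split()
        if len(entry) == 2:
            name, number = entry
            codes[name.lower()] = int(number)
    return codes


def hex_code(number):
    """Four-digit lowercase hexadecimal form of a decimal code."""
    return format(number, "04x")


codes = parse_registry(SAMPLE)
print(hex_code(codes["utf8"]))            # hexadecimal form of 57372
print(codes["gb"] == codes["gb2312-80"])  # two names, one code
```

The four-digit hexadecimal form is the part that shows up in the GLS file names discussed in the rest of this article.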



In the GLS environment, the supported locales are organized into directories by language and territory: $INFORMIXDIR/gls/lc11/<language>_<territory>/. For mainland China the directory is $INFORMIXDIR/gls/lc11/zh_cn/. It contains two files, 1570.lco and e00d.lco, which means that three locale names are available when we set the character set: zh_cn.GB18030-2000, zh_cn.gb, and zh_cn.GB2312-80 (zh_cn.gb and zh_cn.GB2312-80 refer to the same character set).



GLS can convert between character sets when the matching conversion files exist. To check whether two character sets can be converted to each other, look in the $INFORMIXDIR/gls/cv9 directory for conversion files in both directions. For example, if the directory contains the two files e01ce00d.cvo and e00de01c.cvo, GLS can convert characters between UTF-8 and GB by using these two conversion files.



Informix configures localization support for a database through DB_LOCALE and CLIENT_LOCALE. A value of DB_LOCALE or CLIENT_LOCALE consists of four parts (the fourth is optional), and the character set part is case insensitive.


    <language>_<territory>.<character set name or code>[@modifier]
        1          2                    3                   4


To illustrate:


    CLIENT_LOCALE=en_us.8859-1
    CLIENT_LOCALE=en_us.819     # same character set as the line above: 819 is the code of 8859-1
    DB_LOCALE=zh_cn.gb
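As a rough illustration of the four-part format, the following Python sketch splits a locale value into its parts. The regular expression and the `parse_locale` helper are assumptions made for this example, not part of any Informix tool, and only the character set part is lower-cased because that is the only part the text describes as case insensitive.

```python
import re
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    language: str
    territory: str
    codeset: str             # character set name or numeric code, lower-cased
    modifier: Optional[str]  # optional fourth part after "@"


# <language>_<territory>.<codeset>[@modifier]
_LOCALE_RE = re.compile(r"^([A-Za-z]+)_([A-Za-z]+)\.([^@]+)(?:@(.+))?$")


def parse_locale(value):
    """Split a DB_LOCALE or CLIENT_LOCALE value into its four parts."""
    match = _LOCALE_RE.match(value.strip())
    if match is None:
        raise ValueError(f"not a valid locale name: {value!r}")
    language, territory, codeset, modifier = match.groups()
    return Locale(language, territory, codeset.lower(), modifier)


print(parse_locale("en_us.8859-1"))
print(parse_locale("zh_cn.GB18030-2000"))
```

Whether two parsed values name the same character set still depends on the registry, since a name such as 8859-1 and its code 819 are equivalent.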




How the GLS character set works


Figure 1 shows how character sets are handled between the Informix database server and the client.


Figure 1. IDS GLS character set processing




Uses of the DB_LOCALE environment variable
    1. When the client application and the database server exchange character data, the client application performs code set conversion if the code set of DB_LOCALE (on the client computer) differs from the code set of CLIENT_LOCALE. Code set conversion prevents data corruption when data moves between the two code sets.
    2. When a client application requests a connection, it sends information that includes DB_LOCALE (if it is set) to the database server.
    3. The database server uses DB_LOCALE when it determines how to set the database information of the server-processing locale.
    4. When a client application tries to open a database, the database server compares the DB_LOCALE value passed by the client application with the database locale stored in the database.
    5. When the database server accesses columns with locale-sensitive data types, it uses the locale specified by DB_LOCALE.
    6. When the database server creates a new database, it examines the database locale (DB_LOCALE) to determine how to store character information in the system catalog of the database. This information includes how to handle regular expressions, how to compare strings, and how to ensure proper use of the code set.

Uses of the CLIENT_LOCALE environment variable
    1. When the client application and the database server exchange character data, the client application performs code set conversion if the code set of CLIENT_LOCALE differs from the code set of DB_LOCALE (on the client computer). Code set conversion prevents data corruption when data moves between the two code sets.
    2. When a client application requests a connection, it sends information that includes CLIENT_LOCALE to the database server.
    3. The database server uses CLIENT_LOCALE when it determines how to set the client application information of the server-processing locale.
    4. When the Informix ESQL/C preprocessor processes a source file, it accepts C source code written in the code set of CLIENT_LOCALE. When an Informix ESQL/C client application runs, it checks CLIENT_LOCALE for the name of the client locale, which affects operating system file names, the contents of text files, and the formats of date, time, and numeric data.
    5. When database utilities create files, the file names and file contents are in the code set specified by CLIENT_LOCALE. When a client application looks for product-specific message files, it checks the message directory associated with the client locale.

The meaning of the four locales
    1. Client locale

      The client locale specifies the language, territory, and code set that the client application uses for input and output (I/O) operations. In a client application, I/O includes reading keyboard input or data files to be sent to the database, and writing data retrieved from the database server to a screen, file, or printer. Set the client locale through CLIENT_LOCALE.

    2. Database locale

      The database locale, set through the DB_LOCALE environment variable, specifies the language, territory, and code set that the database server needs in order to interpret the locale-sensitive data types (NCHAR and NVARCHAR) of a particular database correctly. The code set specified in DB_LOCALE determines which characters are valid in any character column and in the names of database objects such as databases, tables, columns, and views. The database server uses the database code set specified by DB_LOCALE to move data into and out of the database.

    3. Server locale

      The database server writes operating system files (such as debug and warning files) in the server code set specified by the SERVER_LOCALE environment variable. However, the database server does not use the server locale to write files in Informix-specific formats (database and table files).

    4. Server-processing locale

      The database server uses the code set of the database locale as the code set of the server-processing locale, and uses the server-processing locale to write files in Informix-specific formats (database and table files). In other words, the database server writes database and table files according to the database locale (DB_LOCALE).

Establishing a database connection


When a client application requests a connection to a database, the database server uses GLS locale information to perform the following steps.


    1. The client application sends its locale information to the database server:
      • CLIENT_LOCALE (if it is not set, the default en_us.819 is sent);
      • DB_LOCALE (not sent if it is not set).
    2. The database server verifies that a connection can be established between the client application and the requested database by comparing the following two locales:

      • the locale specified by the DB_LOCALE value that the client application sent;
      • the database locale stored in the system catalog of the requested database.

      If they match, the connection is established. If they do not match, the server reports that it cannot connect to the database. Alternatively, the connection may be allowed to continue, but the database server might then misinterpret the data it receives from the client, so you must know the format of the exchanged data yourself.
    3. The database server determines the server-processing locale, using the following order of precedence:
      • the DB_LOCALE value sent by the client;
      • the DB_LOCALE environment variable set in the database server environment.

Performing code set conversion


In a client/server environment, if the client and server computers use different code sets to represent the same characters, character data must be converted from one code set to the other. Without code set conversion, one computer cannot correctly process or display character data that originates on the other.



When to perform code set conversion



An application needs code set conversion only when the two code sets involved (those of the client locale and the server-processing locale, or those of the server-processing locale and the server locale) are not the same. Code sets can differ for the following reasons:


    1. Different operating systems may encode the same character in different ways.
    2. If the client locale and the database locale specify different code sets, the client application performs the code set conversion, so that the server computer is not burdened with this processing.
    3. If the server locale and the server-processing locale specify different code sets, the database server performs code set conversion when it writes and reads operating system files, such as log files. This conversion does not affect the data in the database.


In Figure 1, the black dots mark the two points in a client/server environment where code set conversion can occur.



Code set conversion in the client application



The client application automatically performs code set conversion between the client code set and the database code set when both of the following conditions are true:


    • The code sets of the client locale and the database locale do not match.
    • A valid code set conversion exists between the client code set and the database code set.


When the client application starts, it compares the names of the client locale and the database locale to decide whether to perform code set conversion. If CLIENT_LOCALE and DB_LOCALE are set, the client application uses these locale names to determine the client code set and the database code set, respectively. If CLIENT_LOCALE is not set (and DBNLS is not set), the client application assumes that the client locale is the default locale. If DB_LOCALE is not set (and DBNLS is not set), the client application assumes that the database locale is the same as the client locale (the value of CLIENT_LOCALE).



If the client and database code sets are the same, no code set conversion is needed. If they differ, the client application must determine whether the two code sets are convertible. They are convertible if the client can find the corresponding code set conversion files, and these files must exist on the client computer.
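The decision described above can be sketched in Python. This is a simplified model, not the client library's actual logic: it ignores DBNLS, takes en_us.8859-1 as the default locale, and resolves only the two aliases mentioned in this article. The names `effective_locales`, `codeset_of`, and `needs_conversion` are hypothetical.

```python
DEFAULT_LOCALE = "en_us.8859-1"  # default locale named in this article

# Names that share one code in the registry excerpt (partial, for illustration).
ALIASES = {"819": "8859-1", "gb2312-80": "gb"}


def codeset_of(locale_name):
    """Return the normalized character set part of a locale name."""
    codeset = locale_name.split(".", 1)[1].split("@", 1)[0].lower()
    return ALIASES.get(codeset, codeset)


def effective_locales(env):
    """Apply the fallback rules: unset CLIENT_LOCALE means the default
    locale, unset DB_LOCALE means 'same as the client locale'."""
    client = env.get("CLIENT_LOCALE") or DEFAULT_LOCALE
    db = env.get("DB_LOCALE") or client
    return client, db


def needs_conversion(env):
    """True when the client and database code sets differ."""
    client, db = effective_locales(env)
    return codeset_of(client) != codeset_of(db)


print(needs_conversion({"CLIENT_LOCALE": "zh_cn.gb"}))
print(needs_conversion({"CLIENT_LOCALE": "en_us.1252",
                        "DB_LOCALE": "en_us.8859-1"}))
```

When the function reports that conversion is needed, the client still has to find the conversion files on the client computer before the two code sets count as convertible.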



To illustrate:


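    Client application:
        CLIENT_LOCALE=en_us.1252
        DB_LOCALE=en_us.8859-1

With these settings the client application converts between Windows code page 1252 (the client locale) and ISO8859-1 (the database locale). If the client then opens a database whose locale is zh_cn.gb, it still converts between 1252 and ISO8859-1 instead of converting between Windows code page 1252 and zh_cn.gb. This can corrupt data, and the application should not continue with this connection.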




Setting character sets


Informix configures localization support for a database through DB_LOCALE and CLIENT_LOCALE.


Database server side


Set DB_LOCALE before you create a database, as follows. To keep the character set of the system databases and the application databases consistent, set it before you create the database instance.


    1. Set the environment variable DB_LOCALE:
         set DB_LOCALE=zh_cn.gb
    2. Create the database:
         CREATE DATABASE dbname;
    3. Verify the character set of the database:
         SELECT dbs_collate FROM sysmaster:sysdbslocale WHERE dbs_dbsname = 'dbname';

Client


When we connect to the database through ODBC or JDBC, we need to set the locale variables DB_LOCALE and CLIENT_LOCALE correctly in the connection information.


    Set the locale variables:
        DB_LOCALE=zh_cn.gb
        CLIENT_LOCALE=zh_cn.gb


ODBC:



On Windows, set the locale in the ODBC data source configuration.


Figure 2. Setting up locales in ODBC





On UNIX, define the locale in the odbc.ini file:


    Define the following two database locale variables in the odbc.ini file:
        DB_LOCALE=zh_cn.gb
        CLIENT_LOCALE=zh_cn.gb


JDBC:



When we connect to a database through JDBC, we need to set the locale variables DB_LOCALE and CLIENT_LOCALE in the connection URL. For example:


    String url = "jdbc:informix-sqli://10.127.1.11:8001/testdb:INFORMIXSERVER=servername;user=user;password=password;DB_LOCALE=zh_cn.gb;CLIENT_LOCALE=zh_cn.gb";




Common Character Set setup issues


When setting and using Informix database character sets, we often run into character-set-related errors. Once the cause of an error is known, the problem is easy to solve. This section summarizes some common problems with character set settings.


Error -23101: Unable to load locale categories


This error occurs when any of the following files, which correspond to the character sets of DB_LOCALE and CLIENT_LOCALE, does not exist.


    • $INFORMIXDIR/gls/lc11/<language_territory of DB_LOCALE>/<hexadecimal code of the DB_LOCALE character set>.lco
    • $INFORMIXDIR/gls/lc11/<language_territory of CLIENT_LOCALE>/<hexadecimal code of the DB_LOCALE character set>.lco
    • $INFORMIXDIR/gls/lc11/<language_territory of CLIENT_LOCALE>/<hexadecimal code of the CLIENT_LOCALE character set>.lco
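The three-file rule in the list above can be sketched in Python, which may help when checking an installation by hand. The `HEX_CODES` table holds only the codes quoted in this article, and `required_lco_files` is a hypothetical helper, not an Informix utility.

```python
import posixpath

# Hexadecimal codes quoted in this article; not a complete table.
HEX_CODES = {"8859-1": "0333", "gb": "e00d", "utf8": "e01c",
             "gb18030-2000": "1570"}


def split_locale(name):
    """Return (language_territory directory, hex code of the character set)."""
    lang_territory, codeset = name.split(".", 1)
    return lang_territory.lower(), HEX_CODES[codeset.lower()]


def required_lco_files(db_locale, client_locale, informixdir="$INFORMIXDIR"):
    """List the .lco files that must exist to avoid error -23101."""
    db_dir, db_hex = split_locale(db_locale)
    client_dir, client_hex = split_locale(client_locale)
    base = posixpath.join(informixdir, "gls", "lc11")
    candidates = [
        posixpath.join(base, db_dir, db_hex + ".lco"),
        posixpath.join(base, client_dir, db_hex + ".lco"),
        posixpath.join(base, client_dir, client_hex + ".lco"),
    ]
    # Drop duplicates (identical locales need one file) but keep the order.
    return list(dict.fromkeys(candidates))


for path in required_lco_files("en_us.utf8", "zh_cn.GB18030-2000"):
    print(path)
```

Passing a real installation directory as `informixdir` and testing each path with `os.path.exists` would turn this into a quick pre-flight check.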


To illustrate:


    DB_LOCALE = en_us.utf8                # hexadecimal code: e01c
    CLIENT_LOCALE = zh_cn.GB18030-2000    # hexadecimal code: 1570

    The following 3 files must exist. If any one of them is missing, error -23101 is reported.
    - $INFORMIXDIR/gls/lc11/en_us/e01c.lco
    - $INFORMIXDIR/gls/lc11/zh_cn/e01c.lco
    - $INFORMIXDIR/gls/lc11/zh_cn/1570.lco

Error -23104: Error opening required code-set conversion object file


This error occurs when the following conversion files for the character sets of DB_LOCALE and CLIENT_LOCALE do not exist. Conversion is needed only when the character sets of DB_LOCALE and CLIENT_LOCALE differ. If they are the same, error -23104 does not occur.


    - $INFORMIXDIR/gls/cv9/ccccdddd.cvo
    - $INFORMIXDIR/gls/cv9/ddddcccc.cvo


Where:

    cccc is the hexadecimal code of the CLIENT_LOCALE character set
    dddd is the hexadecimal code of the DB_LOCALE character set
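The naming rule for the two conversion files can be written as a short Python sketch. The function name `required_cvo_files` is made up for this example, and the hexadecimal codes passed to it are the ones quoted in this article.

```python
def required_cvo_files(client_hex, db_hex, informixdir="$INFORMIXDIR"):
    """Return the two conversion files (one per direction) that must exist,
    or an empty list when both locales use the same character set."""
    client_hex, db_hex = client_hex.lower(), db_hex.lower()
    if client_hex == db_hex:
        return []  # same character set: no conversion, no -23104
    base = f"{informixdir}/gls/cv9"
    return [
        f"{base}/{client_hex}{db_hex}.cvo",
        f"{base}/{db_hex}{client_hex}.cvo",
    ]


# CLIENT_LOCALE = zh_cn.GB18030-2000 (1570), DB_LOCALE = en_us.utf8 (e01c)
for path in required_cvo_files("1570", "E01C"):
    print(path)
```

Both directions are listed because data travels from client to server and back, so a missing file in either direction is enough to trigger the error.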



To illustrate:


    DB_LOCALE = en_us.utf8                # hexadecimal code: e01c
    CLIENT_LOCALE = zh_cn.GB18030-2000    # hexadecimal code: 1570

    The following 2 files must exist. If either one is missing, error -23104 is reported.
    - $INFORMIXDIR/gls/cv9/e01c1570.cvo
    - $INFORMIXDIR/gls/cv9/1570e01c.cvo

Error -23197: Database locale information mismatch


Error -23197 occurs in the following situations.


    1. The DB_LOCALE value that is set differs from the locale of the database (the DB_LOCALE value in effect when the database was created).
    2. The locale specified by a SET COLLATION statement is inconsistent with the locale of the database.


To illustrate:


    Database locale = en_us.8859-1    # value stored for 'dbname' in sysmaster:sysdbslocale
    DB_LOCALE = zh_cn.gb

    Connecting with these settings reports error -23197.

Errors -201 and -202: syntax errors


The database reports syntax errors -201 or -202 when Chinese object names, such as Chinese table names, column aliases, or view names, are not supported. This type of error is caused by the DB_LOCALE setting of the current database.



If the DB_LOCALE of the database is set to zh_cn.GB18030-2000, the database supports Chinese object names.



To illustrate:


    A database created with DB_LOCALE = zh_cn.GB18030-2000 supports Chinese object names such as the following. Otherwise a syntax error is reported.

    SELECT c1 第一列 FROM test_cn;
    CREATE TABLE 中文名 (c1 INTEGER, 中文列名 INTEGER);
    DROP TABLE 中文名;

Garbled characters


Informix data appears garbled, or Chinese characters are not displayed correctly. This happens when the CLIENT_LOCALE value set on the client is inconsistent with the DB_LOCALE value and the two character sets cannot be converted to each other correctly. Reset CLIENT_LOCALE and DB_LOCALE so that they are either identical or convertible to each other.


Time format issues


The date and time formats of an Informix database are controlled by the database server environment variables GL_DATE and GL_DATETIME. With the default character set, the default formats are:


    GL_DATE="%m/%d/%iy"
    GL_DATETIME="%iY-%m-%d %H:%M:%S"


However, if we set DB_LOCALE to zh_cn.gb without setting GL_DATE and GL_DATETIME, the formats are taken from the CLIENT_LOCALE value. With zh_cn.gb, dates appear in the Chinese long form (the equivalent of "October 2, 2009"). If the application previously relied on the default format, a format mismatch error occurs. To keep using the default formats, set the format environment variables explicitly on the database server:


    GL_DATE="%m/%d/%iy"
    GL_DATETIME="%iY-%m-%d %H:%M:%S"




GLS Requirements for CSDK version


With CSDK 2.8 and above (the latest release at the time of writing is CSDK 3.5), the Informix GLS environment requires both the database server locale and the client locale to be set correctly in order to process text properly. In a Chinese language environment, set the server-side and client-side locales as follows.



Database server side:



Set DB_LOCALE before you create a database, as follows. To keep the character set of the system databases and the application databases consistent, set it before you create the database instance.


    1. Set the environment variable DB_LOCALE:
         set DB_LOCALE=zh_cn.GB18030-2000
    2. Create the database:
         CREATE DATABASE dbname;
    3. Verify the character set of the database:
         SELECT dbs_collate FROM sysmaster:sysdbslocale WHERE dbs_dbsname = 'dbname';


Client:



When we connect to the database through ODBC or JDBC, we need to set the locale variables DB_LOCALE and CLIENT_LOCALE correctly in the connection information.


    Set the locale variables:
        DB_LOCALE=zh_cn.GB18030-2000
        CLIENT_LOCALE=zh_cn.GB18030-2000

A database cannot exist independently of a character set. It always belongs to one. With CSDK 2.7, IDS by default handles Chinese characters in "garbage in, garbage out" mode, so even if DB_LOCALE on the database server is the default en_us.8859-1, Chinese characters appear to work normally. After an upgrade to CSDK 2.8 or later, garbage in, garbage out mode is no longer supported and garbled characters appear.

A database with the ISO 8859-1 character set supports only single-byte characters. Multibyte data can be stored in it, but the database simply treats the data as single bytes. Whether you store double-byte Chinese or 4-byte Unicode data, the database sees only individual bytes.

When a client connects to the database through JDBC and fetches data, the data arrives in the default encoding of the database, and the client must convert it to Unicode, because JDBC is Java-based and Java uses Unicode internally. The question is which source encoding the conversion should assume. Assuming ISO 8859-1 is correct if the stored data really consists of ISO 8859-1 characters, but it is wrong if the data contains Chinese. For example, in cp936, a commonly used Chinese code set, the two characters "中文" are the bytes 0xD6 0xD0 0xCE 0xC4. Converted to Unicode (UTF-8) as cp936, they become 0xE4 0xB8 0xAD 0xE6 0x96 0x87. Converted as ISO 8859-1, they become 0xC3 0x96 0xC3 0x90 0xC3 0x8E 0xC3 0x84. The same bytes give completely different results depending on which source character set is specified. So when connecting to the database through JDBC, this conversion relationship must be stated explicitly. In Informix JDBC, this is done with NEWCODESET.

In other words, we must state which character set the database uses to store the data, and which character set to use when the data is converted to Unicode on the Java client.
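The byte values quoted above can be reproduced with Python's standard codecs. This is only an illustration of the encoding arithmetic and does not involve the Informix JDBC driver. The helper name `to_utf8_hex` is made up for the example.

```python
# The four bytes stored in the database for the two Chinese characters.
raw = bytes([0xD6, 0xD0, 0xCE, 0xC4])


def to_utf8_hex(data, source_codec):
    """Decode the bytes with the given source codec, then show the UTF-8
    bytes of the resulting Unicode text as space-separated hex."""
    return data.decode(source_codec).encode("utf-8").hex(" ")


print(to_utf8_hex(raw, "cp936"))       # interpreted as Chinese text
print(to_utf8_hex(raw, "iso-8859-1"))  # interpreted as four Latin-1 characters
```

Decoding as cp936 yields two Chinese characters, while decoding as ISO 8859-1 yields four accented Latin characters, which is exactly the garbling that users see.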


Alternatively, change the character set of the database (set DB_LOCALE=zh_cn.GB18030-2000 and re-create the database), and then set DB_LOCALE and CLIENT_LOCALE as described in this article. If rebuilding the database is too costly in practice, consider the following steps to make ODBC support Chinese.


    1. Set the environment variable:
         IFMX_UNDOC_B168163=1
    2. Copy the en_us.8859-1 locale file to the zh_cn directory:
         cd $INFORMIXDIR/gls/lc11
         cp ./en_us/0333.lco ./zh_cn
    3. Restart IDS.

    Client: set the locale:
        DB_LOCALE=zh_cn.GB18030-2000
        CLIENT_LOCALE=zh_cn.GB18030-2000


For JDBC, we can solve this problem with NEWCODESET.



The NEWCODESET format is as follows:

    NEWCODESET=<JDK codeset>,<Informix codeset name>,<Informix codeset number>

The JDBC connection URL then looks like this:



    String url = "jdbc:informix-sqli://9.125.66.130:6346/dbname:INFORMIXSERVER=servername;NEWCODESET=gb18030-2000,8859-1,819;CLIENT_LOCALE=en_us.8859-1;DB_LOCALE=en_us.8859-1;";





Summary


In a Simplified Chinese application environment, use the zh_cn.GB18030-2000 character set. In an application environment that requires Unicode, you can use the en_us.utf8 character set.



Example: using the zh_cn.GB18030-2000 character set


    Database server side: before creating the database instance, set the environment variables
        DB_LOCALE=zh_cn.GB18030-2000
        CLIENT_LOCALE=zh_cn.GB18030-2000
    and then run CREATE DATABASE.

    Database client side: configure the following in Setnet32, ODBC, and JDBC connections:
        DB_LOCALE=zh_cn.GB18030-2000
        CLIENT_LOCALE=zh_cn.GB18030-2000


Example: using the UTF-8 character set


    Database server side: before creating the database instance, set the environment variables
        DB_LOCALE=en_us.utf8
        CLIENT_LOCALE=en_us.utf8
    and then run CREATE DATABASE.

    Database client side: configure the following in Setnet32, ODBC, and JDBC connections:
        DB_LOCALE=en_us.utf8
        CLIENT_LOCALE=en_us.utf8


In particular, set the environment variable DB_LOCALE to the planned character set before you create the database, because the character set of a database cannot be changed after it is created. The only way to change it is to re-create the database. If DB_LOCALE is not set, en_us.8859-1 is used by default.






