Database character set and encoding conversion

Source: Internet
Author: User

For a database management system to support a given character encoding, three layers must support it:

Database server support.

Data access interface support.

Client tool support.

 

1 Database server character encoding

A database server supports an encoding when it can receive characters in that encoding from clients (including identifiers and character field values), store them, and return them to clients, and when it can convert them to other encodings (for example, from UTF-8 to GBK).

1.1 Specify the database server encoding

PostgreSQL:

Specify the encoding when creating the database:

CREATE DATABASE ... ENCODING ...

Valid encoding values include SQL_ASCII, UTF8, EUC_CN, and others.

1.2 View the database encoding

PostgreSQL:

SHOW server_encoding;

2 Data access interface encoding

A data access interface supports an encoding when it can read and write characters in that encoding correctly, without losing or corrupting data.

Take the JDBC interface for example:

A JDBC driver typically sets client_encoding for the session, deriving the value from the JVM's file.encoding property.

When writing, the driver converts each string to a byte stream encoded in client_encoding and passes it to the server; in effect, String.getBytes(client_encoding).

When reading, the driver uses client_encoding to construct a String object from the byte stream received from the server and returns it to the application from getString; in effect, new String(byte[], ..., client_encoding).
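The round trip described above can be sketched in Python, where str.encode and bytes.decode play the roles of Java's String.getBytes and the String(byte[], ...) constructor. This is a minimal illustration, not driver code: the function names to_wire and from_wire are hypothetical, and GBK is only an assumed client_encoding.

```python
# Hypothetical stand-ins for what a driver does on each side of the wire.
CLIENT_ENCODING = "gbk"  # assumed session client_encoding for this sketch


def to_wire(text: str, client_encoding: str = CLIENT_ENCODING) -> bytes:
    """Encode an application string before it is sent to the server."""
    return text.encode(client_encoding)


def from_wire(data: bytes, client_encoding: str = CLIENT_ENCODING) -> str:
    """Decode bytes received from the server into an application string."""
    return data.decode(client_encoding)


sent = to_wire("\u8d1d\u94a2")   # the two-character example used in section 4.1
received = from_wire(sent)       # same encoding on both sides, so nothing is lost
print(sent.hex(), received)
```

As long as both directions use the same client_encoding, the application gets back exactly the string it sent.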

3 Client encoding

A client tool supports an encoding when it can display characters in that encoding read from the database and can submit characters in that encoding to the server.

3.1 Set the client encoding for a session (PostgreSQL)

SET client_encoding TO 'value';

3.2 View the client encoding

SHOW client_encoding;

4 Byte sequences of characters under different encodings

The following shows how several characters are stored in the database under different encodings. In PostgreSQL, SELECT decode(name, 'escape') FROM test displays the bytes stored on the database server.

4.1 Example: "贝钢" (U+8D1D U+94A2)

GBK encoded as: B1B4 B8D6

UTF-8 encoded as: E8B49D E992A2

GB18030 encoded as: B1B4 B8D6

4.2 Example: the private-use characters U+E81C and U+E819

GBK encoded as: FE57 FE54

UTF-8 encoded as: EEA09C EEA099

GB18030 encoded as: 8336C938 8336C935
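The byte sequences listed above can be reproduced outside the database. The following Python sketch uses the standard codecs; the helper name hex_bytes is hypothetical. For the private-use code points only UTF-8 is computed, on the assumption that legacy codec implementations may map this area differently from the values listed.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes as uppercase hex, one group per character."""
    return " ".join(ch.encode(encoding).hex().upper() for ch in text)


word = "\u8d1d\u94a2"  # the two characters of section 4.1
for enc in ("gbk", "utf-8", "gb18030"):
    print(enc, hex_bytes(word, enc))

# Private-use code points of section 4.2, UTF-8 only (see the note above).
print("utf-8", hex_bytes("\ue81c\ue819", "utf-8"))
```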

5 Encoding conversion example

The following concrete example assumes that the client uses GBK/GB18030, the interface uses GB18030 at both ends, and the database server uses UTF-8:



Two conversions are involved:

Conversion between the application's encoding and the connection's client encoding

Conversion between the connection's client encoding and the database server encoding

[Figure: encoding conversion path; the orange-red arrows indicate ...]

For example, the byte sequences of the private-use characters U+E81C and U+E819 in the database server under different encodings are:

GBK encoded as: FE57 FE54

UTF-8 encoded as: EEA09C EEA099

GB18030 encoded as: 8336C938 8336C935

Interface:

The programming interface must ensure that the characters it sends to the server are encoded in the client_encoding of the current session.

It can set client_encoding to the encoding of the characters it obtains from the application.

Alternatively, it can read the client_encoding of the current session and convert the characters obtained from the application to that encoding.
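The two options for the interface can be illustrated with a small Python sketch. The helper transcode is hypothetical and the encodings are assumptions chosen to match the example; a real interface would do this inside the driver.

```python
def transcode(data: bytes, source_encoding: str, client_encoding: str) -> bytes:
    """Re-encode application bytes into the session's client_encoding."""
    return data.decode(source_encoding).encode(client_encoding)


app_bytes = bytes.fromhex("b1b4b8d6")  # GBK bytes handed over by the application

# Option 1: client_encoding is set to the application's encoding,
# so the bytes pass through unchanged.
option1 = transcode(app_bytes, "gbk", "gbk")

# Option 2: the session already uses another client_encoding (UTF-8 here),
# so the interface converts before sending.
option2 = transcode(app_bytes, "gbk", "utf-8")

print(option1.hex(), option2.hex())
```

Either way, the bytes that reach the server match the session's client_encoding, which is the guarantee the interface has to provide.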

Server:

The server converts between client_encoding and server_encoding.

The conversion follows the database's encoding conversion algorithm; a character that has no counterpart in the target encoding is converted to a question mark ("?").
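The substitution rule can be imitated with Python's errors="replace" mode. This models the rule as stated here and is not the database's own conversion code; the helper name convert_lossy and the sample characters are assumptions for illustration.

```python
def convert_lossy(text: str, target_encoding: str) -> bytes:
    """Encode text, replacing characters the target encoding lacks with '?'."""
    return text.encode(target_encoding, errors="replace")


# U+8D1D exists in GBK; the Hangul syllable U+D55C does not.
result = convert_lossy("\u8d1d\ud55c", "gbk")
print(result)
```

Once a character has been replaced this way, the original cannot be recovered from the converted data.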

6 Common problems

Characters are parsed with the wrong encoding, resulting in garbled text.

Characters do not exist in both character sets, so the affected characters become "?".
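The first problem is easy to reproduce: decoding bytes with an encoding other than the one that produced them yields garbled text. A minimal Python sketch, assuming UTF-8 data that is mistakenly read as GBK:

```python
original = "\u8d1d\u94a2"          # the two characters of section 4.1
stored = original.encode("utf-8")  # bytes as written by a UTF-8 producer

# Wrong: the reader assumes GBK, so the bytes are regrouped into other characters.
garbled = stored.decode("gbk", errors="replace")

# Right: the reader uses the encoding the bytes were written in.
correct = stored.decode("utf-8")

print(garbled, correct)
```

The bytes themselves are intact in this case, so the fix is to decode with the correct encoding rather than to repair the data.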
