Problem Reproduction:
1. PG client:

postgres=# CREATE TABLE text_test (id int, info text);
CREATE TABLE
postgres=# INSERT INTO text_test VALUES (1, E'\0x00');
ERROR:  invalid byte sequence for encoding "UTF8": 0x00
2. SQL Server generates the data:

CREATE TABLE test_varchar (id int, name varchar(50));
INSERT INTO test_varchar VALUES (1, 'name' + CHAR(0));

Fetching this data through a Java program and inserting it into PG raises the same error:
invalid byte sequence for encoding "UTF8": 0x00
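As a quick illustration (a plain-Java sketch, not the actual migration code), the value SQL Server builds with CHAR(0) contains a real NUL character that survives the trip into a Java String:

```java
public class NulDemo {
    public static void main(String[] args) {
        // Java equivalent of the string SQL Server builds with 'name' + CHAR(0)
        String s = "name" + (char) 0;

        System.out.println(s.length());        // 5: the NUL is a real character
        System.out.println((int) s.charAt(4)); // 0: the trailing unit is 0x00
    }
}
```

It is exactly this trailing 0x00 that PG later rejects on insert.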
At first we suspected that the error came from a failed character-set conversion from GB2312 to UTF-8. UTF-8 is a variable-length encoding of 1 to 6 bytes (per the original design; the current standard limits it to 4). Its encoding rules are as follows:
Bits | Last code point | Byte 1   | Byte 2   | Byte 3   | Byte 4   | Byte 5   | Byte 6
   7 | U+007F          | 0xxxxxxx |          |          |          |          |
  11 | U+07FF          | 110xxxxx | 10xxxxxx |          |          |          |
  16 | U+FFFF          | 1110xxxx | 10xxxxxx | 10xxxxxx |          |          |
  21 | U+1FFFFF        | 11110xxx | 10xxxxxx | 10xxxxxx | 10xxxxxx |          |
  26 | U+3FFFFFF       | 111110xx | 10xxxxxx | 10xxxxxx | 10xxxxxx | 10xxxxxx |
  31 | U+7FFFFFFF      | 1111110x | 10xxxxxx | 10xxxxxx | 10xxxxxx | 10xxxxxx | 10xxxxxx
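These byte patterns are easy to verify in Java, whose `getBytes(StandardCharsets.UTF_8)` performs the encoding (a standalone sketch; the class name is ours):

```java
import java.nio.charset.StandardCharsets;

public class Utf8Patterns {
    public static void main(String[] args) {
        // U+0041 'A' fits in 7 bits  -> one byte, pattern 0xxxxxxx
        // U+00E9 'é' needs 11 bits   -> two bytes, 110xxxxx 10xxxxxx
        // U+0000 NUL also fits in 7 bits -> the single byte 0x00
        for (String s : new String[] {"A", "\u00E9", "\u0000"}) {
            byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("%02X ", b));
            }
            System.out.printf("U+%04X -> %s%n", s.codePointAt(0), hex.toString().trim());
        }
    }
}
```

Running it prints `U+0041 -> 41`, `U+00E9 -> C3 A9`, and `U+0000 -> 00`: the NUL character encodes to a single 0x00 byte, fully within the rules.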
A 0x00 byte is therefore perfectly valid UTF-8 (it matches the single-byte pattern above), which surprised us. We then found two pieces of evidence that confirmed where the problem really lies:
1. From Stack Overflow:

"PostgreSQL doesn't support storing NULL (\0x00) characters in text fields (this is obviously different from the database NULL value, which is fully supported). If you need to store the NULL character, you must use a bytea field, which should store anything you want but won't support text operations on it. Given that PostgreSQL doesn't support it in text values, there's no good way to get it to remove it. You could import your data into bytea and later convert it to text using a special function (in Perl or something, maybe?), but it's likely going to be easier to do this in preprocessing before you load it."

Source: http://stackoverflow.com/questions/1347646/postgres-error-on-insert-error-invalid-byte-sequence-for-encoding-utf8-0x0
2. From the SQL Server bcp documentation, the list of supported terminators:

Terminating character | Indicated by
Tab | \t (the default field terminator)
Newline character | \n (the default row terminator)
Carriage return/line feed | \r\n
Backslash | \\
Null terminator (nonvisible terminator) | \0
Any printable character (control characters are not printable, except null, tab, newline, and carriage return) | *, A, t, l, and so on
String of up to 10 printable characters, including some or all of the terminators listed | **\t**, end, !!!!!!!!!!, \t-\n, and so on
Source:http://msdn.microsoft.com/en-us/library/ms191485.aspx
This confirmed that PG handles the NUL character differently from SQL Server, which is exactly why the error appears.
The PG code that raises the error is in pg_verify_mbstr_len (src/backend/utils/mb/wchar.c):

    if (!IS_HIGHBIT_SET(*mbstr))
    {
        if (*mbstr != '\0')
        {
            mb_len++;
            mbstr++;
            len--;
            continue;
        }
        if (noError)
            return -1;
        report_invalid_encoding(encoding, mbstr, len);
    }

with the supporting macros:

    #define IS_HIGHBIT_SET(ch)  ((unsigned char) (ch) & HIGHBIT)
    #define HIGHBIT             (0x80)
report_invalid_encoding is the function that produces the error message:

invalid byte sequence for encoding "UTF8": 0x00

The real cause is now clear: when *mbstr is 0x00, !IS_HIGHBIT_SET(*mbstr) is true, so the code checks whether *mbstr is '\0'; since it is, the loop does not accept the byte and instead falls through to report_invalid_encoding, raising the error. So the problem occurs because PG and SQL Server treat the NUL character differently.
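That control flow can be restated as a small Java sketch (the names mirror PG's C macros, but this is an illustration, not PG source):

```java
public class HighbitCheck {
    static final int HIGHBIT = 0x80;

    // Java analogue of PG's IS_HIGHBIT_SET(ch) macro
    static boolean isHighbitSet(byte ch) {
        return ((ch & 0xFF) & HIGHBIT) != 0;
    }

    public static void main(String[] args) {
        byte nul = 0x00;
        // The high bit is clear, so PG enters the single-byte (ASCII) branch...
        System.out.println(!isHighbitSet(nul)); // true
        // ...but the byte equals '\0', so instead of accepting it the loop
        // falls through to report_invalid_encoding.
        System.out.println(nul == 0);           // true
    }
}
```

Any other byte with the high bit clear (e.g. 'a', 0x61) passes the '\0' test and is accepted; only 0x00 is singled out for rejection.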
Solutions:

1. Fix the source data in SQL Server:

Update: this seems to work:

select * from TABLE
where UNICODE(SUBSTRING(naughtyField, LEN(naughtyField), 1)) = 0

so:

UPDATE TABLE
SET naughtyField = SUBSTRING(naughtyField, 1, LEN(naughtyField) - 1)
WHERE UNICODE(SUBSTRING(naughtyField, LEN(naughtyField), 1)) = 0

Source: http://stackoverflow.com/questions/3533320/sql-server-remove-end-string-character-0-from-data
2. Fix it in the application: after fetching the data from SQL Server, strip the NUL characters before inserting into PG, the same transformation as in the first method.
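A minimal sketch of the second approach, assuming the values arrive as Java Strings from the SQL Server driver (the helper name stripNul is ours):

```java
public class StripNul {
    // Remove every NUL character before handing the value to PG
    static String stripNul(String s) {
        return (s == null) ? null : s.replace("\u0000", "");
    }

    public static void main(String[] args) {
        String fromSqlServer = "name" + (char) 0; // what 'name' + CHAR(0) yields
        String cleaned = stripNul(fromSqlServer);

        System.out.println(cleaned);          // name
        System.out.println(cleaned.length()); // 4
    }
}
```

Unlike the SQL-side fix, which trims only a trailing NUL, `replace` removes NULs anywhere in the string, so it also handles embedded ones.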