Processing of CLOB, BLOB, and JSON types when porting JPA to PostgreSQL

Source: Internet
Author: User
Tags: postgresql

I. Overview

The current project was originally developed on Oracle and is now being ported to PostgreSQL. Since JPA is used for object persistence, the leader always assumed the migration would be easy, but the actual process ran into quite a few problems.

Some of these problems are caused by nonstandard entity (EJB) definitions, such as mapping NUMBER(1) to Boolean, NUMBER(n) to String, DATE to String, and so on. Because Oracle has powerful implicit type conversion, it raises no error as long as the data fits the format, but once the application is ported to PostgreSQL, all kinds of type-mismatch exceptions are thrown. Still, as long as the definitions are corrected one by one according to the standard, these problems are easy to solve.
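As a rough illustration, a standardized mapping might look like the following sketch; the entity, table, and column names are hypothetical and not taken from the project:

import java.util.Date;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
import javax.persistence.Temporal;
import javax.persistence.TemporalType;

// Hypothetical entity: field types chosen to match the column types, instead of relying
// on Oracle's implicit conversions (e.g. NUMBER(1) -> Boolean, DATE -> String).
@Entity
@Table(name = "T_DEMO")
public class Demo {

    @Id
    private Long id;

    // Formerly a NUMBER(1) column mapped to Boolean; map it to an integer type
    // (or change the column to boolean) so PostgreSQL accepts it without implicit casts.
    @Column(name = "ENABLED")
    private Integer enabled;

    // Formerly a NUMBER(n) column mapped to String; map it to a numeric type.
    @Column(name = "AMOUNT")
    private Long amount;

    // Formerly a DATE column mapped to String; map it to a temporal type.
    @Temporal(TemporalType.TIMESTAMP)
    @Column(name = "CREATED_AT")
    private Date createdAt;

    // Getters and setters omitted for brevity.
}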

Several other issues are more frustrating, especially those involving the CLOB, BLOB, and JSON types. When porting to PostgreSQL, it is never as simple as replacing CLOB with text, BLOB with bytea, and VARCHAR2(...) CONSTRAINT ... CHECK (... IS JSON) with json, and then leaving the rest to JPA. The remaining troubles have to be tackled one by one.

This article is a practical summary based on information found online plus my own experience during the migration.

II. Handling CLOB and BLOB

1. Symptoms

For the Clob type (which corresponds to text in PostgreSQL; no distinction is made below), writes succeed without exception, but what actually gets stored is a numeric value rather than the byte[] content. On reads, some rows come back correctly while others throw: column XXX is of type text but expression is of type bigint ... For the Blob type (corresponding to bytea, likewise no distinction), writes simply fail with: column xxx is of type bytea but expression is of type bigint ...
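For context, the kind of mapping that runs into this looks roughly like the following hypothetical entity (not the project's actual code):

import java.sql.Blob;
import java.sql.Clob;

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Lob;

// Hypothetical entity with LOB fields; with the PostgreSQL JDBC driver, Hibernate ends up
// writing these through the large-object path described in the next section.
@Entity
public class Document {

    @Id
    private Long id;

    @Lob
    private Clob content;    // column migrated from Oracle CLOB to PostgreSQL text

    @Lob
    private Blob attachment; // column migrated from Oracle BLOB to PostgreSQL bytea
}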

2. The two ways PostgreSQL handles LOB data

To explain why, you first need to know that PostgreSQL handles LOB data in two ways: OID + large object, and byte[] (bytea). For detailed instructions see:

Reference 1:https://jdbc.postgresql.org/documentation/80/binary-data.html

Evidently, JPA hands what we expect to be written as a binary array to PostgreSQL in OID + large object mode instead. So when it writes a Clob or Blob, the corresponding column actually receives the value of the OID (a bigint), while the byte[] content is written into the shared pg_largeobject table. The difference is that the bigint OID is successfully converted to text, whereas the conversion to bytea fails.
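A minimal JDBC sketch of the two write paths may make this clearer; the tables demo_bytea(name text, content bytea) and demo_lo(name text, content oid) are hypothetical:

import java.io.ByteArrayInputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Illustration of the two LOB write paths; table and column names are hypothetical.
public class LobWriteModes {

    // Path 1: setBytes()/setBinaryStream() writes the content inline into a bytea column.
    static void writeAsBytea(Connection conn, byte[] payload) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO demo_bytea (name, content) VALUES (?, ?)")) {
            ps.setString(1, "sample");
            ps.setBytes(2, payload);
            ps.executeUpdate();
        }
    }

    // Path 2: setBlob() makes the PostgreSQL JDBC driver create a large object in
    // pg_largeobject and store only its OID (a number) in the column; this is why a
    // bigint shows up where byte[] content was expected. The target column should be of
    // type oid, and large-object operations generally require autocommit to be off.
    static void writeAsLargeObject(Connection conn, byte[] payload) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO demo_lo (name, content) VALUES (?, ?)")) {
            ps.setString(1, "sample");
            ps.setBlob(2, new ByteArrayInputStream(payload), payload.length);
            ps.executeUpdate();
        }
    }
}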

To verify this, I queried pg_largeobject for the values that had been stored via the text column, and they are indeed the expected byte[] content, which is why some of the JPA reads succeed. For the records that fail, my guess is that it is related to the number of bytes, because records above 1 MB succeeded while some at the KB level failed (the exact threshold is unknown). For reasons of space these details are not expanded here; interested readers can verify them for themselves.
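For anyone who wants to reproduce the check, a rough JDBC sketch follows; the table and column names are hypothetical, and reading pg_largeobject directly may require sufficient privileges:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Rough verification sketch: read the OID that ended up in the text column, then look at
// the bytes actually stored for that OID in pg_largeobject.
public class LargeObjectCheck {

    static void inspect(Connection conn, long rowId) throws SQLException {
        long oid;
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT content::bigint FROM demo_text WHERE id = ?")) {
            ps.setLong(1, rowId);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return;
                }
                oid = rs.getLong(1);
            }
        }
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT pageno, data FROM pg_largeobject WHERE loid = ? ORDER BY pageno")) {
            ps.setLong(1, oid);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    byte[] page = rs.getBytes("data");
                    System.out.println("page " + rs.getInt("pageno") + ": " + page.length + " bytes");
                }
            }
        }
    }
}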

Apart from some performance advantages, OID + large object has at least three drawbacks: (1) the shared pg_largeobject table raises permission problems; (2) the corresponding rows in pg_largeobject are not deleted automatically when the source row is deleted; (3) it is more restrictive with respect to transactions. It therefore does not meet the project's requirements, but why does JPA always treat LOBs as OID + large object?

3. The mismatch between Hibernate and PostgreSQL

Taking bytea as an example, PostgreSQL's two modes are reached through the JDBC setBinaryStream() and setBlob() interfaces respectively. The expected logic would be for Hibernate to distinguish the two correctly and invoke this PostgreSQL feature accordingly. Unfortunately, Hibernate assumes that the same call path works for writing this data on every database and, for whatever reason, has no intention of catering to PostgreSQL's special case (this does not seem likely to change), so the errors described above occur.

For a detailed explanation, see:

Reference 2:http://www.codeweblog.com/postgresql-hibernate-on-bytea-and-oid-mapping-problem/

For text the situation is much the same, except that a different pair of JDBC interfaces is involved; it is not expanded on here.

4. Solutions

Reference 2 above proposes two solutions for bytea:

    1. When defining the entity, change the Blob field type to byte[];
    2. Override the useInputStreamToInsertBlob() method of Hibernate's PostgreSQL dialect class.

Neither idea was very successful here, possibly because of version and environment differences that would require further changes. I did not pursue them, partly because of time pressure and partly because the problem was solved in passing while dealing with text (see below).

For text, Reference 3 presents three ways to solve the problem:

Reference 3:https://stackoverflow.com/questions/28588311/correct-jpa-annotation-for-postgresqls-text-type-without-hibernate-annotations

    1. When defining the entity, remove the @Lob annotation and treat the field as a plain String (see the sketch after this list);
    2. Keep the @Lob annotation and add @Type(type = "org.hibernate.type.TextType") (also sketched below);
    3. Leave the entity unchanged, override the remapSqlTypeDescriptor() method of the PostgreSQL dialect class, and handle Clob as LongVarchar.
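As a sketch, approaches 1 and 2 would look roughly like this on a hypothetical entity (approach 3, the one eventually chosen, is shown in full below and needs no entity changes):

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Lob;

import org.hibernate.annotations.Type;

// Hypothetical entity showing approaches 1 and 2 side by side.
@Entity
public class NoteEntity {

    @Id
    private Long id;

    // Approach 1: drop @Lob and map the text column as a plain String.
    @Column(name = "REMARK")
    private String remark;

    // Approach 2: keep @Lob but pin Hibernate's type to TextType explicitly.
    @Lob
    @Type(type = "org.hibernate.type.TextType")
    @Column(name = "SUMMARY")
    private String summary;
}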

In actual testing, all three ideas achieve the goal. Because the leader did not approve of changing the annotations (on the grounds of staying consistent with the version running against Oracle), idea 3 was adopted, handling Blob as LongVarbinary along the way.

Because the project runs PostgreSQL 9.4, the dialect inherits from PostgreSQL94Dialect; in the general case PostgreSQL9Dialect can be chosen instead. The code is:

package com.xxx.pgdialect;

import java.sql.Types;

import org.hibernate.dialect.PostgreSQL94Dialect;
import org.hibernate.type.descriptor.sql.LongVarbinaryTypeDescriptor;
import org.hibernate.type.descriptor.sql.LongVarcharTypeDescriptor;
import org.hibernate.type.descriptor.sql.SqlTypeDescriptor;

public class PgDialect extends PostgreSQL94Dialect {

    @Override
    public SqlTypeDescriptor remapSqlTypeDescriptor(SqlTypeDescriptor sqlTypeDescriptor) {
        switch (sqlTypeDescriptor.getSqlType()) {
            case Types.CLOB:
                return LongVarcharTypeDescriptor.INSTANCE;
            case Types.BLOB:
                return LongVarbinaryTypeDescriptor.INSTANCE;
        }
        return super.remapSqlTypeDescriptor(sqlTypeDescriptor);
    }
}

You can then replace PostgreSQL94Dialect with this class in persistence.xml, i.e. set the hibernate.dialect property to com.xxx.pgdialect.PgDialect.

III. Handling JSON

What makes the JSON type special is that, first, Oracle actually treats it as a special VARCHAR2 or CLOB, and second, neither Hibernate nor JDBC defines a JSON type. Therefore, when the exception "column xxx is of type json but expression is of type character varying ..." appears, the earlier approach cannot simply be copied.
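For illustration, a mapping along the following lines (hypothetical names) triggers exactly that exception, because Hibernate binds the String parameter as character varying while the column is json:

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;

// Hypothetical entity: a plain String field bound to a PostgreSQL json column.
// Without a varchar -> json cast on the database side, inserts fail with the
// "expression is of type character varying" error quoted above.
@Entity
public class Profile {

    @Id
    private Long id;

    @Column(name = "SETTINGS", columnDefinition = "json")
    private String settings;
}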

Through Google, a valuable piece of information turned up:

Reference 4:https://stackoverflow.com/questions/15974474/mapping-postgresql-json-column-to-hibernate-value-type

It offers several kinds of solutions, briefly summarized as follows:

    1. When defining the PostgreSQL table structure, change json to text, mimicking Oracle's approach;
    2. Extend Hibernate with a custom JSON user type, adding (or overriding) the methods needed to handle JSON;
    3. Switch the JDBC driver to pgjdbc-ng, whose @Converter support can handle the conversion between JSON and text;
    4. In the PostgreSQL database, create implicit or explicit casts so that PostgreSQL automatically converts between json and text.

Idea 1 requires the application itself to guarantee that the data conforms to the JSON format, which is too risky, so it was rejected. Idea 2 has a number of concrete implementations (interested readers can study them), but they all require modifying the entity annotations, which the leader vetoed. Idea 3 is too dependent on one particular product and, like idea 2, requires annotation changes, so it was also rejected. That leaves only idea 4, which is in fact also the simplest.

On the psql command line, simply create four casts converting between text and json and between varchar and json:

CREATE CAST (text AS json)    WITH INOUT AS ASSIGNMENT;
CREATE CAST (json AS text)    WITH INOUT AS ASSIGNMENT;
CREATE CAST (varchar AS json) WITH INOUT AS ASSIGNMENT;
CREATE CAST (json AS varchar) WITH INOUT AS ASSIGNMENT;

After executing these, the "column xxx is of type json but expression is of type character varying ..." exception no longer appears.

A further guess: the xml type can probably be handled in a similar way.

IV. Summary
    • When defining entities (EJBs), follow standard mappings; this avoids most simple type-mismatch errors;
    • For Clob and Blob, remap them to LongVarchar and LongVarbinary in a custom dialect;
    • For JSON, add implicit or explicit casts on the database side.
