14 tips for high-performance Java persistence

Last Update:2017-11-14 Source: Internet

Author: User

A high-performance data access layer requires a lot of knowledge about the database, JDBC, JPA, and Hibernate. This article summarizes some important techniques that can be used to optimize enterprise applications.

1. SQL statement logging

If you use a framework that generates statements on your behalf, you should always verify the effectiveness and efficiency of each executed statement. An assertion mechanism that runs at test time is even better, because N+1 query problems can be caught before the code is even committed.
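As a rough illustration of a test-time assertion, the sketch below counts recorded statements and fails when more SELECTs ran than expected. The class `SqlStatementCounter` and its methods are hypothetical, not part of any library; it assumes that something else (for example a logging JDBC proxy) feeds each executed statement into `record`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical test helper: collects the SQL seen during a test and asserts on the count. */
class SqlStatementCounter {
    private final List<String> statements = new ArrayList<>();

    /** Assumed to be called by whatever intercepts statements, e.g. a logging JDBC proxy. */
    void record(String sql) {
        statements.add(sql.trim());
    }

    /** Number of recorded statements that start with SELECT (case-insensitive). */
    long selectCount() {
        long count = 0;
        for (String sql : statements) {
            if (sql.toLowerCase(Locale.ROOT).startsWith("select")) {
                count++;
            }
        }
        return count;
    }

    /** Fails the test when the number of SELECT statements differs from the expectation. */
    void assertSelectCount(long expected) {
        long actual = selectCount();
        if (actual != expected) {
            throw new AssertionError("Expected " + expected
                    + " SELECT statement(s) but recorded " + actual + ": " + statements);
        }
    }

    public static void main(String[] args) {
        SqlStatementCounter counter = new SqlStatementCounter();
        // One query for the parents, then one extra query per parent: the N+1 pattern.
        counter.record("SELECT id, title FROM post");
        for (int postId = 1; postId <= 3; postId++) {
            counter.record("SELECT id, review FROM post_comment WHERE post_id = " + postId);
        }
        try {
            counter.assertSelectCount(1);
        } catch (AssertionError e) {
            System.out.println("N+1 detected: " + e.getMessage());
        }
    }
}
```

The table names in `main` are made up for the demonstration. The point is only that the expected statement count becomes part of the test, so a regression shows up as a failing assertion.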

2. Connection Management

Database connections are expensive, so you should always use a connection pooling mechanism.

Because the number of connections is limited by the capabilities of the underlying database cluster, you need to release connections as soon as possible.

In performance tuning, you always have to measure, and setting the right connection pool size is no different. A tool like FlexyPool can help you find the right size, even after you have deployed the application to production.
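Releasing a connection as soon as possible usually comes down to scoping it tightly. The following is a minimal sketch using only the standard `javax.sql.DataSource` and try-with-resources; the class name and the `ConnectionWork` interface are hypothetical helpers introduced for illustration, and the `DataSource` is assumed to be backed by a connection pool.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

class ShortLivedConnectionSketch {

    /** Work to run while the connection is borrowed (hypothetical helper interface). */
    interface ConnectionWork<T> {
        T apply(Connection connection) throws SQLException;
    }

    /** Borrows a connection, runs the work, and always closes the connection afterwards. */
    static <T> T inConnection(DataSource pool, ConnectionWork<T> work) throws SQLException {
        try (Connection connection = pool.getConnection()) {
            return work.apply(connection);
        } // With a pooling DataSource, close() typically hands the connection back to the pool.
    }
}
```

Keeping slow non-database work (remote calls, file processing) outside the `inConnection` callback keeps the time each connection is held short.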

3. JDBC Batch Processing

JDBC batching allows us to send multiple SQL statements in a single database round trip. The performance gain is significant on both the driver side and the database side. PreparedStatements are very good candidates for batching, and some database systems (such as Oracle) support batching only for prepared statements.

JDBC defines a distinct API for batching (for example, PreparedStatement.addBatch and PreparedStatement.executeBatch). If you generate statements manually, you should therefore know from the very beginning whether batching will be used. With Hibernate, you can switch to batching with a single configuration setting.

Hibernate 5.2 offers Session-level batching, so it is even more flexible in this regard.
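A minimal sketch of the batching API with plain JDBC follows. The `post (title)` table, the class name, and the method name are assumptions made for the example; `addBatch` and `executeBatch` are the standard `PreparedStatement` methods.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class JdbcBatchInsertSketch {

    /**
     * Inserts the titles into a hypothetical post(title) table, sending one batch
     * every batchSize rows. Returns the number of executeBatch() calls made.
     */
    static int insertTitles(Connection connection, List<String> titles, int batchSize)
            throws SQLException {
        int batchesSent = 0;
        try (PreparedStatement statement =
                     connection.prepareStatement("INSERT INTO post (title) VALUES (?)")) {
            int pending = 0;
            for (String title : titles) {
                statement.setString(1, title);
                statement.addBatch();          // queue the row on the client side
                if (++pending == batchSize) {
                    statement.executeBatch();  // send the queued rows together
                    batchesSent++;
                    pending = 0;
                }
            }
            if (pending > 0) {                 // send the last, partially filled batch
                statement.executeBatch();
                batchesSent++;
            }
        }
        return batchesSent;
    }
}
```

With five titles and a batch size of two, the method calls `executeBatch` three times instead of executing five separate statements.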

4. Statement caching

Statement caching is one of the least-known performance optimizations that you can easily take advantage of. Depending on the underlying JDBC driver, you can cache PreparedStatements on the client side (the driver) or on the database side (the syntax tree or even the execution plan).
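How statement caching is switched on depends on the driver. As one possible illustration, the sketch below builds connection properties for MySQL Connector/J; choosing that driver is an assumption of this example, not something the article prescribes, and other drivers use different setting names, so check the documentation of the driver you actually use. The class and method names are hypothetical.

```java
import java.util.Properties;

class StatementCacheSettingsSketch {

    /** Builds connection properties that enable statement caching, assuming MySQL Connector/J. */
    static Properties mySqlStatementCaching(int cacheSize) {
        Properties properties = new Properties();
        // Use server-side prepared statements instead of client-side emulation.
        properties.setProperty("useServerPrepStmts", "true");
        // Let the driver cache prepared statements per connection.
        properties.setProperty("cachePrepStmts", "true");
        // How many prepared statements the driver may cache.
        properties.setProperty("prepStmtCacheSize", String.valueOf(cacheSize));
        // Longest SQL text (in characters) the driver will cache; 2048 is an arbitrary example value.
        properties.setProperty("prepStmtCacheSqlLimit", "2048");
        return properties;
    }
}
```

Such a `Properties` object can be passed to `DriverManager.getConnection(url, properties)` or to the equivalent setting of a connection pool.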

5. Hibernate identifiers

When Hibernate is used, the IDENTITY generator is not a good choice because it disables JDBC batch processing.

The TABLE generator is even worse because it uses a separate transaction to fetch new identifiers. This puts pressure on the underlying transaction log and on the connection pool, since a separate connection is required every time we need a new identifier.

SEQUENCE is the right choice, and even SQL Server has supported it since version 2012. For SEQUENCE identifiers, Hibernate has long offered optimizers, such as pooled or pooled-lo, that reduce the number of database round trips required to obtain new entity identifier values.
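The idea behind such optimizers can be pictured with a simplified in-memory model: one sequence call reserves a whole block of identifiers, and the following identifiers are handed out without touching the database. This is a sketch of the general principle only, not Hibernate's implementation; the class name and the simulated sequence are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

class PooledIdAllocatorSketch {
    // Stands in for a database sequence that advances by incrementSize per call.
    private final AtomicLong databaseSequence = new AtomicLong(1);
    private final int incrementSize;
    private long next;      // next identifier to hand out
    private long blockEnd;  // first identifier that is NOT part of the current block
    private int roundTrips; // how often the simulated sequence was called

    PooledIdAllocatorSketch(int incrementSize) {
        this.incrementSize = incrementSize;
    }

    /** Returns the next identifier, calling the simulated sequence only when the block is used up. */
    synchronized long nextId() {
        if (next == blockEnd) {
            long blockStart = databaseSequence.getAndAdd(incrementSize);
            roundTrips++;
            next = blockStart;
            blockEnd = blockStart + incrementSize;
        }
        return next++;
    }

    synchronized int roundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        PooledIdAllocatorSketch allocator = new PooledIdAllocatorSketch(50);
        for (int i = 0; i < 100; i++) {
            allocator.nextId();
        }
        System.out.println("Sequence calls for 100 ids: " + allocator.roundTrips());
    }
}
```

In this model, 100 identifiers with an increment size of 50 cost two simulated sequence calls, while an increment size of 1 would cost 100.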

6. Choosing the right column types

You should always use the right column types on the database side. The more compact the column type, the more entries can be accommodated in the database working set, and the better the indexes fit into memory. To this end, you should take advantage of database-specific types (such as inet for IPv4 addresses in PostgreSQL), especially since Hibernate is very flexible when it comes to implementing new custom types.

7. Relationships

Hibernate offers many relationship mapping types, but not all of them are equally efficient.

Avoid unidirectional collections and @ManyToMany Lists. If you really need entity collections, bidirectional @OneToMany associations are preferred. For @ManyToMany relationships, use Sets because they are more efficient in this case, or simply map the link table of the many-to-many relationship as an entity, which turns the @ManyToMany relationship into two bidirectional @OneToMany associations.

However, unlike queries, collections are less flexible because they cannot be paginated easily, which means we cannot use them when the number of child associations is high. For this reason, you should always question whether a collection is really necessary. In many cases, an entity query may be a better choice.

8. Inheritance

When it comes to inheritance, the impedance mismatch between object-oriented languages and relational databases becomes even more obvious. JPA provides SINGLE_TABLE, JOINED, and TABLE_PER_CLASS to handle inheritance mapping. Each strategy has its advantages and disadvantages.

SINGLE_TABLE performs best in terms of SQL statements, but we lose on data integrity because we cannot use NOT NULL constraints.

JOINED addresses the data integrity limitation at the cost of more complex statements. This strategy is fine as long as you do not use polymorphic queries or @OneToMany associations against base types. Its real strength lies in polymorphic @ManyToOne associations backed by a Strategy pattern on the data access layer.

Avoid TABLE_PER_CLASS because it does not generate efficient SQL statements.

9. Persistence Context size

When using JPA and Hibernate, always mind the size of the Persistence Context. Do not bloat it with too many managed entities. By limiting the number of managed entities, we get better memory management, and the default dirty checking mechanism becomes more efficient.
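One common way to bound the number of managed entities during bulk work is to flush and clear the Persistence Context at regular intervals. The sketch below shows the pattern against a small hypothetical interface, `PersistenceOps`, which stands in for the `persist`, `flush`, and `clear` methods of the JPA `EntityManager` so that the example compiles without a JPA dependency.

```java
import java.util.List;

class BoundedPersistenceContextSketch {

    /** Hypothetical stand-in for the EntityManager methods used by this sketch. */
    interface PersistenceOps {
        void persist(Object entity);
        void flush();
        void clear();
    }

    /**
     * Persists all entities while keeping at most chunkSize of them managed at a time.
     * Returns the number of flush-and-clear cycles that were performed.
     */
    static int persistInChunks(PersistenceOps context, List<?> entities, int chunkSize) {
        int cycles = 0;
        int managed = 0;
        for (Object entity : entities) {
            context.persist(entity);
            if (++managed == chunkSize) {
                context.flush();  // write pending changes to the database
                context.clear();  // detach everything so the context stays small
                cycles++;
                managed = 0;
            }
        }
        if (managed > 0) {        // handle the last, partially filled chunk
            context.flush();
            context.clear();
            cycles++;
        }
        return cycles;
    }
}
```

With a real `EntityManager`, the same loop would run inside a transaction, and the chunk size would typically be aligned with the JDBC batch size.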

10. Fetch only what is necessary

Fetching too much data is probably the number one cause of performance issues in the data access layer. One problem is that entity queries are used exclusively, even for read-only projections.

DTO projections are better suited for fetching custom views, while entities should be fetched only when the business flow needs to modify them.

EAGER fetching is the worst, and you should avoid anti-patterns such as Open Session in View.
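A DTO projection can be as small as an immutable class plus a JPQL constructor expression. The sketch below is illustrative only: the `Post` entity, its `id` and `title` attributes, and the `com.example` package are assumptions, and in a real project `PostSummary` would be a public class in that package so that the persistence provider can instantiate it.

```java
/** Read-only view of a post: only the two columns the screen needs. */
class PostSummary {
    private final long id;
    private final String title;

    public PostSummary(long id, String title) {
        this.id = id;
        this.title = title;
    }

    public long getId() {
        return id;
    }

    public String getTitle() {
        return title;
    }
}

class PostSummaryQuerySketch {
    /**
     * JPQL constructor expression: selects two attributes and builds one DTO per row
     * instead of loading managed Post entities. Entity and package names are hypothetical.
     */
    static final String SUMMARY_QUERY =
            "select new com.example.PostSummary(p.id, p.title) from Post p";
}
```

Because the query returns plain objects rather than managed entities, nothing is added to the Persistence Context and no dirty checking is needed for the result.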

11. Caching

Relational database systems use many in-memory buffer structures to avoid disk access. Database caching is often overlooked. By tuning the database engine properly so that the working set resides in memory instead of being fetched from disk all the time, we can significantly reduce response time.

Application-level caching is not optional for many enterprise applications. It can reduce response time and also provide a read-only secondary store for when the database is down for maintenance or because of a serious system failure.

The second-level cache is very useful for reducing the response time of read-write transactions, especially in master-slave replication architectures. Depending on application requirements, Hibernate lets you choose between READ_ONLY, NONSTRICT_READ_WRITE, READ_WRITE, and TRANSACTIONAL.

12. Concurrency Control

The choice of transaction isolation level is very important for both performance and data integrity. For multi-request web flows, to avoid lost updates, you should use optimistic locking with detached entities or with an EXTENDED persistence context.

To avoid optimistic locking false positives, you can use versionless optimistic concurrency control or split entities based on their read-based and write-based property sets.
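The mechanism that prevents lost updates can be shown with a small in-memory model of a versioned row. This is a sketch of the version check only, with a hypothetical class and a map standing in for the table; with JPA the same check is performed by the provider when an entity has a version attribute.

```java
import java.util.HashMap;
import java.util.Map;

class OptimisticLockingSketch {

    /** In-memory stand-in for a table row that carries a version column. */
    static final class Row {
        final String value;
        final int version;

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Long, Row> table = new HashMap<>();

    synchronized void insert(long id, String value) {
        table.put(id, new Row(value, 0));
    }

    synchronized Row read(long id) {
        return table.get(id);
    }

    /**
     * Mimics: UPDATE t SET value = ?, version = version + 1 WHERE id = ? AND version = ?
     * Returns false when the row is missing or was changed since it was read.
     */
    synchronized boolean update(long id, String newValue, int expectedVersion) {
        Row current = table.get(id);
        if (current == null || current.version != expectedVersion) {
            return false; // stale data: the caller must reload and retry or report a conflict
        }
        table.put(id, new Row(newValue, expectedVersion + 1));
        return true;
    }
}
```

If two users read version 0 of the same row, the first update succeeds and bumps the version to 1, so the second update with the stale version is rejected instead of silently overwriting the first.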

13. Unleash database query capabilities

Just because you use JPA or Hibernate does not mean that you should avoid native queries. You should take advantage of window functions, CTEs (common table expressions), CONNECT BY, and DISTINCT queries.

These constructs let you avoid fetching too much data only to transform it later in the application layer. If the database can do the processing, you fetch only the final result, which saves a lot of disk I/O and network overhead. To avoid overloading the master node, you can use database replication and have multiple slave nodes, so that data-intensive tasks are executed on a slave node instead of the master.

14. Scale up and scale out

Relational databases scale very well. If Facebook, Twitter, Pinterest, or StackOverflow can scale their database systems, it is very likely that you can scale your enterprise application to meet its particular business needs.

Database replication and sharding are good ways to increase throughput. You should take advantage of these battle-tested architecture patterns to scale your enterprise applications.
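At the application level, replication and sharding both come down to a routing decision. The sketch below is a deliberately simplified, hypothetical router: writes go to the master, reads rotate across replicas, and a shard is picked from a key by a modulo. Node names, the round-robin policy, and the modulo scheme are assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class ReplicaAndShardRouterSketch {
    private final String master;
    private final List<String> replicas;
    private final AtomicInteger nextReplica = new AtomicInteger();

    ReplicaAndShardRouterSketch(String master, List<String> replicas) {
        this.master = master;
        this.replicas = replicas;
    }

    /** Read-write work goes to the master; read-only work rotates across the replicas. */
    String nodeFor(boolean readOnly) {
        if (!readOnly || replicas.isEmpty()) {
            return master;
        }
        int index = Math.floorMod(nextReplica.getAndIncrement(), replicas.size());
        return replicas.get(index);
    }

    /** Modulo-based sharding: the same key always maps to the same shard index. */
    static int shardFor(long customerId, int shardCount) {
        return (int) Math.floorMod(customerId, (long) shardCount);
    }
}
```

A plain modulo scheme remaps most keys when the number of shards changes, so real deployments often use a lookup table or consistent hashing instead.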

Conclusion

A high-performance data access layer must resonate with the underlying database system. Understanding the inner workings of the relational database and of the data access frameworks in use can make the difference between a high-performance enterprise application and one that barely crawls.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
