14 Tips for high-performance Java persistence

A high-performance data access layer requires a lot of knowledge about database internals, JDBC, JPA, and Hibernate. This article summarizes some of the most important techniques you can use to optimize enterprise applications.

1. SQL statement logging

If you use a framework that generates statements on your behalf, you should always validate the effectiveness and efficiency of each executed statement. A test-time assertion mechanism is even better, because you can catch N+1 query problems before you commit the code.
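A test-time assertion on the number of executed statements can be sketched as follows. This is a minimal illustration, not a library API: the class SqlStatementCounter and its methods are hypothetical, and the sketch assumes that something (for example a JDBC proxy around the DataSource) calls record() for every statement that reaches the database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: collects executed SQL and lets a test assert on the count.
public class SqlStatementCounter {
    private final List<String> statements = new ArrayList<>();

    // Assumed to be called once per executed statement (e.g. by a JDBC proxy).
    public void record(String sql) {
        statements.add(sql);
    }

    // Counts the recorded statements that start with SELECT (case-insensitive).
    public long selectCount() {
        return statements.stream()
                .filter(s -> s.trim().toLowerCase(Locale.ROOT).startsWith("select"))
                .count();
    }

    // Fails the test when more (or fewer) SELECTs ran than the test expected.
    public void assertSelectCount(long expected) {
        long actual = selectCount();
        if (actual != expected) {
            throw new AssertionError(
                "Expected " + expected + " SELECT statement(s) but recorded " + actual);
        }
    }
}
```

A test that loads a list of parents and then touches each child collection would call assertSelectCount(1) at the end. With an N+1 problem, the recorded count is 1 + N and the assertion fails.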

2. Connection management

Database connections are expensive to establish, so you should always use a connection pooling mechanism.

Because the number of connections is limited by the capacity of the underlying database cluster, you need to release each connection as quickly as possible.
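One way to keep the lease time short is to scope the connection to a single unit of work with try-with-resources, so that it goes back to the pool as soon as the work returns. The sketch below assumes a pooled javax.sql.DataSource; the class ShortLease and the Work interface are hypothetical names introduced for illustration.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

public class ShortLease {

    // Hypothetical callback type: the work to run while the connection is held.
    interface Work<T> {
        T apply(Connection connection) throws SQLException;
    }

    // Borrows a connection, runs the work, and always closes the connection.
    // With a pooled DataSource, close() returns the connection to the pool.
    static <T> T withConnection(DataSource dataSource, Work<T> work) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            return work.apply(connection);
        }
    }
}
```

Anything that does not need the database (remote calls, rendering, heavy computation) is best kept outside the callback, so the connection is not held while that work runs.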

In performance tuning, you always have to measure, and sizing the connection pool is no different. A tool like FlexyPool can help you find the right size, even after you have deployed your application to production.

3. JDBC batching

JDBC batching allows us to send multiple SQL statements in a single database round trip. The performance gain is significant on both the driver and the database side. PreparedStatements are a very good fit for batching, and some database systems, such as Oracle, support batching only for prepared statements.

Because JDBC defines a distinct API for batching (for example, PreparedStatement.addBatch and PreparedStatement.executeBatch), if you generate statements manually, you should know from the beginning whether batching should be used. With Hibernate, you can switch to batching with a single configuration setting.
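With plain JDBC, the batching API mentioned above is typically used as in the sketch below. The table post, its title column, and the class name JdbcBatchInsert are assumptions made for the example; the return value (the number of executeBatch calls) is only there to make the number of batches visible.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class JdbcBatchInsert {

    // Inserts all titles, sending them in groups of batchSize.
    // Returns how many times executeBatch() was called.
    static int insertTitles(Connection connection, List<String> titles, int batchSize)
            throws SQLException {
        String sql = "INSERT INTO post (title) VALUES (?)"; // hypothetical table
        int batchesSent = 0;
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            int pending = 0;
            for (String title : titles) {
                statement.setString(1, title);
                statement.addBatch();               // queue the row, no round trip yet
                if (++pending == batchSize) {
                    statement.executeBatch();       // send the queued rows together
                    batchesSent++;
                    pending = 0;
                }
            }
            if (pending > 0) {                      // flush the last, partial batch
                statement.executeBatch();
                batchesSent++;
            }
        }
        return batchesSent;
    }
}
```

For five rows and a batch size of 2, the method calls executeBatch three times instead of executing five separate statements.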

Hibernate 5.2 adds session-level batching, so it is even more flexible in this area.
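As a rough illustration of the configuration side, the settings below are the Hibernate properties usually involved in enabling batching. The class name HibernateBatchSettings is hypothetical, and the batch size passed in is a value to be measured rather than a recommendation. For the session-level variant, Hibernate 5.2 exposes Session#setJdbcBatchSize, which is not shown here because it needs the Hibernate API on the classpath.

```java
import java.util.Properties;

public class HibernateBatchSettings {

    // Builds the properties that would be handed to the Hibernate bootstrap.
    static Properties batchingProperties(int batchSize) {
        Properties properties = new Properties();
        // Number of statements grouped into one JDBC batch.
        properties.setProperty("hibernate.jdbc.batch_size", String.valueOf(batchSize));
        // Group statements by entity type so that more of them can share a batch.
        properties.setProperty("hibernate.order_inserts", "true");
        properties.setProperty("hibernate.order_updates", "true");
        return properties;
    }
}
```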

4. Statement caching

Statement caching is one of the least-known performance optimizations that you can easily take advantage of. Depending on the underlying JDBC driver, PreparedStatements can be cached on the client (driver) side or on the database side (the syntax tree or even the execution plan).
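To make the client-side idea concrete, the sketch below shows a small least-recently-used cache keyed by the SQL string. It is a conceptual model only: real drivers and pools ship their own implementations and settings, and the class StatementCacheSketch and its preparer callback are hypothetical. A real cache would also close the statements it evicts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Conceptual model of a driver-side statement cache; S stands for the prepared statement type.
public class StatementCacheSketch<S> {
    private final Map<String, S> cache;
    private int misses;

    public StatementCacheSketch(final int capacity) {
        // accessOrder = true keeps the least recently used entry first.
        this.cache = new LinkedHashMap<String, S>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, S> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns the cached statement for this SQL, preparing it only on a miss.
    public S prepare(String sql, Function<String, S> preparer) {
        S statement = cache.get(sql);
        if (statement == null) {
            misses++;                        // the expensive parse/prepare happens here
            statement = preparer.apply(sql);
            cache.put(sql, statement);
        }
        return statement;
    }

    public int misses() {
        return misses;
    }
}
```

The sketch also shows why statements should use bind parameters: SQL text with inlined literal values produces a different cache key on every execution and never hits the cache.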

5. Hibernate identifiers

When using Hibernate, the IDENTITY generator is not a good choice because it disables JDBC batching of inserts.

The TABLE generator is even worse because it uses a separate transaction to fetch a new identifier, which puts pressure on the underlying transaction log and on the connection pool, since a separate connection is required every time we need a new identifier.

SEQUENCE is the right choice, and even SQL Server has supported it since version 2012. For SEQUENCE identifiers, Hibernate has long offered optimizers such as pooled or pooled-lo, which reduce the number of database round trips required to obtain a new entity identifier value.
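The idea behind these optimizers can be sketched in a few lines. The code below is a simplified model in the spirit of pooled-lo, not Hibernate's implementation: it assumes the database sequence is defined with an increment equal to incrementSize, treats each value returned by the sequence as the lowest identifier of a block, and uses a LongSupplier as a stand-in for the sequence call. The class name PooledLoSketch is hypothetical.

```java
import java.util.function.LongSupplier;

public class PooledLoSketch {
    private final LongSupplier sequence;  // stand-in: each call is one database round trip
    private final int incrementSize;      // assumed to match the sequence INCREMENT BY
    private long next;
    private long upperExclusive;          // 0 initially, which forces the first fetch

    public PooledLoSketch(LongSupplier sequence, int incrementSize) {
        this.sequence = sequence;
        this.incrementSize = incrementSize;
    }

    // Hands out identifiers from the current block, fetching a new block when it is used up.
    public synchronized long nextId() {
        if (next >= upperExclusive) {
            next = sequence.getAsLong();            // lowest identifier of the new block
            upperExclusive = next + incrementSize;
        }
        return next++;
    }
}
```

With an increment size of 5, generating 12 identifiers costs 3 sequence calls instead of 12.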

6. Selecting the right column types

You should always use the right column types on the database side. The more compact the column type, the more entries can be accommodated in the database working set, and the better the indexes fit in memory. To this end, you should take advantage of database-specific types (such as inet for IPv4 addresses in PostgreSQL), especially since Hibernate is very flexible when it comes to implementing new custom types.

7. Relationships

Hibernate comes with many relationship mapping types, but not all of them are equally efficient.


You should avoid unidirectional collections and @ManyToMany lists. If you do need an entity collection, prefer a bidirectional @OneToMany association. For @ManyToMany relationships, use Sets, because they are more efficient in this case, or simply map the link (join) table as an entity and convert the @ManyToMany relationship into two bidirectional @OneToMany associations.
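A bidirectional @OneToMany association is usually paired with helper methods that keep both sides in sync. The sketch below shows that shape with hypothetical Post and PostComment classes. The JPA annotations appear only as comments so that the example compiles without the persistence API on the classpath; in a real mapping they would be actual annotations on @Entity classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BidirectionalSketch {

    // Parent side; would be an @Entity in a real mapping.
    static class Post {
        // @OneToMany(mappedBy = "post", cascade = CascadeType.ALL, orphanRemoval = true)
        private final List<PostComment> comments = new ArrayList<>();

        List<PostComment> getComments() {
            return Collections.unmodifiableList(comments);
        }

        // Sets both sides so the in-memory state matches what will be stored.
        void addComment(PostComment comment) {
            comments.add(comment);
            comment.post = this;
        }

        void removeComment(PostComment comment) {
            comments.remove(comment);
            comment.post = null;
        }
    }

    // Child side; it owns the foreign key column.
    static class PostComment {
        // @ManyToOne(fetch = FetchType.LAZY)
        private Post post;

        Post getPost() {
            return post;
        }
    }
}
```

Because the child side owns the foreign key, adding a comment translates into a single insert, without the extra link-table or update statements that unidirectional collections tend to produce.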

However, unlike queries, collections are less flexible because they cannot be easily paginated, which means we cannot use them when the number of child associations is rather high. For this reason, you should always question whether a collection is really necessary. In many situations, an entity query is a better alternative.

8. Inheritance

When it comes to inheritance, the impedance mismatch between object-oriented languages and relational databases becomes even more apparent. JPA offers SINGLE_TABLE, JOINED, and TABLE_PER_CLASS to handle inheritance mapping, and each strategy has its pros and cons.

SINGLE_TABLE performs best in terms of SQL statements, but we lose on the data integrity side because we cannot use NOT NULL constraints on subclass-specific columns.

JOINED addresses the data integrity limitation at the cost of more complex statements. As long as you don't use polymorphic queries or @OneToMany associations against base types, this strategy is fine. Its real strength comes from polymorphic @ManyToOne associations backed by a Strategy pattern on the data access layer side.

You should avoid TABLE_PER_CLASS because it does not generate efficient SQL statements.

9. The size of the persistence context

When using JPA and Hibernate, you should always mind the size of the persistence context. For this reason, you should never bloat it with a large number of managed entities. By restricting the number of managed entities, we gain better memory management, and the default dirty checking mechanism becomes more efficient.
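A common way to keep the persistence context small during bulk work is to flush and clear it after every fixed-size chunk. The sketch below shows only that control flow. The nested PersistenceContext interface is a hypothetical stand-in for the three EntityManager operations it names (persist, flush, clear), introduced so the example runs without a JPA provider; ChunkedPersist is likewise an illustrative name.

```java
import java.util.List;

public class ChunkedPersist {

    // Hypothetical stand-in for the EntityManager methods used below.
    interface PersistenceContext {
        void persist(Object entity);
        void flush();
        void clear();
    }

    // Persists all entities while never keeping more than chunkSize of them managed.
    static void persistAll(PersistenceContext context, List<?> entities, int chunkSize) {
        int pending = 0;
        for (Object entity : entities) {
            context.persist(entity);
            if (++pending == chunkSize) {
                context.flush();   // write the pending changes to the database
                context.clear();   // detach everything so memory and dirty checking stay bounded
                pending = 0;
            }
        }
        if (pending > 0) {         // handle the last, partial chunk
            context.flush();
            context.clear();
        }
    }
}
```

Choosing a chunk size equal to the JDBC batch size is a reasonable starting point, since each flush then maps onto whole batches.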

10. Fetching only what is necessary

Fetching too much data is probably the number one cause of data access layer performance problems. One issue is that entity queries are used exclusively, even for read-only projections.

DTO projections are better suited for fetching custom views, while entities should be fetched only when the business flow needs to modify them.
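A DTO projection can be expressed with a JPQL constructor expression, as in the sketch below. The PostSummary class and the Post entity with id and title attributes are assumptions made for the example. Only the DTO and the query text are shown; the query would be executed through EntityManager#createQuery with PostSummary.class as the result type.

```java
// Hypothetical read-only view of a Post: just the columns the screen needs.
public class PostSummary {
    private final Long id;
    private final String title;

    // JPQL constructor expressions call a matching public constructor.
    public PostSummary(Long id, String title) {
        this.id = id;
        this.title = title;
    }

    public Long getId() {
        return id;
    }

    public String getTitle() {
        return title;
    }

    // The constructor expression needs the fully qualified class name of the DTO.
    static final String JPQL =
        "select new " + PostSummary.class.getName() + "(p.id, p.title) from Post p";
}
```

The resulting SQL selects only the two projected columns, and the returned objects are not managed, so they add nothing to the persistence context.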

EAGER fetching is the worst, and you should also avoid anti-patterns such as Open Session in View.

11. Caching


Relational database systems use many in-memory buffer structures to avoid disk access. Database caching is very often overlooked. We can lower response times significantly by properly tuning the database engine so that the working set resides in memory and is not fetched from disk all the time.

Application-level caching is not optional for many enterprise applications. It can reduce response times while also offering a read-only secondary store for when the database is down for maintenance or because of a serious system failure.

The second-level cache is very useful for reducing read-write transaction response times, especially in master-slave replication architectures. Depending on the application's requirements, Hibernate allows you to choose between READ_ONLY, NONSTRICT_READ_WRITE, READ_WRITE, and TRANSACTIONAL.

12. Concurrency control

The choice of transaction isolation level is of paramount importance for both performance and data integrity. For multi-request web flows, to avoid lost updates, you should use optimistic locking with detached entities or an EXTENDED persistence context.
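The version check behind optimistic locking can be modeled in memory as follows. This is a sketch of the principle, not of Hibernate's code: the Row class plays the role of a database row with a version column, update() mimics the conditional UPDATE and returns the affected row count, and the SQL text with its post table is a hypothetical example.

```java
public class OptimisticLockSketch {

    // The kind of statement issued for a versioned entity (hypothetical table).
    static final String UPDATE_SQL =
        "UPDATE post SET title = ?, version = version + 1 WHERE id = ? AND version = ?";

    // In-memory stand-in for one row of the post table.
    static final class Row {
        String title;
        int version;
    }

    // Applies the change only if nobody modified the row since it was read.
    // Returns the update count: 1 on success, 0 when the version no longer matches.
    static synchronized int update(Row row, int expectedVersion, String newTitle) {
        if (row.version != expectedVersion) {
            return 0;              // stale data: the caller must reload and retry, or fail
        }
        row.title = newTitle;
        row.version++;
        return 1;
    }
}
```

If two users read the row at version 0 and both submit a change, the first update succeeds and bumps the version to 1. The second one matches no row, which is how the lost update is detected instead of silently overwriting the first change.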

To avoid optimistic locking false positives, you can use versionless optimistic concurrency control or split entities based on write-based property sets.

13. Unleashing database query capabilities

Just because you use JPA or Hibernate does not mean that you should not use native queries. You should take advantage of window functions, CTEs (common table expressions), CONNECT BY, and PIVOT.

These constructs allow you to avoid fetching too much data only to transform it later in the application layer. If you can let the database do the processing, you fetch just the end result, thereby saving a lot of disk I/O and network overhead. To avoid overloading the master node, you can use database replication and have multiple slave nodes available, so that data-intensive tasks are executed on a slave rather than on the master.
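As a small example of pushing the work to the database, the sketch below ranks rows with a window function instead of sorting and numbering them in application code. The post table and its title and score columns are assumptions, as is the class name RankedPosts. Plain JDBC is used here to keep the example free of extra dependencies; with JPA, the same SQL could be passed to EntityManager#createNativeQuery.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class RankedPosts {

    // The ranking is computed by the database; only the final rows cross the network.
    static final String SQL =
        "SELECT title, RANK() OVER (ORDER BY score DESC) AS ranking FROM post";

    // Returns one formatted line per post, for example "1. Some title".
    static List<String> fetch(Connection connection) throws SQLException {
        List<String> rows = new ArrayList<>();
        try (Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(SQL)) {
            while (resultSet.next()) {
                rows.add(resultSet.getInt("ranking") + ". " + resultSet.getString("title"));
            }
        }
        return rows;
    }
}
```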

14. Scaling up and scaling out

Relational databases scale very well. If Facebook, Twitter, Pinterest, or StackOverflow can scale their database systems, there is a good chance you can scale an enterprise application to meet its particular business requirements.

Database replication and sharding are very good ways to increase throughput, and you should take advantage of these battle-tested architectural patterns to scale your enterprise application.
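At its simplest, sharding needs a deterministic rule that maps a key to one of the database nodes. The sketch below uses a plain hash-modulo rule on a hypothetical numeric user identifier. It is only one possible routing scheme, and real deployments often prefer range-based or directory-based routing so that shards can be added without remapping most keys.

```java
public class ShardRouter {

    // Maps a user identifier to a shard index in the range [0, shardCount).
    static int shardFor(long userId, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        // floorMod keeps the result non-negative even when the hash is negative.
        return Math.floorMod(Long.hashCode(userId), shardCount);
    }
}
```

All rows that belong to the same user land on the same shard, so queries scoped to one user touch a single node.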

Conclusion

A high-performance data access layer must resonate with the underlying database system. Knowing the inner workings of a relational database and of the data access frameworks in use can make the difference between a high-performance enterprise application and one that barely crawls.
