Analysis and Solution of the Speed Bottleneck in Java Database Access
Source: Internet
Author: User
Raising the speed bottleneck problem

In enterprise-level Java applications, database access is an essential link. As the distribution center for data resources, the database usually sits at the back of the enterprise software system and is accessed by the applications in front of it. In a Java system, applications access the database through the JDBC (Java Database Connectivity) interface. JDBC supports basic functions such as establishing connections, executing SQL statements, and processing results. As long as you follow the specification when using JDBC, these functions work correctly.

In some cases, however, query efficiency is a real headache for developers. A program written according to the specification does not perform as expected, and the execution efficiency of the whole software suffers. At first we blamed the slow loading and execution of Java bytecode. The general improvement in hardware that followed showed that this view was mistaken: it had not identified the real root cause.

This article dissects, step by step, the mechanism by which JDBC accesses the database, analyzes the causes of the speed bottleneck, and proposes ideas and methods for solving it within the existing Java technical framework.

The mechanism of JDBC database access

Figure 1

Figure 2

Figure 1 and Figure 2 describe the four driver modes through which Java applications access the database via the JDBC interface, that is, the four underlying ways of implementing the JDBC interface.
We will introduce these modes one by one.

Mode 4: The branch on the left of Figure 1 is Mode 4. It is generally a pure Java driver based on the database's native protocol, and it can be implemented by the database vendor. It directly uses the network protocol of the DBMS (Database Management System) and is a practical solution for an enterprise intranet.

Mode 3: The branch on the right of Figure 1 is Mode 3, which is also a pure Java driver. Unlike Mode 4, it is based on an intermediate network protocol: JDBC calls are converted to the intermediate protocol and then to the DBMS protocol. The intermediate protocol layer acts as middleware for database access and can connect to many types of databases, which makes this the most flexible JDBC mode. It suits enterprise intranets. To support the public Internet, you need to add security and access through the firewall.

Mode 1: The branch on the left of Figure 2 is Mode 1, usually the JDBC-ODBC bridge provided by Sun. It provides JDBC access through one or more ODBC drivers. The ODBC driver, which in many cases is the database client itself, must be installed on the client machine. This mode is therefore appropriate when automatic download and installation of the Java program is not important, for experiments, or when no other JDBC driver is available.

Mode 2: The branch on the right of Figure 2 is Mode 2. Like the JDBC-ODBC bridge, it needs code installed on the client, but the driver is partly implemented in Java. It converts JDBC calls into calls to the client API of the database (Oracle, Sybase, Informix, DB2, and so on).

The modes described above differ, so JDBC drivers can be divided into four categories by implementation mode. Some readers may have noticed that different JDBC drivers give different access speeds.
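As a rough summary of the four modes described above, the following Java sketch records two properties the descriptions mention: whether the driver is pure Java, and whether native code has to be installed on the client. The enum, its field names, and the helper method are illustrative assumptions and are not part of the JDBC API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative summary of the four JDBC driver modes; not part of the JDBC API. */
class JdbcDriverModes {

    enum Mode {
        MODE_1("JDBC-ODBC bridge over an installed ODBC driver", false, true),
        MODE_2("Partly Java driver over the database's native client API", false, true),
        MODE_3("Pure Java driver that reaches the DBMS through middleware", true, false),
        MODE_4("Pure Java driver that uses the DBMS network protocol directly", true, false);

        final String description;
        final boolean pureJava;
        final boolean needsNativeClientCode;

        Mode(String description, boolean pureJava, boolean needsNativeClientCode) {
            this.description = description;
            this.pureJava = pureJava;
            this.needsNativeClientCode = needsNativeClientCode;
        }
    }

    /** Returns the modes that need no native code installed on the client. */
    static List<Mode> deployableWithoutClientInstall() {
        List<Mode> result = new ArrayList<>();
        for (Mode mode : Mode.values()) {
            if (!mode.needsNativeClientCode) {
                result.add(mode);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": " + mode.description
                    + " (pure Java: " + mode.pureJava + ")");
        }
        System.out.println("No client install needed: " + deployableWithoutClientInstall());
    }
}
```

The application code itself does not change between modes. In practice the mode is decided by which driver is on the classpath and which JDBC URL is used.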
Why do different JDBC drivers differ in speed? The answer is that different applications call for JDBC drivers of different modes, so you must choose the driver carefully for each application.

Most DBMSs support the ODBC specification proposed by Microsoft, so Mode 1 is a reasonable choice while you are designing and implementing the software. Its easy configuration lets you set aside the question of which JDBC driver to choose, so that your Java program works as early as possible.

The vendor of a commercial DBMS generally provides a JDBC driver for its own database, and this driver uses Mode 4. The advantage of this mode is that it is closely integrated with the database itself and is implemented in pure Java. It should be the first choice for enterprise-level applications. Oracle, SilverStream, DataDirect, and other companies provide such drivers, which are often rated the most efficient and reliable. Occasionally things are more troublesome. For example, Microsoft does not provide a JDBC driver for MS SQL, so you need to look for a suitable Mode 4 driver on Sun's website (http://industry.java.sun.com/products/jdbc/drivers). DataDirect (http://www.datadirect-technologies.com/jdbc/jdbc.asp), mentioned above, provides a Mode 4 driver that supports MS SQL, but you need to pay $750 for it.

Mode 3 is also implemented in pure Java. Compared with Mode 4, its advantage is support for multiple databases, which reflects its flexibility. In large enterprise applications there is often more than one back-end database, and the databases come from different vendors. Mode 3 drivers also often provide enterprise-level features, such as SSL security, distributed transaction processing, and centralized management, which can be a great help for special requirements.
Whether to choose Mode 3 depends on whether you need these extended features and multi-DBMS support.

To summarize Modes 3 and 4: both are drivers implemented in pure Java, so they need no additional client software from the database vendor, and they run on any standard Java platform with high performance and reliability.

With the three modes above understood, Mode 2 is easier to explain. You can think of it as a compromise that borrows from the other three:

1. Like Mode 1, it uses a native client library, which speeds up data access, but it drops the ODBC standard in favor of vendor-specific performance extensions.
2. Like Mode 3, it has a layered structure. The upper layer is implemented in Java, which helps cross-platform use and support for multiple databases, while the lower layer is native code, which speeds up execution.
3. Like Mode 4, it is closely integrated with the database, and it is partly implemented in Java. This open, high-performance approach makes good use of the database's capabilities and is strongly recommended by the major database vendors.

Although Mode 2 requires you to install a native library on the client, that is a small price to pay for the improvement in database access speed.

Next, we look at the four modes together and summarize the selection order. This assumes you have a choice. If a mode is unavailable, move on to the next one in the order.

1. Experimental environment. Choose an easy-to-configure driver where possible, which helps Java program development, and choose the JDBC mode again once the deployment environment is fixed. Selection order: 1 > 2 > 3 > 4
2. Small enterprise environment. Multi-database support is not needed, so some advantages of Modes 2 and 3 do not apply. A Mode 4 driver is strongly recommended. Selection order: 4 > 2 = 3 > 1
3. Large enterprise environment that requires multi-database support. Modes 2 and 3 each have their merits, but in most cases you will choose the faster Mode 2. Selection order: 2 > 3 > 4 > 1

For drivers of the same mode from different vendors, theory gives little basis for ranking their efficiency. You can only compare them with tools such as benchmarks to guide your choice. Because no third-party comparison data is available for the moment, you need to understand the material above thoroughly and then work these questions out for yourself.

Optimizing the SQL statement format in Java programs

You may still be looking for a suitable JDBC driver, or you may be excited that the driver you downloaded passed its tests in the early hours of the morning. Neither means that optimizing the program itself no longer matters. Remember that optimizing the whole software system includes optimizing every link in it. Otherwise your earlier efforts may be wasted. I will not discuss Java algorithms here. Instead, I will briefly explain why the choice of SQL statement format matters and how to choose the format that works in your favor. Take a look at the following two code fragments.

Code fragment 1:

    // stmt is an ordinary java.sql.Statement; the SQL text is sent as a complete string
    String updateString = "UPDATE COFFEES SET SALES = 75 " + "WHERE COF_NAME LIKE 'Colombian'";
    stmt.executeUpdate(updateString);

Code fragment 2:

    // con is an open java.sql.Connection; the ? placeholders are filled in before execution
    PreparedStatement updateSales = con.prepareStatement("UPDATE COFFEES SET SALES = ? WHERE COF_NAME LIKE ?");
    updateSales.setInt(1, 75);
    updateSales.setString(2, "Colombian");
    updateSales.executeUpdate();

The difference is that fragment 2 uses a PreparedStatement object, while fragment 1 uses an ordinary Statement object. A PreparedStatement not only contains the SQL statement but, in most cases, has also been precompiled.
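Fragment 2 pays off once the same statement is executed more than once. As a minimal sketch, assuming an already open java.sql.Connection and the COFFEES table from the fragments above, the hypothetical helper below prepares the statement once and re-executes it with new parameter values. The class name, method name, and the map argument are illustrative and do not come from the original article.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

/** Hypothetical helper: one prepared statement, many executions. */
class CoffeeSalesUpdater {

    static final String UPDATE_SQL = "UPDATE COFFEES SET SALES = ? WHERE COF_NAME LIKE ?";

    /**
     * Prepares the UPDATE once, then executes it once per map entry
     * (coffee name -> sales figure). Returns the total number of rows updated.
     */
    static int updateAll(Connection con, Map<String, Integer> salesByCoffee) throws SQLException {
        int rowsUpdated = 0;
        // try-with-resources closes the statement even if an execution fails
        try (PreparedStatement updateSales = con.prepareStatement(UPDATE_SQL)) {
            for (Map.Entry<String, Integer> entry : salesByCoffee.entrySet()) {
                updateSales.setInt(1, entry.getValue());
                updateSales.setString(2, entry.getKey());
                rowsUpdated += updateSales.executeUpdate();
            }
        }
        return rowsUpdated;
    }
}
```

Only the parameter values change between executions. The SQL text is handed to the driver a single time, which is where any precompilation benefit would come from.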
Because the statement is usually precompiled, the DBMS only has to run it at execution time instead of compiling it first. When you need to execute a statement many times, using a PreparedStatement greatly reduces the running time and so speeds up database access. It is also more convenient: you do not have to rebuild the SQL text each time, and only need to change the parameter values before executing the statement again. Whether to choose a PreparedStatement depends on whether statements with the same syntax are executed many times with only the parameter values differing. If a statement is executed only once, a PreparedStatement is no different from an ordinary Statement and shows no precompilation advantage.

Optimizing the design patterns for database access in the software model

When I read the J2EE Blueprints and the JDO draft, I noticed the impact that access patterns have on database access, so in this article I want to explain how to choose a software model that suits your own requirements. The designers of the J2EE Blueprints use the MVC (Model-View-Controller) architecture in the Java Pet Store sample application, which provides a setting for many J2EE design patterns. I will discuss three design patterns: Data Access Object, Fast Lane Reader, and Page-by-Page Iterator. They offer ideas that can be used at the system design stage to speed up data access.

Data Access Object

The Data Access Object pattern separates business logic from data access logic and adapts the resource being accessed, so that the resource type can be changed easily and independently.
Business components that depend on specifics of the underlying data resource (such as a particular database vendor) often mix business logic with data access logic. They can only use one type of resource and are very hard to reuse with others, so they serve only a limited market. The DAO (Data Access Object) pattern abstracts the data access logic out of the EJB into a separate interface. The EJB carries out its business logic in terms of the operations of that interface, and the interface is implemented by a DAO object for the data resource actually used.

In the Java Pet Store example, the OrderEJB component accesses the database through its associated OrderDAO class and concentrates on implementing business logic. At deployment time, you can configure one class (OrderDAOCS, OrderDAOOracle, or OrderDAOSybase) as the implementation of OrderDAO, and OrderEJB does not need to change. Figure 3 illustrates this:

Figure 3 Data Access Object

The Data Access Object pattern increases the flexibility, resource independence, and extensibility of data access, at the cost of some added complexity. We will not discuss its other aspects here.

Fast Lane Reader

The Fast Lane Reader (FLR) pattern bypasses EJBs to speed up read-only data access. In some cases, efficient data access matters more than obtaining the latest data. In the Java Pet Store, when a user browses the store's catalog, it is not crucial that the screen matches the database content exactly. Fast display and retrieval matter much more.

The FLR pattern speeds up the retrieval of large lists of data items from a resource. Instead of going through EJBs, it accesses the data directly through a DAO, which avoids the overhead of EJBs (such as remote method calls, transaction management, and data serialization).
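The two patterns described so far can be sketched together in plain Java. This is a simplified illustration and not the Pet Store code: the in-memory DAO stands in for a vendor-specific JDBC implementation, CatalogService stands in for an EJB with container overhead, and all class and method names here are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Data access interface: callers depend only on this, not on a database vendor. */
interface CatalogDao {
    List<String> findProductNames(String category);
}

/** Hypothetical implementation; a real one would run vendor-specific JDBC code. */
class InMemoryCatalogDao implements CatalogDao {
    private final List<String[]> rows = new ArrayList<>();

    void add(String category, String name) {
        rows.add(new String[] { category, name });
    }

    @Override
    public List<String> findProductNames(String category) {
        List<String> names = new ArrayList<>();
        for (String[] row : rows) {
            if (row[0].equals(category)) {
                names.add(row[1]);
            }
        }
        return names;
    }
}

/** Stand-in for a business component (an EJB in the article) with per-call overhead. */
class CatalogService {
    private final CatalogDao dao;
    int transactionsStarted = 0; // placeholder for container work such as transactions

    CatalogService(CatalogDao dao) {
        this.dao = dao;
    }

    List<String> listProducts(String category) {
        transactionsStarted++;
        return dao.findProductNames(category);
    }
}

/** Fast lane reader: a read-only path that calls the DAO directly. */
class CatalogFastLaneReader {
    private final CatalogDao dao;

    CatalogFastLaneReader(CatalogDao dao) {
        this.dao = dao;
    }

    List<String> browse(String category) {
        return dao.findProductNames(category);
    }
}

class DaoFastLaneDemo {
    public static void main(String[] args) {
        InMemoryCatalogDao dao = new InMemoryCatalogDao();
        dao.add("FISH", "Angelfish");
        dao.add("DOGS", "Bulldog");
        CatalogService service = new CatalogService(dao);
        CatalogFastLaneReader reader = new CatalogFastLaneReader(dao);
        System.out.println("Fast lane read: " + reader.browse("FISH"));
        System.out.println("Service read: " + service.listProducts("FISH"));
    }
}
```

Swapping InMemoryCatalogDao for another CatalogDao implementation changes neither CatalogService nor CatalogFastLaneReader, which is the resource independence the DAO pattern aims for. The reader returns the same data as the service while skipping the service's per-call work.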
In the Java Pet Store example, when a user browses the catalog, the data items are loaded from the database through CatalogDAO (instead of CatalogEJB). CatalogDAO here acts as a Fast Lane Reader, making read access faster, as shown in Figure 4:

Figure 4 Fast Lane Reader

The Fast Lane Reader pattern differs from the DAO pattern. FLR is an optimization, but it does not replace the original access mechanism. It supplements it. FLR is appropriate when you frequently read large lists of data and do not need the very latest data.

Page-by-Page Iterator

To access a large remote list efficiently, the Page-by-Page Iterator pattern retrieves its elements one sub-list of value objects at a time (value object being a design pattern that improves the efficiency of remote transfers). Distributed database applications often require the user to look through a long list of data items, such as a catalog or a set of search results. In these cases, providing the whole list at once is often unnecessary (the user is not interested in all of the items) or impossible (there is not enough space). In addition, retrieving a list through entity beans is very expensive. One overhead comes from using remote finder methods to collect the requested beans. A larger overhead comes from the remote call made to each bean to fetch its data for the user. Through the iterator, the client object can retrieve one sub-list, or page, of value objects at a time, each page sized to what the client needs. The program therefore uses fewer resources to meet the client's immediate needs.

In the Java Pet Store example, the JSP page product.jsp displays only part of a list at any time and obtains the data items from ProductItemListTag (a Page-by-Page Iterator). When the client wants to see other items in the list, product.jsp calls the iterator again to obtain them. The process is shown in Figure 5:

Figure 5 Page-by-Page Iterator
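The paging idea can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions: the in-memory list stands in for a large server-side result set, and the class and method names are hypothetical and are not taken from the Pet Store source.

```java
import java.util.ArrayList;
import java.util.List;

/** Hands out a large list one page (sub-list) at a time. */
class PageByPageIterator<T> {
    private final List<T> source; // stands in for the server-side list
    private final int pageSize;
    private int next = 0;         // index of the first item of the next page

    PageByPageIterator(List<T> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    boolean hasNextPage() {
        return next < source.size();
    }

    /** Returns a copy of the next page; the list is empty once the source is exhausted. */
    List<T> nextPage() {
        int end = Math.min(next + pageSize, source.size());
        List<T> page = new ArrayList<>(source.subList(next, end));
        next = end;
        return page;
    }

    public static void main(String[] args) {
        List<String> results = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            results.add("record-" + i);
        }
        PageByPageIterator<String> pages = new PageByPageIterator<>(results, 10);
        while (pages.hasNextPage()) {
            System.out.println(pages.nextPage());
        }
    }
}
```

Each call transfers only one page of items, so a client that stops after the first page never pays for the rest of the list.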
The design patterns above show that, in particular circumstances, the database access model can be optimized to meet user needs and improve access efficiency. The idea is that when your hardware environment may have bottlenecks, you can optimize the software model to meet your needs. The three patterns apply in the following situations:

Data Access Object: business logic needs to be separated from data access logic; the data source type must be selectable at deployment time; changing the type of data source must not affect how business objects or other clients access data.

Fast Lane Reader: large lists of data need frequent read-only access; access to the very latest data is not critical.

Page-by-Page Iterator: large server-side lists are accessed; at any time the user is interested in only part of the list; the whole list does not fit on the client display; the whole list does not fit in memory; transmitting the whole list takes too long.

When displaying the product catalog, we chose the combination of DAO and FLR because the conditions for both were met: business logic and data access logic need to be separated, and the access is frequent, read-only, and not sensitive to freshness. In this case the two patterns show their full strengths. For content retrieval, we choose PPI, because thousands of records may match, yet the user does not want to read them all at once. The user reads ten at a time, or finds what he wanted in the first ten records and moves on to other pages. There is no need to transmit thousands of records in one go, so the conditions for PPI are met, and the pattern delivers its benefit without affecting data access elsewhere.
When designing a software model, the overall framework can apply well-proven, general design patterns, which both speeds up building the model and eases integration with other systems. When we hit a bottleneck, however, we need to adjust the local design to optimize the whole system. The three patterns above supplement the original system. They do not change the overall framework much, but they break through particular bottlenecks (bottlenecks are usually local), so that our products serve users better.

So far, we have discussed how to solve the problem at the software level. Be aware, though, that if your hardware environment is very poor (it can hardly run Java) or very good (spare storage, very fast processors, and ample network bandwidth), the approaches above will be of little help. In the former case, I suggest upgrading your hardware to the configuration recommended by the software vendor (I strongly advise against the minimum configuration), so that the application server, database, Java, and other software can run comfortably. In the latter case, I have nothing to add: spending money is the best solution to this problem.

This article does not cover two very important concepts, the thread pool and the notification buffer. I believe they address the bottleneck caused by heavy access traffic within a short period of time, which cannot be treated as a simple speed bottleneck, so I will analyze that special case and propose solutions in the next article. Perhaps that is the point you care about more, and you think your problem lies there. That is a good way to think, and you may have grasped the key to your problem. Still, I suggest you read this article to understand the speed bottleneck better, so that you can tell normal-state problems from transient ones and choose a different approach for each.
In short, this article is about the speed bottleneck in the normal state, and the next article will be about the transient state. I hope you gain something from both.

JDO (Java Data Objects) is an API worth watching. It defines a new data access model and draws directly on the DAO design pattern. Different data sources have different data access technologies and expose different APIs to developers, and JDO was created to solve this problem. It provides plug-and-play data access and a Java-centric view of persistent information (including enterprise data and locally stored data). Developers can therefore concentrate on creating the classes that implement business logic and the classes that represent data from the data sources. The mapping between these classes and the data sources is done by experts in the EIS domain. If you are interested in JDO, I will write a third article to introduce it in detail and provide example applications.