MIS Performance Optimization: Frequently Asked Questions and Solutions (a summary post to help project teams with performance tuning)


Recently I helped several of the company's project teams with performance optimization in different areas, and found that the projects share some common problems. This article summarizes these problems and their corresponding solutions, for other project teams' reference.

FAQ One: Pages open very slowly; in some projects, opening a single page takes more than 20 seconds.

Optimization steps:

    1. Reduce the number of requests per page: use the browser to trace every request issued after the page opens, identify the requests that need optimization, and reduce the number of round trips to the server.
    2. Optimize each request: check the requests one by one and see which ones take the most time.
      If a request is for a JS file, consider using the compressed version (for example, if you are using ExtJS, ship ext-all.js at 1.9 MB rather than ext-all-debug.js at 4.5 MB).
      Enable caching for static resources wherever possible.
    3. Minimize the number of database accesses behind each request: this step is back-end optimization.
      After a request reaches the server, it performs a series of operations: initializing the current user, roles, permissions, and current module, then business logic, logging, and so on.
      The first four operations are needed by practically every page, so use techniques such as Session or Cache to keep each request from hitting the database again (a sketch follows this list).
      For business logic, logging, and so on, the main work is solving the ORM "N+1 problem" and eliminating redundant database accesses (remember: every database access is really a remote call; see the second sketch after this list).
    4. In addition, for front-end optimization of web pages, you can also refer to Yahoo's "14 rules for web performance optimization."
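As a concrete illustration of the Session/Cache point in step 3, here is a minimal sketch of caching the current user's context in memory. It assumes a .NET back end; UserContext and LoadUserContextFromDb are hypothetical names, not part of any real framework:

using System;
using System.Runtime.Caching;

// Minimal per-user caching sketch. UserContext and LoadUserContextFromDb
// are illustrative names, not part of any real framework.
public class UserContext
{
    public string UserId;
    public string[] Roles;
    public string[] Permissions;
}

public static class UserContextCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    public static UserContext Get(string userId)
    {
        var key = "UserContext:" + userId;
        var ctx = Cache.Get(key) as UserContext;
        if (ctx == null)
        {
            // First request: a single database round trip loads the user,
            // roles and permissions together.
            ctx = LoadUserContextFromDb(userId);
            // Later requests within 20 minutes are served from memory.
            Cache.Set(key, ctx, DateTimeOffset.Now.AddMinutes(20));
        }
        return ctx;
    }

    private static UserContext LoadUserContextFromDb(string userId)
    {
        throw new NotImplementedException(); // one batched query in practice
    }
}

With something like this in place, initializing the user, roles and permissions costs one database round trip for the first request and a memory lookup afterwards.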
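And here is what the ORM "N+1 problem" from step 3 looks like in code, using a hypothetical Order/OrderLine model and repository interface; the shape applies to any ORM:

using System.Collections.Generic;
using System.Linq;

public class Order { public int Id; }
public class OrderLine { public int OrderId; public decimal Amount; }

public interface IOrderRepository
{
    IEnumerable<Order> GetOrders();                                     // 1 query
    IEnumerable<OrderLine> GetLines(int orderId);                       // 1 query per call
    IEnumerable<OrderLine> GetLinesForOrders(IEnumerable<int> orderIds); // 1 query
}

public class OrderService
{
    // N+1: one query for the orders, then one more query per order.
    public decimal TotalSlow(IOrderRepository repo)
    {
        decimal total = 0;
        foreach (var order in repo.GetOrders())
            foreach (var line in repo.GetLines(order.Id)) // hits the DB N times
                total += line.Amount;
        return total;
    }

    // Fix: fetch all the lines in one query (or use the ORM's eager loading).
    public decimal TotalFast(IOrderRepository repo)
    {
        var ids = repo.GetOrders().Select(o => o.Id).ToList();
        return repo.GetLinesForOrders(ids).Sum(l => l.Amount); // 2 queries total
    }
}

The slow version issues one query per order; the fast version issues two queries no matter how many orders there are.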

FAQ Two: A single SQL statement executes slowly

Some SQL statements execute very slowly once the data volume is large.

Optimization steps:

    1. Does the SQL itself have a performance problem?
    2. Are table partitions established?
      http://www.cnblogs.com/leiOOlei/archive/2012/06/08/2541306.html
      http://blog.csdn.net/hijiankang/article/details/9173877
    3. Are the fields used by the main query indexed?
    4. Is the test data valid? (Prepare test data that matches real-world scenarios as closely as possible.)
    5. Do you need to limit the range of data the user can query?
    6. Do you need to optimize the business design?
      Does the user really need a page that browses hundreds of thousands of pages of data? Such data is usually useless to the user. Should the functional module be redesigned?
    7. Use an IN statement instead of a JOIN.
    8. Do not query all fields; query only the IDs first.

For example, the following SQL takes about 8 seconds in a stress test against 10 million rows of data:

select * from (
    select t.*, ROWNUM RN from (
        select * from "T_primitivedetail" "T0"
        where "T0"."Ordergoodsdate" >= :p0 and "T0"."Ordergoodsdate" <= :p1
          and "T0"."CORPORATION" = :p2 and "T0"."Dbi_isphantom" = :p3
        order by "T0"."ID" asc
    ) T where ROWNUM <= 5000010
) where RN >= 5000000
-- parameters: 2016/5/1 0:00:00, 2016/5/31 23:59:59, "Huizhou Cool Friends Network Technology Co., Ltd.", "0"

Both the ID column and the Ordergoodsdate column are indexed, and table partitioning is established on the Ordergoodsdate column. After several tests we found that sorting by an indexed column is what made the query slow (sorted by the indexed ID column: 5 seconds the first time, about 2.3 seconds on later runs; sorted by the indexed time column: 6 seconds; without sorting: 2 seconds).

At the same time, we also tested the paging SQL itself. There are currently three common paging formats:

1. Paging by ROWID
select * from T_xiaoxi where rowid in (
    select rid from (
        select rownum rn, rid from (
            select rowid rid, cid from T_xiaoxi order by cid desc
        ) where rownum < 10000
    ) where rn > 9980
)
order by cid desc;
2. Paging by analytic function
select * from (
    select t.*, row_number() over (order by cid desc) rk from T_xiaoxi t
) where rk < 10000 and rk > 9980;
3. Paging by ROWNUM
select * from (
    select t.*, rownum rn from (
        select * from T_xiaoxi order by cid desc
    ) t where rownum < 10000
) where rn > 9980;

The results showed that the original paging format is still the most efficient. The following SQL query takes 4 seconds:

select * from (
    select t.*, row_number() over (order by ID asc) RK from T_enterprisetransaction t
) where RK <= 5000010 and RK >= 5000000

Since our paging SQL is generated automatically, we wanted to keep the format generic as long as the performance was acceptable, so we decided to keep the original format.

None of the optimization attempts above succeeded. Only then did we look at the data the tester had inserted: the dates were all on the same day!!! That is what made the partition and the index useless. After entering data that matched the real scenario, the query completed in less than 1 second.

In addition, regarding item 6: a page that pages through 500,000 pages of data should never have been designed for customers to look at in the first place. So I asked the project team to consider whether this page should be removed in exchange for a different implementation.

8th, do not check the whole field, only check ID: After testing, but also has a more obvious effect.

select ID from (
    select t.*, ROWNUM RN from (
        select ID from "T_enterprisetransaction" "T0"
        -- order by T0.ID desc
        -- order by T0.Dbi_createdtime
        -- order by T0.TradeDate
    ) T where ROWNUM <= 5000010
) where RN >= 5000000
-- 0.6 seconds

select * from "T_enterprisetransaction" "T0"
where ID in (7853679, 7853680, 7853681, 7853682, 7853683, 7853684, 7853685, 7853686, 7853687, 7853688, 7853689)
-- 0.1 seconds

In total it takes 0.7 seconds.

FAQ Three: Big Data import performance optimization

An important module of the company's product is a data import engine. It is built on the WF4 workflow engine plus a set of custom activities, and implements the import from file to database. That is: read the file -> a large amount of data-format conversion logic and business logic -> import into the database.

Because the logic is complex, we did not implement it as stored procedures in the database; instead, the business logic runs in memory against domain entities.

The optimization steps for this program:

    1. Use a performance-monitoring tool to find the hot spots where performance is actually lost, then design solutions targeted at those locations.
      This should be the very first step. When optimizing performance, developers often analyze and "optimize" based on what they take for granted, and end up spending time without touching the real bottleneck! So this step comes first: let the tools find the hot spots for us.
    2. Reduce the number of database accesses. (This is the primary optimization for anything that accesses the database, and the place where problems most often appear.)
      2.1 Solve the N+1 problem in the ORM.
      2.2 Prepare the data fully in memory, then write it to the database in one go.
      2.3 When importing large amounts of data into the database, use a bulk-import approach instead of inserting row by row (see the first sketch after this list).
    3. Use multithreading.
      The data-import process mixes IO-intensive and CPU-intensive operations that run in different stages, so using multithreading judiciously can greatly improve execution efficiency (see the second sketch after this list).
      When using multithreading, pay attention to thread safety: keep shared resources (files, rows in the database) to a minimum, and lock the shared resources you do have (file locks, in-memory locks, database transactions and their isolation levels).
      In addition, preparing commonly used data for each thread in advance can also eliminate some unnecessary IO.
    4. Optimize the core big-data loops and the code inside nested core loops. The code at these locations needs to be carefully tuned to the extreme.
      In the core loop, do not use LINQ to Objects: a single LINQ to Objects operation creates at least three lightweight objects: a delegate, an object implementing the IEnumerable interface, and an enumerator generated when the collection is traversed.
      In the core loop, minimize the number of iterations. For example, matching 10 million rows against 1 million rows with a naive nested loop executes 10 million * 1 million comparisons, so introduce appropriate algorithms to cut out the unnecessary iterations (see the third sketch after this list).
      Only the core loops need to be optimized like this; the rest of the code does not. Use LINQ to Objects wherever it helps.
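As a concrete illustration of point 2.3, here is a minimal bulk-import sketch using SqlBulkCopy from System.Data.SqlClient. The SQL examples earlier in this article are Oracle, where ODP.NET offers comparable bulk-copy and array-binding facilities; the table name below is illustrative:

using System.Data;
using System.Data.SqlClient;

public static class BulkImporter
{
    public static void Import(DataTable rows, string connectionString)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "T_ImportTarget"; // illustrative name
            bulk.BatchSize = 5000;        // rows sent per round trip
            bulk.BulkCopyTimeout = 600;   // seconds; large imports need headroom
            bulk.WriteToServer(rows);     // one bulk operation instead of N INSERTs
        }
    }
}

A batch size of 5,000 rows is a starting point to tune against your own data, not a recommendation.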
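For point 3, here is one way to separate the IO-bound read stage from the CPU-bound transform stage with a producer/consumer queue. ReadBatches, Transform and BulkInsert are hypothetical stand-ins for the real import steps:

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ImportPipeline
{
    public static void Run(string file)
    {
        // Bounded queue: the reader blocks instead of flooding memory.
        var batches = new BlockingCollection<string[]>(boundedCapacity: 16);

        // IO-bound producer: reads the file and queues raw line batches.
        var reader = Task.Run(() =>
        {
            foreach (var batch in ReadBatches(file))
                batches.Add(batch);
            batches.CompleteAdding();
        });

        // CPU-bound consumers: each worker owns its batch, so the only
        // shared resource is the thread-safe queue itself.
        Parallel.ForEach(batches.GetConsumingEnumerable(), batch =>
        {
            var rows = Transform(batch); // pure CPU work, no shared state
            BulkInsert(rows);            // each batch is one DB round trip
        });

        reader.Wait();
    }

    // Hypothetical stand-ins for the real file/convert/import steps.
    static IEnumerable<string[]> ReadBatches(string file) { yield break; }
    static object Transform(string[] batch) { return batch; }
    static void BulkInsert(object rows) { }
}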
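And for the loop optimization in point 4, a sketch of replacing an O(n*m) nested-loop match with a hash lookup; the ID lists are hypothetical:

using System.Collections.Generic;
using System.Linq;

public static class LoopOptimization
{
    // Count how many imported IDs already exist in the database extract.
    public static int CountMatches(List<int> importedIds, List<int> existingIds)
    {
        // Naive nested loop: importedIds.Count * existingIds.Count comparisons.
        // foreach (var a in importedIds)
        //     foreach (var b in existingIds)
        //         if (a == b) matches++;

        // Optimized: build a hash set once, then every lookup is O(1),
        // for importedIds.Count + existingIds.Count operations in total.
        var existing = new HashSet<int>(existingIds);
        return importedIds.Count(id => existing.Contains(id));
    }
}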

Summary

This article summarizes the common problems encountered by several of the company's project teams.

I hope it is helpful to other project teams, and I also hope to collect more optimization suggestions.
