Performance Optimization for Back-end Systems (III): SQL Optimization


Yesterday I showed you how to find bad code and how to elegantly implement a monitoring program for an application. Of course, once we discover the bad code we still have to find a way to get rid of it, and it may be quite stubborn. Today let me talk about a very important part of performance optimization: SQL optimization.

What I'm going to talk about today is not how to write well-performing SQL; DBAs are far more professional at that than I am. In our company, any SQL the DBAs can optimize themselves is handled internally, and what gets fed back to us means they may already be at their wits' end. That's when it's our turn to step up.

Insert and update SQL is generally not too slow, so I have rolled the problems you might hit into one example scenario: lock a record, insert 100,000 Pojo records, then update the locked record's status to completed, on an Oracle database. Each record needs to fetch a sequence value to use as its primary key, and each record also needs to fetch another sequence to generate a no field (not the same as the ID), like this:

PojoOrder order = pojoOrderDao.lock(pojoOrderId);
for (Pojo p : pojoList) {
    Long sequence = pojoDao.genSequence();
    p.setId(sequence);
    Long noSequence = pojoDao.genNoSequence();
    p.setNo(noSequence);
    pojoDao.save(p);
}
order.setStatus(Status.FINISH);
pojoOrderDao.save(order);

Looping over 100,000 records and inserting them one at a time is very slow at peak hours, and because the pessimistic lock in the same transaction keeps that one record locked for so long, other sessions pile up in lock waits, database connections cannot be released, the database load soars, and it can even bring the database down.

These few lines of code may be perfectly fine as far as the business logic goes, but from a performance standpoint they are a problem, so we made some modifications.

First, stop going back to the database for every sequence value; write the sequences into the INSERT statement itself (you cannot do this with Hibernate's save(); use native SQL). Something like: insert into pojo (id, no) values (seq_pojo.nextval, seq_pojo_no.nextval).

Second, change the record-by-record commits into batches, committing once every 1,000 records, as sketched below.
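To make these first two changes concrete, here is a minimal JDBC sketch (not the author's actual code): the sequence values are generated inside the INSERT itself, and the rows are sent and committed in batches of 1,000. The table, column and sequence names follow the example above, and the single business column (name) is an assumption for illustration.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class PojoBatchInserter {

    private static final int BATCH_SIZE = 1000;

    // Insert all rows, letting Oracle fill id/no from the sequences and
    // committing after every BATCH_SIZE rows instead of after every row.
    public void insertAll(Connection conn, List<String> names) throws Exception {
        conn.setAutoCommit(false);
        String sql = "insert into pojo (id, no, name) "
                   + "values (seq_pojo.nextval, seq_pojo_no.nextval, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int count = 0;
            for (String name : names) {
                ps.setString(1, name);     // only the business column is bound
                ps.addBatch();
                if (++count % BATCH_SIZE == 0) {
                    ps.executeBatch();     // one round trip for 1,000 rows
                    conn.commit();         // commit the batch
                }
            }
            ps.executeBatch();             // flush and commit the remaining rows
            conn.commit();
        }
    }
}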

Third, split the three-step operation (lock the PojoOrder record, bulk insert, update to completed) into three transactions and four steps: lock, update the status to processing, commit; bulk insert (on an exception, update the status back to unprocessed), commit; after the inserts complete, update the status to completed, commit. Controlled in three transactions like this, the operation can no longer guarantee atomicity the way the earlier code did; we sacrifice that consistency to improve performance and reduce risk, which is well worth it for a web application with high concurrency. In the years since, the lack of strong consistency has never caused a failure, and the probability of the data actually ending up inconsistent is very, very low. A sketch of the split follows.
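Here is a minimal sketch of the three-transaction split, again in plain JDBC rather than the author's Hibernate code; the pojo_order table, its status values, and the helper method are assumptions for illustration, and the bulk insert reuses the PojoBatchInserter sketched above.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class OrderProcessor {

    // Split "lock -> bulk insert -> mark finished" into three commits so the
    // row lock on pojo_order is never held across the slow insert.
    public void process(Connection conn, long orderId, List<String> names) throws Exception {
        conn.setAutoCommit(false);

        // Transaction 1: lock the order row, mark it PROCESSING, commit immediately.
        execute(conn, "select id from pojo_order where id = ? for update", orderId);
        execute(conn, "update pojo_order set status = 'PROCESSING' where id = ?", orderId);
        conn.commit();

        try {
            // Transaction 2: the batched insert (it commits every 1,000 rows internally).
            new PojoBatchInserter().insertAll(conn, names);
        } catch (Exception e) {
            // Compensating update: flag the order for a retry, then rethrow.
            execute(conn, "update pojo_order set status = 'UNPROCESSED' where id = ?", orderId);
            conn.commit();
            throw e;
        }

        // Transaction 3: mark the order as completed.
        execute(conn, "update pojo_order set status = 'FINISHED' where id = ?", orderId);
        conn.commit();
    }

    private void execute(Connection conn, String sql, long id) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, id);
            ps.execute();
        }
    }
}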

After these changes, insert performance improved dramatically, and the frequent lock waits disappeared.

Inserts and updates give us relatively few problems; the real headaches are the complex queries. To meet business requirements we have to write a lot of query statements to fetch the underlying data, and when one of them performs badly we need to pick an optimization suited to the situation. Here are a few of the more common scenarios.

1. Change joins into single-table queries. A database often has dictionary tables that describe what a value means, and sometimes a business table references several dictionary tables; queries then filter on dictionary attributes rather than dictionary IDs. For example:

select m.* from m inner join c on m.c_id = c.id inner join b on m.b_id = b.id where c.cw = ? and b.px = ? and m.status = ? and m.create_time > sysdate - 10

B and C are both small tables with little data, c.cw maps one-to-one to c.id, and b.px maps one-to-one to b.id. The joins in this kind of query are completely unnecessary; it can be changed directly into

select m.* from m where m.c_id = ? and m.b_id = ? and m.status = ? and m.create_time > sysdate - 10;

The mapping between c.cw and c.id, and between b.px and b.id, can be handled with a simple cache: load the data of tables B and C into the cache, and take the IDs straight from the cache when you need them.
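A minimal sketch of such a cache, assuming the two dictionary tables are small enough to hold in memory; the loading queries and class layout are illustrative, and only the table and column names (c.cw, b.px) come from the example above.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DictionaryCache {

    private final Map<String, Long> cwToCId = new ConcurrentHashMap<>();
    private final Map<String, Long> pxToBId = new ConcurrentHashMap<>();

    // Load both dictionary tables once (or on a refresh schedule).
    public void load(Connection conn) throws Exception {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("select id, cw from c")) {
            while (rs.next()) {
                cwToCId.put(rs.getString("cw"), rs.getLong("id"));
            }
        }
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("select id, px from b")) {
            while (rs.next()) {
                pxToBId.put(rs.getString("px"), rs.getLong("id"));
            }
        }
    }

    public Long cIdFor(String cw) { return cwToCId.get(cw); }
    public Long bIdFor(String px) { return pxToBId.get(px); }
}

With the IDs resolved from these maps, only the single-table query on m above has to hit the database.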

2. Reduce the execution frequency of SQL that runs too often. Everyone should be able to think of this scenario, and handling it is simple in itself: use a cache. Many of you will have used caches before, but using them well takes care, and the expiry strategy deserves the most thought. Most of the expiry strategies I use are either active expiry (delete the cache entry when the data is updated) or timed expiry (set a cache time and let the entry expire automatically when it is up). Used well, a cache really can take a great deal of pressure off the database; used badly, it is likely to cause dirty reads of stale data.
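As a small illustration of those two expiry strategies, here is a sketch assuming Guava's cache (any cache with a TTL and an explicit invalidate works the same way); the Account type and the database calls are made up for the example.

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

public class AccountCache {

    // Timed expiry: entries silently disappear 60 seconds after they are written.
    private final Cache<Long, Account> cache = CacheBuilder.newBuilder()
            .maximumSize(100_000)
            .expireAfterWrite(60, TimeUnit.SECONDS)
            .build();

    public Account get(Long id) {
        Account a = cache.getIfPresent(id);
        if (a == null) {
            a = loadFromDatabase(id);   // fall back to the database on a miss
            cache.put(id, a);
        }
        return a;
    }

    // Active expiry: drop the cached copy whenever the row is updated, so the
    // next read goes back to the database and cannot see stale data.
    public void update(Account a) {
        saveToDatabase(a);
        cache.invalidate(a.getId());
    }

    private Account loadFromDatabase(Long id) { /* select ... where id = ? */ return new Account(id); }

    private void saveToDatabase(Account a) { /* update ... where id = ? */ }

    static class Account {
        private final Long id;
        Account(Long id) { this.id = id; }
        Long getId() { return id; }
    }
}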

Caching is a good way to bring execution frequency down, but what about the key business paths that genuinely have to read real-time data very frequently? Take a very simple SQL statement like this one:

select * from a where a.id = ?

This SQL has no performance problem in itself and executes very quickly, but the DBA was adamant that its execution frequency had to come down. We monitored it and found it was being called 20,000 times per minute, an extremely high rate. Examining the code, we found many of these simple queries sitting inside loops. We changed them to batch queries, added caching wherever we could, and brought the execution frequency down to 2,000-5,000 times per minute, as in the sketch below.
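A minimal sketch of the batching change, assuming plain JDBC: instead of running the single-row query once per element inside a loop, collect the IDs and fetch them in one IN-list query. (Oracle caps an IN list at 1,000 entries, so very large ID lists still need to be chunked.)

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BatchLookup {

    // One round trip for the whole batch instead of one query per ID.
    public Map<Long, String> findAccountsByIds(Connection conn, List<Long> ids) throws Exception {
        String placeholders = ids.stream().map(id -> "?").collect(Collectors.joining(", "));
        String sql = "select id, account from a where id in (" + placeholders + ")";

        Map<Long, String> result = new HashMap<>();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < ids.size(); i++) {
                ps.setLong(i + 1, ids.get(i));
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    result.put(rs.getLong("id"), rs.getString("account"));
                }
            }
        }
        return result;
    }
}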

The DBA also gave us a way of reducing the parse load on a single SQL statement by adding hints in the form of comments. There is a more detailed description here: http://www.dbafree.net/?p=778

select /**1*/ * from a where a.id = ?
select /**2*/ * from a where a.id = ?
select /**3*/ * from a where a.id = ?
select /**4*/ * from a where a.id = ?
select /**5*/ * from a where a.id = ?

3. Add the partition key condition for partitioned tables. This is easy to understand: our Oracle tables are partitioned, and once the partition key condition is added, Oracle can quickly locate the partitions that hold the data and avoid a full table scan. For example: select a.id, a.account from a where a.register_time > sysdate - 10; Suppose table A's partition key is create_time. Without a create_time condition, this SQL scans every partition looking for rows that satisfy register_time > sysdate - 10; with the partition key condition added, Oracle scans only the matching partitions and skips the rest.
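For illustration, assuming the business guarantees that create_time is never earlier than register_time (so the extra predicate cannot drop any rows), the same query with the partition key added would be:

select a.id, a.account from a where a.register_time > sysdate - 10 and a.create_time > sysdate - 10;

With the create_time predicate in place, Oracle prunes the scan to the partitions covering the last 10 days instead of touching them all.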

4. Trim unneeded fields out of the select list. In real applications, for all sorts of reasons, we query a record with SELECT * or list every field and return them all. This improves coding efficiency and reusability to some extent, but from a performance standpoint we don't encourage it: fetching 30 columns in order to use two of them is simply wasteful. We had a table with a column that stored a very large string, and one day we noticed that querying that table had become very slow. The DBA checked the indexes and found nothing wrong; then we discovered that the huge string value was being fetched as well, and because of network transfer and the like, the result came back slowly. Most annoying of all, the business never even used that big string. We removed that field and a few other unneeded ones from the select list, and everyone was happy.

5. Deadlocks. Deadlocks are the mortal enemy of system performance, and of course we want to avoid them completely. Back before our business volume was so high, our system hit a deadlock. The specific business scenario was:

Method M first locks record A in table A with a pessimistic lock, then tries to fetch record B in table B; record B is already locked by another thread running method N, and method N in turn tries to lock record A in table A, which method M is holding. The two threads each wait for the other to release its lock, and in this scenario neither lock can ever be released.

M method thread:
select * from A where id = 1 for update  -- locks record A
select * from B where aid = 1 for update -- waits for the N thread to release its lock on record B

N method thread:
select * from B where aid = 1 for update -- locks record B
select * from A where id = 1 for update  -- waits for the M thread to release its lock on record A


There are many deadlock scenarios (two different threads executing the same method can also deadlock), but the principle of the solution is the same for all of them: remove any locks that aren't needed; where locking is unavoidable, adjust the locking order so it is consistent, or switch to an optimistic lock plus a retry mechanism. The scenario above can be fixed by making both methods lock in the same order (note: check against your own business whether this suits you):

M method thread:
select * from B where aid = 1 for update
select * from A where id = 1 for update

N method thread:
select * from B where aid = 1 for update
select * from A where id = 1 for update

As for two different threads deadlocking in the same method, that happens too, usually in a loop of update operations when the method is called concurrently and the elements of the collection arrive in different orders:

for (Pojo p : pojoList) {
    pojoDao.update(p);
}
Imagine the first thread's list has record A first and record N last, while another thread updating at the same time has record N first and record A last.

Of course, callers will often tell you that concurrent calls are impossible, but from the service side's point of view we need to rule out deadlocks completely. For this situation, sort the elements of the collection by a fixed rule before updating them, as sketched below.
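A minimal sketch of that rule, reusing the article's pseudocode names (Pojo, pojoDao), which are assumptions rather than a concrete API: sort by a unique key before the loop so every caller acquires its row locks in the same order.

import java.util.Comparator;
import java.util.List;

public class SafeBatchUpdater {

    private final PojoDao pojoDao;

    public SafeBatchUpdater(PojoDao pojoDao) {
        this.pojoDao = pojoDao;
    }

    public void updateAll(List<Pojo> pojoList) {
        // Sort by a stable, unique key so concurrent callers lock rows in the same order.
        pojoList.sort(Comparator.comparing(Pojo::getId));
        for (Pojo p : pojoList) {
            pojoDao.update(p);   // row locks are now always acquired in ascending ID order
        }
    }
}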

In short, in day-to-day work we run into all kinds of performance problems caused by SQL. The room for optimizing the SQL itself is small and the techniques are fairly fixed, and the results vary: some are very good, most are only so-so. If a particular SQL statement has no more room for optimization and the DBA keeps flagging it, consider whether a different way of handling the business could avoid that SQL altogether.

When we implement a requirement, if we thought a bit more about performance we would not reach for the database for everything. In reality many developers' thinking is quite fixed; they lack the instinct to make full use of caches and memory, assume everything must be stored in the database, and trust only what comes back from the database. Perhaps we can achieve reliability comparable to the database by other means.
