[Repost] Optimization methods for working with big data

Source: Internet
Author: User

Out of respect for knowledge and for the author, here is the original link: http://www.thebigdata.cn/JieJueFangAn/14134.html
Found on the Big Data website and saved here so it is easy to find later.


1. Avoid testing a field for NULL in the WHERE clause, because this can cause the engine to abandon the index and perform a full table scan. For example:

    SELECT id FROM t WHERE num IS NULL

You can set a default value of 0 on num, make sure the num column contains no NULL values, and then query:

    SELECT id FROM t WHERE num = 0

2. Avoid the != or <> operators in the WHERE clause, because they can cause the engine to abandon the index and perform a full table scan. The optimizer cannot use the index to determine which rows will be matched, so it has to search every row of the table.

3. Avoid joining conditions with OR in the WHERE clause, because this can cause the engine to abandon the index and perform a full table scan. For example:

    SELECT id FROM t WHERE num = 10 OR num = 20

You can query like this instead:

    SELECT id FROM t WHERE num = 10
    UNION ALL
    SELECT id FROM t WHERE num = 20

4. Use IN and NOT IN with caution, because IN can prevent the system from using the index, leaving it to search the table data directly. For example:

    SELECT id FROM t WHERE num IN (1, 2, 3)

For consecutive values, use BETWEEN instead of IN:

    SELECT id FROM t WHERE num BETWEEN 1 AND 3

5. Avoid searching indexed character data with a pattern that does not begin with the leading characters. This also prevents the engine from using the index. See the examples below:

    SELECT * FROM T1 WHERE NAME LIKE '%L%'
    SELECT * FROM T1 WHERE SUBSTRING(NAME, 2, 1) = 'L'
    SELECT * FROM T1 WHERE NAME LIKE 'L%'

Even though the NAME field is indexed, the first two queries cannot use the index to speed up the operation, and the engine has to process every row of the table one by one. The third query can use the index.

6. If necessary, force the query optimizer to use a particular index. For example, using parameters (local variables) in the WHERE clause can also result in a full table scan.
Because SQL resolves local variables only at run time, but the optimizer cannot defer the choice of access plan to run time, the plan must be chosen at compile time. At compile time, however, the value of the variable is still unknown, so it cannot be used as an input for index selection. The following statement will perform a full table scan:

    SELECT id FROM t WHERE num = @num

You can force the query to use the index instead:

    SELECT id FROM t WITH (INDEX(index_name)) WHERE num = @num

7. Avoid performing expression operations on fields in the WHERE clause, because this causes the engine to abandon the index and perform a full table scan. For example:

    SELECT * FROM T1 WHERE F1 / 2 = 100

should read:

    SELECT * FROM T1 WHERE F1 = 100 * 2

Likewise:

    SELECT * FROM RECORD WHERE SUBSTRING(CARD_NO, 1, 4) = '5378'

should read:

    SELECT * FROM RECORD WHERE CARD_NO LIKE '5378%'

And:

    SELECT member_number, first_name, last_name FROM members WHERE DATEDIFF(yy, dateofbirth, GETDATE()) > 21

should read:

    SELECT member_number, first_name, last_name FROM members WHERE dateofbirth < DATEADD(yy, -21, GETDATE())

That is, any operation on a column results in a table scan, including database functions and calculation expressions. Whenever possible, move the operation to the right-hand side of the comparison.

8. Avoid applying functions to fields in the WHERE clause, because this causes the engine to abandon the index and perform a full table scan. For example:

    SELECT id FROM t WHERE SUBSTRING(name, 1, 3) = 'abc'                -- ids whose name starts with abc
    SELECT id FROM t WHERE DATEDIFF(day, createdate, '2005-11-30') = 0  -- ids created on '2005-11-30'

should read:

    SELECT id FROM t WHERE name LIKE 'abc%'
    SELECT id FROM t WHERE createdate >= '2005-11-30' AND createdate < '2005-12-1'

9. When using an indexed field as a condition, if the index is a composite index, the first field of the index must appear in the condition for the system to use the index; otherwise the index will not be used. The order of the fields in the condition should also match the index order as closely as possible.
10. Using EXISTS is often a good choice:

    SELECT num FROM a WHERE num IN (SELECT num FROM b)

Replace it with the following statement:

    SELECT num FROM a WHERE EXISTS (SELECT 1 FROM b WHERE num = a.num)

Similarly:

    SELECT SUM(T1.C1) FROM T1 WHERE ((SELECT COUNT(*) FROM T2 WHERE T2.C2 = T1.C2) > 0)
    SELECT SUM(T1.C1) FROM T1 WHERE EXISTS (SELECT * FROM T2 WHERE T2.C2 = T1.C2)

Both produce the same result, but the latter is clearly more efficient, because it does not generate a large number of locking table scans or index scans.

If you want to check whether a record exists in a table, do not use COUNT(*), which is inefficient and wastes server resources. Use EXISTS instead. For example:

    IF (SELECT COUNT(*) FROM table_name WHERE column_name = 'xxx') > 0

can be written as:

    IF EXISTS (SELECT * FROM table_name WHERE column_name = 'xxx')

It is often necessary to write a T-SQL statement that compares a parent result set with a child result set to find records in the parent that have no match in the child. For example:

    SELECT a.hdr_key FROM hdr_tbl a  -- "hdr_tbl a" gives hdr_tbl the alias a
    WHERE NOT EXISTS (SELECT * FROM dtl_tbl b WHERE a.hdr_key = b.hdr_key)

    SELECT a.hdr_key FROM hdr_tbl a
    LEFT JOIN dtl_tbl b ON a.hdr_key = b.hdr_key WHERE b.hdr_key IS NULL

    SELECT hdr_key FROM hdr_tbl
    WHERE hdr_key NOT IN (SELECT hdr_key FROM dtl_tbl)

All three forms return the same correct result (provided dtl_tbl.hdr_key contains no NULL values), but their efficiency decreases in that order.

11. Avoid frequently creating and dropping temporary tables, to reduce the consumption of system table resources.

12. Temporary tables are not off limits. Used appropriately, they can make certain routines more efficient, for example when you need to reference a data set from a large table or a commonly used table repeatedly. For one-off operations, however, it is better to use a derived table.
13. When you create a new temporary table and insert a large amount of data at once, you can use SELECT INTO instead of CREATE TABLE to avoid generating a large amount of log and to increase speed. If the amount of data is small, CREATE TABLE first and then INSERT, to ease the load on the system tables.

14. If temporary tables are used, be sure to delete all of them explicitly at the end of the stored procedure: TRUNCATE TABLE first, then DROP TABLE. This avoids locking the system tables for a long time.

15. SET NOCOUNT ON at the beginning of all stored procedures and triggers, and SET NOCOUNT OFF at the end. This way no DONE_IN_PROC message needs to be sent to the client after each statement of the stored procedure or trigger.

16. Avoid large transactions, to improve system concurrency.

17. Avoid returning large amounts of data to the client. If the amount of data is too large, consider whether the requirement is reasonable, for example whether the full object really has to be returned or whether some nonessential fields can be dropped.

18. Avoid using incompatible data types. For example, float and int, char and varchar, and binary and varbinary are incompatible. Incompatible data types may keep the optimizer from performing optimizations it could otherwise have performed. Example:

    SELECT name FROM employee WHERE salary > 60000

In this statement, if the salary field is of type money, the optimizer has difficulty optimizing it because 60000 is an integer. Convert the integer to the money type when programming, rather than leaving the conversion to run time.

19. Make full use of join conditions. In some cases there may be more than one join condition between two tables; writing all of them in the WHERE clause can greatly improve query speed.
Example:

    SELECT SUM(a.amount) FROM account a, card b WHERE a.card_no = b.card_no
    SELECT SUM(a.amount) FROM account a, card b WHERE a.card_no = b.card_no AND a.account_no = b.account_no

The second statement will be much faster than the first.

20. Use views to speed up queries. Sorting a subset of a table and creating a view on it can sometimes speed up queries. It helps avoid repeated sort operations and in other ways simplifies the optimizer's work.

21. Where DISTINCT will do, do not use GROUP BY.

    SELECT OrderID FROM Details WHERE UnitPrice > 10 GROUP BY OrderID

can be changed to:

    SELECT DISTINCT OrderID FROM Details WHERE UnitPrice > 10

22. Where UNION ALL will do, do not use UNION. UNION ALL does not perform the SELECT DISTINCT step, which saves a lot of unnecessary resources.

23. Try not to use the SELECT INTO statement. SELECT INTO locks the table and prevents other users from accessing it.
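Point 7 above (keep expressions off the indexed column) can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN rather than SQL Server, so the plan text is SQLite's and other engines will report plans differently. The table t and column f1 follow the article's examples; the index name idx_f1 and the helper query_plan are hypothetical. The predicate f1 * 2 = 400 is used instead of f1 / 2 = 100 because integer division would also match f1 = 201, so the two forms would not return the same rows.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Hypothetical demo table: id is the primary key, f1 carries an index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, f1 INTEGER)")
conn.executemany("INSERT INTO t (f1) VALUES (?)", [(i,) for i in range(1000)])
conn.execute("CREATE INDEX idx_f1 ON t (f1)")

# Expression applied to the indexed column: the index cannot be searched.
expr_on_column = "SELECT id FROM t WHERE f1 * 2 = 400"
# Same condition with the arithmetic moved to the constant side.
bare_column = "SELECT id FROM t WHERE f1 = 400 / 2"

plan_expr = query_plan(conn, expr_on_column)
plan_bare = query_plan(conn, bare_column)
rows_expr = conn.execute(expr_on_column).fetchall()
rows_bare = conn.execute(bare_column).fetchall()

print("expression on column:", plan_expr)
print("bare column:         ", plan_bare)
print("same rows:", rows_expr == rows_bare)
```

In SQLite the first plan is reported as a SCAN (every entry is visited) and the second as a SEARCH using idx_f1, while both queries return the same row. On SQL Server the equivalent check would be to compare the execution plans of the two statements.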
