Some personal experiences on teradata database Optimization

Source: Internet
Author: User

1. When creating a table, use create Multiset table instead of create set table, especially for large tables. Because set needs to remove duplicate records, the entire table will be searched when there is a large amount of data, high overhead
2. Pay attention to writing Code To avoid the huge cross-product generated by the wrong code.
3. When associating two tables, try to associate them with Pi, so that the entire table is not retrieved. Otherwise, the overhead will be high.
4. Note that when two tables are joined, except that the join conditions must be written after the on clause, other conditions should be written after the WHERE clause as much as possible, because different result sets will be generated after the on clause and where clause are written.
5. When writing the where condition, try to filter out most of the records first and then connect the two tables. When there are many conditions in the WHERE clause, the system will filter the records from right to left in order. Therefore, try to write the conditions that can filter most records to the rightmost (except for association conditions)
6. Try to use exists or not exists instead of in or not in.
7. When selecting distinct and group by, consider: if the result set of distinct occupies most of the original table, select distinct; otherwise, use group, generally, use group by as much as possible.
8. When writing the SELECT statement, try to write the specific field as much as possible, and avoid writing the "select *... ", Because of this writing, the system will convert the" * "number internally, which will increase the system overhead.
9. Assume that the field in table t is of the int type. When writing a condition, try to write the same data type as the original table type after the WHERE clause to avoid internal conversion and increase the overhead. For example:
It is not recommended to write as follows:
"Select... From t where a = '000000 '",
We recommend that you write this statement as follows:
"Select... From t where a = 123"
10. Avoid using nested select statements as little as possible, especially for large tables. Even if you want to use it, try to add filtering conditions in the WHERE clause to minimize the number of records in the result set obtained by the subquery. This will reduce the system overhead consumed by the query and improve the submission efficiency.
11. If a table is retrieved repeatedly and the search conditions are generally the same, we recommend that you create a temporary intermediate table for this search to improve efficiency and try to make the query based on this intermediate table, to reduce repeated query operations on the original table and improve efficiency (the larger the table data volume, the more recommended)
12. Try to use "> =" or "<=" instead of ">" or "<", because the "=" number will help the database accurately locate the record to be retrieved, instead of consuming effort to find the next value greater or smaller than a value. For example, the T table has an M field of the int type, which has three values: 2, 3, and 5,
Not recommended Syntax:
"Select... From t where M> 3 ",
The recommended statement is as follows:
"Select... From t where M> = 5"
13. When union or union all is selected, If You Want To sort or sort the set of the two results, select Union. Otherwise, try to Union all, because Union all does not sort or repeat the two result sets in sequence, it is very simple to directly union the results together without consuming too much system resources. In general, Union all is recommended.
14. In order to improve query efficiency, we should appropriately consider introducing redundancy and writing frequently critical fields into the table to avoid resource loss due to repeated association with a table. In addition to low efficiency
15. Try to use Union instead of or, for example:
It is not recommended to write as follows:
"Select... From t where a =... Or B =... "
We recommend that you write this statement as follows:
"Select... From t where a =... Union select... From t where B =... "
16. When using like, try to like the index field and write it as follows: "like 'abc % '" or "like' % abc'", instead of writing it as follows: "Like '% ABC % '". And try to write like on the left of where. Of course, many times I have to write "%" wildcard for both ends, depending on the specific situation.
17. Try not to use is null or is not null as the condition, because null values do not have indexes and can only be scanned in the whole table.
18. When using update for millions of tables, if the data record of update is greater than 10% of the total number of records, other methods should be considered, such as using Delete + insert
19. When two tables are joined, the field names of the association conditions should be as consistent as possible, and the columns located on the PI should be used whenever possible.
20. When creating a table, if you can determine the width of a string field, try to use char instead of varchar.
21. When querying or joining multiple tables, try to exclude or filter the null field first, or process the null field in the original table before querying or joining.
22. In teradata, data is stored in different ampersands Based on PI values. Therefore, we should try to make the data records in each ampersands evenly distributed, that is, when defining pi, uniqueness or low Repeatability

The above is my personal point of view and summary, which will inevitably lead to omissions. I hope you can correct them more!

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.