Optimization query of Big Data scale

Source: Internet
Author: User

1: Index, the first thing we think of is to create an index, creating an index can multiply the efficiency of the query, save time. But if the amount of data is too large, this time simply to create an index is useless, we know that if the statistical query especially in the large data volume, take 1000W data, if you use the Count function, at least 50-100 seconds or more, of course, if your server configuration is high enough, Processing is fast enough, maybe a lot less, but it will take more than 10 seconds.
Simply building an index is useless. We can add a property to the index when we create the index, compress, this property can make a good collation of the index created, so that the query speed will increase 5-10 times, or higher. However, the only disadvantage is that the compressed index can only be created manually, for those keys can not be compressed, because the key (primary key) is an automatically created index, compress required properties, the general default is not to create. So when creating a compressed index, you can find other key fields to compress, such as the serial number in the Work Order table
2: Use as few functions as possible, such as is null;is not null,in; No in such a matching function, you can use the symbol program to operate
3: As little as possible to use subqueries, if you write a class, inside the effect of imitating sub-query, you will find that, simply, we can use a federated query, or an outer join query, so the speed is much faster than the subquery.
4: When using the index, note the following:
In the WHERE clause there is "! = "Will invalidate the index
Select account_name from Test where amount! = 0 (not used)
Select account_name from test where amount > 0 (used)

Add handler function to field in where condition will not use index of this column
SELECT * from emp where TO_CHAR (hire_date, ' yyyymmdd ') = ' 20080411 ' (not used)
SELECT * from emp where hire_date = To_char (' 20080411 ', ' YYYYMMDD ') (used)

Avoid using is null and is not NULL on indexed columns
SELECT * from EMP where dept_code are NOT null (unused)
SELECT * from emp where Dept_code > 0 (used)

Use of the wildcard character%
SELECT * from emp where name like '%A ' (do not use index)
SELECT * from emp where name like ' A% ' (using index)

Optimization query of Big Data scale

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.