How do you optimize MySQL tables with tens of millions of rows?

Source: Internet
Author: User


1. Data capacity: the total number of records expected within 1-3 years and the number of bytes in each record;
2. Data items: whether there are large fields and whether the values of those fields are updated frequently;
3. Query SQL conditions: which columns frequently appear in the WHERE, GROUP BY, and ORDER BY clauses;
4. Update SQL conditions: which columns, and how many, frequently appear in the WHERE clause of UPDATE or DELETE statements;
5. The ratio of SQL statement types, for example SELECT : UPDATE + DELETE : INSERT = ?
6. What is the average daily execution volume of SQL statements against the large table and its associated tables?
7. Is the data in the table used by an update-oriented business or a query-oriented business?
8. What physical servers and what database server architecture are used?
9. What level of concurrency is expected?
10. Should the storage engine be InnoDB or MyISAM?

Once you have a general answer to the 10 questions above, it should be clear how to design such a large table.
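Question 1 (data capacity) comes down to simple arithmetic. A minimal sketch follows, assuming a hypothetical workload of 50,000 new rows per day at about 200 bytes per row; the function name and both figures are illustrative and do not come from the original article.

```python
def estimate_table_size(rows_per_day, avg_row_bytes, years):
    """Return (total_rows, total_gib) accumulated over the retention period."""
    total_rows = rows_per_day * 365 * years
    total_gib = total_rows * avg_row_bytes / 1024 ** 3
    return total_rows, total_gib


# Hypothetical workload: 50,000 rows/day, ~200 bytes/row, kept for 3 years.
rows, gib = estimate_table_size(rows_per_day=50_000, avg_row_bytes=200, years=3)
print(f"{rows:,} rows, about {gib:.1f} GiB before indexes")
```

The estimate covers row data only; indexes and storage engine overhead add to it, so treat the result as a lower bound.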

If optimization means tuning a table that already exists and whose structure cannot be changed, we recommend using the InnoDB engine and giving it more memory (for example, a larger innodb_buffer_pool_size) to reduce disk I/O load, because I/O is often the bottleneck of a database server.

In addition, before turning to the index structure to solve performance problems, we recommend first modifying the SQL statements themselves to make them faster, and relying on the index structure alone only as a last resort. This assumes, of course, that the indexes are already well designed. If the workload is read-oriented, you can enable query_cache (note that the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0) and adjust some parameter values: sort_buffer_size, read_buffer_size, read_rnd_buffer_size, join_buffer_size.

Other suggestions:

1. Use indexes and avoid full table scans. Lookups by primary key stay fast even with hundreds of millions of rows;
2. Denormalize, trading space for time, to avoid joins. Some join operations can be implemented in application code instead of in the database;
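The first suggestion can be checked by asking the database for its query plan. The sketch below uses Python's built-in sqlite3 module as a stand-in so that it runs without a server; the `orders` table and the index name are hypothetical. In MySQL the equivalent check would be `EXPLAIN SELECT ...`, whose output format differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (id, customer_id, amount) VALUES (?, ?, ?)",
    [(i, i % 100, i * 1.5) for i in range(1, 1001)],
)


def plan(sql):
    """Return the plan detail strings SQLite reports for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Primary-key lookup: the engine seeks directly to the row.
pk_plan = plan("SELECT amount FROM orders WHERE id = 500")

# Filter on an unindexed column: the engine has to scan the whole table.
scan_plan = plan("SELECT amount FROM orders WHERE customer_id = 42")

# After adding an index on the filtered column, the scan becomes a search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
indexed_plan = plan("SELECT amount FROM orders WHERE customer_id = 42")

for name, detail in [("pk", pk_plan), ("scan", scan_plan), ("indexed", indexed_plan)]:
    print(name, detail)
```

The exact wording of the plan text varies between SQLite versions, so the sketch prints it rather than asserting a fixed string.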


1. Return only the required data.

Returning data to the client involves, at a minimum, extracting the data from the database, transmitting it over the network, receiving it at the client, and processing it at the client. Returning unnecessary data creates wasted work for the server, the network, and the client, and the harm is obvious. To avoid this, note the following:

A. Horizontally, do not write SELECT *; select only the columns you need.

B. Vertically, write the WHERE clause reasonably. Do not write SQL statements without WHERE.

C. Pay attention to the WHERE clause that follows SELECT INTO. Because SELECT INTO inserts data into a temporary table, the process locks some system tables. If the WHERE clause returns too much data or runs too slowly, the system tables stay locked for a long time and other processes are blocked.

D. For aggregate queries, you can use the HAVING clause to further limit the returned rows.
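Points A, B, and D can be sketched together as follows. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in database; the extra `employee` columns and the sample rows are hypothetical, and the SQL itself has the same shape in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id TEXT PRIMARY KEY, fname TEXT, lname TEXT, dept TEXT, notes TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
    [
        ("e1", "Ann", "Lee", "sales", "long free-text field"),
        ("e2", "Bob", "Wu", "sales", "long free-text field"),
        ("e3", "Cy", "Diaz", "ops", "long free-text field"),
    ],
)

# A + B: name only the needed columns and restrict the rows with WHERE,
# instead of SELECT * with no condition.
sales_names = conn.execute(
    "SELECT fname, lname FROM employee WHERE dept = ? ORDER BY emp_id",
    ("sales",),
).fetchall()

# D: HAVING limits which aggregated groups are returned to the client.
big_depts = conn.execute(
    "SELECT dept, COUNT(*) AS headcount FROM employee "
    "GROUP BY dept HAVING COUNT(*) >= 2 ORDER BY dept"
).fetchall()

print(sales_names)
print(big_depts)
```

Here the large `notes` column never leaves the database, and the single-person department is filtered out on the server rather than in client code.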

  

2. Do as little repetitive work as possible

Like the previous point, this is about minimizing wasted work, but here the focus is on the client program. Note the following:

A. Avoid executing the same statement multiple times, especially repeated queries for basic (reference) data.

B. Reduce the number of data conversions. This may partly be a design issue, but programmers can also reduce conversions in their own code.

C. Eliminate unnecessary subqueries and joined tables. Subqueries are generally interpreted as outer joins in the execution plan, and redundant joined tables add extra cost.

D. Merge multiple updates to the same table that use the same condition. For example:

UPDATE employee SET fname = 'haiwer' WHERE emp_id = 'vpa30890f'

UPDATE employee SET lname = 'yang' WHERE emp_id = 'vpa30890f'

These two statements should be merged into the following single statement:

UPDATE employee SET fname = 'haiwer', lname = 'yang'
WHERE emp_id = 'vpa30890f'

E. Do not split an UPDATE operation into a DELETE operation plus an INSERT operation. Although the result is the same, the performance difference is large.

F. Do not write meaningless queries, such as: SELECT * FROM employee WHERE 1 = 2

  

3. Pay attention to transactions and locks

Transactions are an important tool in database applications. They have four properties: atomicity, consistency, isolation, and durability. Many operations need transactions to guarantee data correctness. When using transactions, we need to avoid deadlocks and minimize blocking. Pay special attention to the following aspects:

A. Keep each transaction as small as possible, and split any transaction that can be split.

B. Do not include user interaction inside a transaction. While the application waits for the interaction, the transaction stays open and many resources may remain locked.

C. Access objects in the same order during transaction operations.

D. Improve the efficiency of each statement in the transaction. Using indexes and other methods to improve the efficiency of each statement can effectively reduce the execution time of the entire transaction.

E. Try not to specify the lock type or the index yourself. SQL Server allows us to specify the lock type and index used by a statement. In general, however, the lock type and index chosen by the SQL Server optimizer are optimal for the current data volume and query conditions. What we specify may be best only for the current situation, and the data volume and data distribution will change in the future.

F. You can use a lower isolation level for queries, especially report queries, where you can select the lowest isolation level (read uncommitted).
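Points A and C can be illustrated with a small transfer routine. This is a minimal sketch under stated assumptions: the `account` table and the `transfer` function are hypothetical, and Python's built-in sqlite3 module stands in for the real server. SQLite locks the whole database rather than individual rows, so the consistent ordering here only demonstrates the habit that avoids deadlocks on row-locking engines such as InnoDB or SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst in one short transaction."""
    if src == dst:
        raise ValueError("source and destination must differ")
    deltas = {src: -amount, dst: amount}
    # `with conn` commits on success and rolls back if an exception is raised.
    with conn:
        # Point C: always touch rows in ascending id order, so two concurrent
        # transfers between the same accounts lock them in the same sequence.
        for account_id in sorted(deltas):
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (deltas[account_id], account_id),
            )
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")


transfer(conn, src=2, dst=1, amount=30)
print(dict(conn.execute("SELECT id, balance FROM account")))
```

The transaction contains only the two updates and one check, with no user interaction or unrelated work inside it, which keeps the time that locks are held short.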

 
