20+ Best Practices for MySQL

Database operations are often the main bottleneck in today's web applications. It is not only DBAs (database administrators) who have to worry about performance; programmers also need to structure tables properly, write optimized queries, and write better code. In this article, I list some MySQL optimization techniques for programmers.
Before we begin, note that you can find many MySQL scripts and utilities on Envato Market.

1. Optimize your queries for the query cache

Most MySQL servers have query caching enabled. It is one of the most effective ways to improve performance, and it is handled quietly by the database engine. When the same query is executed multiple times, the result is fetched from the cache, which is fast. (The query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so this tip applies to older servers.)

The main problem is that it is so easy and so hidden from the programmer that most of us tend to ignore it. Some things we do can actually prevent the query cache from doing its job.

// query cache does NOT work
$r = mysql_query("SELECT username FROM user WHERE signup_date >= CURDATE()");

// query cache works!
$today = date("Y-m-d");
$r = mysql_query("SELECT username FROM user WHERE signup_date >= '$today'");

The query cache does not work for the first query because it uses the CURDATE() function. This applies to all non-deterministic functions such as NOW() and RAND(): because the return value of the function can change, MySQL disables query caching for that query. All we need to do is add an extra line of PHP before the query to prevent this from happening.

2. EXPLAIN your SELECT queries

Using the EXPLAIN keyword helps you understand how MySQL runs your query. This can help you spot bottlenecks and other problems with your query or table structures.

The results of an EXPLAIN query show which indexes are being used, how the table is being scanned and sorted, and so on.

Take a SELECT query (preferably a complex one, with joins) and add the keyword EXPLAIN in front of it. You can run it directly against the database, and the results are shown as a table. For example, suppose I forgot to add an index to a column used in a join, so the query ends up scanning 7883 rows.

After adding the missing index, it scans only 9 and 16 rows from the two tables instead of 7883. As a rule of thumb, multiply all the numbers in the "rows" column; your query performance will be roughly proportional to the resulting number.
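
The rule of thumb above can be sketched in a few lines of PHP. The helper name estimateRowsExamined() is hypothetical (it is not part of MySQL or PHP), and the two sample plans are made-up numbers in the spirit of the example. It assumes you already fetched the EXPLAIN output as an array of associative rows, for instance with $pdo->query('EXPLAIN SELECT ...')->fetchAll(PDO::FETCH_ASSOC).

```php
<?php
/**
 * Multiply the "rows" estimates of an EXPLAIN result set.
 * $explainRows is assumed to be an array of associative arrays,
 * one per line of the EXPLAIN output. Lines without an estimate
 * ("rows" missing or NULL) are skipped.
 */
function estimateRowsExamined(array $explainRows): int
{
    $product = 1;
    foreach ($explainRows as $row) {
        if (isset($row['rows'])) {
            $product *= (int) $row['rows'];
        }
    }
    return $product;
}

// Hypothetical plans for a two-table join, before and after adding an index.
$before = [['table' => 'a', 'rows' => 9], ['table' => 'b', 'rows' => 7883]];
$after  = [['table' => 'a', 'rows' => 9], ['table' => 'b', 'rows' => 16]];

echo estimateRowsExamined($before), "\n"; // 70947
echo estimateRowsExamined($after), "\n";  // 144
```

The absolute number means little on its own; it is mainly useful for comparing two versions of the same query.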

3. Use LIMIT 1 when getting a unique row

Sometimes when you query a table, you already know you are looking for just one row. You might be fetching a unique record, or you might just be checking whether any record exists that satisfies your WHERE clause.

In such cases, adding LIMIT 1 to your query can improve performance. The database engine stops scanning as soon as it finds the first matching record instead of going through the whole table or index.

// do I have any users from Alabama?

// what NOT to do:
$r = mysql_query("SELECT * FROM user WHERE state = 'Alabama'");
if (mysql_num_rows($r) > 0) {
    // ...
}

// much better:
$r = mysql_query("SELECT 1 FROM user WHERE state = 'Alabama' LIMIT 1");
if (mysql_num_rows($r) > 0) {
    // ...
}

4. Index the search fields

Indexes are not just for primary keys or unique keys. If there are any columns in your table that you will search by, you should almost always index them.

This rule also applies to partial string searches such as "last_name LIKE 'a%'". When you search from the beginning of the string, MySQL can use the index on that column.

You should also understand which kinds of searches cannot use a regular index. For instance, when searching for a word anywhere in a column (for example, "WHERE post_content LIKE '%apple%'"), you will not see any benefit from a normal index. You are better off using MySQL full-text search or building your own indexing solution.
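
To keep a search anchored at the beginning of the string, the pattern must not start with a wildcard, even when the search term comes from user input. Here is a minimal sketch of a helper that builds such a pattern. The name likePrefixPattern() is made up for this example, and it assumes MySQL's default LIKE escape character (the backslash), which does not hold if the NO_BACKSLASH_ESCAPES SQL mode is enabled.

```php
<?php
/**
 * Build a LIKE pattern that matches values starting with $prefix.
 * The LIKE wildcards (% and _) and the escape character (\) inside
 * the prefix are escaped, so only the trailing % acts as a wildcard.
 */
function likePrefixPattern(string $prefix): string
{
    $escaped = strtr($prefix, ['\\' => '\\\\', '%' => '\\%', '_' => '\\_']);
    return $escaped . '%';
}

echo likePrefixPattern('a'), "\n";     // a%
echo likePrefixPattern('50%_x'), "\n"; // 50\%\_x%

// Possible use with a prepared statement (table and column are illustrative):
// $stmt = $pdo->prepare('SELECT last_name FROM user WHERE last_name LIKE ?');
// $stmt->execute([likePrefixPattern($input)]);
```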

5. Index and use the same column types for joins

If your application contains many JOIN queries, make sure that the columns you join by are indexed on both tables. This affects how MySQL internally optimizes the join operation.

Also, the columns that are joined need to be the same type. For instance, if you join a DECIMAL column to an INT column from another table, MySQL will be unable to use at least one of the indexes. Even the character encodings need to be the same for string-type columns.

// looking for companies in my state
$r = mysql_query("SELECT company_name FROM users
    LEFT JOIN companies ON (users.state = companies.state)
    WHERE users.id = $user_id");

// both state columns should be indexed
// and they both should be the same type and character encoding
// or MySQL might do full table scans

6. Do not ORDER BY RAND()

This is one of those tricks that sound cool at first, and many rookie programmers fall into this trap. You may not realize what a terrible bottleneck you create once you start using it in your queries.

If you really need random rows from your results, there are much better ways of doing it. Granted, it takes some additional code, but you will prevent a bottleneck that gets rapidly worse as your data grows. The problem is that MySQL has to perform the RAND() operation (which takes processing power) for every single row in the table before sorting them and giving you just one row.

// what NOT to do:
$r = mysql_query("SELECT username FROM user ORDER BY RAND() LIMIT 1");

// much better:
$r = mysql_query("SELECT count(*) FROM user");
$d = mysql_fetch_row($r);
$rand = mt_rand(0, $d[0] - 1);
$r = mysql_query("SELECT username FROM user LIMIT $rand, 1");

So you pick a random number less than the number of rows and use it as the offset in the LIMIT clause.

7. Avoid SELECT *

The more data is read from the tables, the slower the query becomes. It increases the time needed for disk operations. Also, when the database server is separate from the web server, you will have longer network delays because the data has to be transferred between the servers.

It is a good habit to always specify which columns you need when you write a SELECT statement.

// not preferred
$r = mysql_query("SELECT * FROM user WHERE user_id = 1");
$d = mysql_fetch_assoc($r);
echo "Welcome {$d['username']}";

// better:
$r = mysql_query("SELECT username FROM user WHERE user_id = 1");
$d = mysql_fetch_assoc($r);
echo "Welcome {$d['username']}";

// the differences are more significant with bigger result sets

8. Almost always have an id field

In every table, have an id column that is the PRIMARY KEY, AUTO_INCREMENT, and one of the INT types. Preferably make it UNSIGNED as well, since the value cannot be negative.

Even if you have a users table with a unique username field, do not make that your primary key. VARCHAR fields are slower as primary keys, and your code will be better structured if you refer to all users by their ids internally.

There are also behind-the-scenes operations done by the MySQL engine itself that use the primary key field internally. These become even more important as the database setup gets more complicated (clusters, partitioning, and so on).

One possible exception to the rule is the "association table", used for a many-to-many relationship between two tables. For example, a "posts_tags" table that contains two columns, post_id and tag_id, holds the relationships between two tables named "post" and "tags". Such tables can have a PRIMARY KEY that contains both id fields.

9. Prefer ENUM over VARCHAR

ENUM columns are very fast and compact. Internally they are stored like TINYINT, yet they can contain and display string values. This makes them a perfect candidate for certain fields.

If you have a field that will contain only a few different kinds of values, use ENUM instead of VARCHAR. For example, it could be a column named "status" that contains only values such as "active", "inactive", "pending", "expired", and so on.

There is even a way to get a suggestion from MySQL itself on how to restructure your table. When you have a VARCHAR field, it may suggest that you change the column type to ENUM. This is done with PROCEDURE ANALYSE(), which brings us to the next point.

10. Get suggestions with PROCEDURE ANALYSE()

PROCEDURE ANALYSE() lets MySQL analyze the column structures and the actual data in your table to come up with suggestions for you. It is only useful if there is actual data in your tables, because that plays a big role in the decision making. (PROCEDURE ANALYSE() was deprecated in MySQL 5.7.18 and removed in MySQL 8.0.)

For example, if you created an INT field for your primary key but do not have too many rows, it might suggest that you use MEDIUMINT instead. Or if you are using a VARCHAR field that holds only a few distinct values, you might get a suggestion to convert it to ENUM.

You can also run this by clicking the "Propose table structure" link in phpMyAdmin, in one of your table views.

Keep in mind that these are only suggestions. If your table is going to grow bigger, they may not even be the right suggestions to follow. The final decision is yours.

11. Use NOT NULL if you can

Unless you have a very specific reason to use NULL values, you should always set your columns to NOT NULL.

First, ask yourself whether there is any difference between an empty string and a NULL value (or, for INT fields, between 0 and NULL). If there is no reason to have both, you do not need a NULL field. (Did you know that Oracle treats NULL and the empty string as the same?)

NULL columns require additional space, and they can add complexity to your comparison statements. Avoid them when you can. That said, I understand that some people have very specific reasons to use NULL values, which is not always a bad thing.

From the MySQL documentation:

"NULL columns require additional space in the row to record whether their values are NULL. For MyISAM tables, each NULL column takes one bit extra, rounded up to the nearest byte."

12. Prepared statements

Prepared statements have several benefits, both for performance and for security.

Prepared statements take care of the variables you bind to them by default, which is very effective at protecting your application against SQL injection attacks. You can of course filter your variables manually too, but those methods are more prone to human error and forgetfulness on the programmer's part. This is less of an issue when you use a framework or an ORM.

Since our focus is on performance, I should also mention the benefits in that area. They are most noticeable when the same query is used many times in your application. You can assign different values to the same prepared statement, yet MySQL only has to parse it once.

Also, recent versions of MySQL transmit prepared statements in a native binary form, which is more efficient and can also help reduce network delays.

There was a time when many programmers avoided prepared statements for one important reason: they were not cached by the MySQL query cache. Since sometime around version 5.1, however, query caching is supported for them too.

To use prepared statements in PHP, have a look at the mysqli extension or use a database abstraction layer such as PDO.

// create a prepared statement
if ($stmt = $mysqli->prepare("SELECT username FROM user WHERE state=?")) {
    // bind parameters
    $stmt->bind_param("s", $state);
    // execute
    $stmt->execute();
    // bind result variables
    $stmt->bind_result($username);
    // fetch value
    $stmt->fetch();
    printf("%s is from %s\n", $username, $state);
    $stmt->close();
}
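
As a rough sketch of the "parse once, run many times" benefit, here is the same idea with PDO instead of mysqli. The function name usernamesByState() is made up for this example and the table layout follows the snippet above; prepare(), execute() and fetchAll() are standard PDO calls. Note that PDO's MySQL driver emulates prepared statements by default, so you would have to turn PDO::ATTR_EMULATE_PREPARES off to get real server-side prepares.

```php
<?php
/**
 * Prepare one statement and execute it once per state.
 * Returns an array mapping each state to the usernames found for it.
 */
function usernamesByState(PDO $pdo, array $states): array
{
    // Prepared once; only the bound value changes between executions.
    $stmt = $pdo->prepare(
        'SELECT username FROM user WHERE state = ? ORDER BY username'
    );

    $result = [];
    foreach ($states as $state) {
        $stmt->execute([$state]);
        $result[$state] = $stmt->fetchAll(PDO::FETCH_COLUMN);
    }
    return $result;
}

// Example connection (DSN and credentials are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'user', 'secret');
// $pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
// print_r(usernamesByState($pdo, ['Alabama', 'Alaska']));
```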

13. Unbuffered queries

Normally when you run a query from a script, the script waits for the query to finish executing before it continues. You can change that by using unbuffered queries.

There is a good explanation of the mysql_unbuffered_query() function in the PHP documentation:

"mysql_unbuffered_query() sends the SQL query to MySQL without automatically fetching and buffering the result rows as mysql_query() does. This saves a considerable amount of memory with SQL queries that produce large result sets, and you can start working on the result set immediately after the first row has been retrieved, as you don't have to wait until the complete SQL query has been performed."

However, it comes with certain limitations. You either have to read all the rows or call mysql_free_result() before you can run another query. You also cannot use mysql_num_rows() or mysql_data_seek() on the result set.
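
The mysql_* functions were removed in PHP 7.0. With PDO, the comparable switch is the PDO::MYSQL_ATTR_USE_BUFFERED_QUERY attribute. The sketch below is only one way to wrap it: the function name streamRows() is hypothetical, and it assumes PDO is set to throw exceptions on errors (the default since PHP 8). Keep in mind that the attribute stays set on the connection afterwards.

```php
<?php
/**
 * Yield rows one at a time instead of loading the whole result set.
 * On the MySQL driver this switches the connection to unbuffered mode;
 * other drivers are left untouched.
 */
function streamRows(PDO $pdo, string $sql): Generator
{
    if ($pdo->getAttribute(PDO::ATTR_DRIVER_NAME) === 'mysql') {
        $pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
    }

    $stmt = $pdo->query($sql);
    try {
        while (($row = $stmt->fetch(PDO::FETCH_ASSOC)) !== false) {
            yield $row;
        }
    } finally {
        // Free the result so that another query can run on this connection,
        // even if the caller stops iterating early.
        $stmt->closeCursor();
    }
}

// Possible use (the table name is illustrative):
// foreach (streamRows($pdo, 'SELECT * FROM logs') as $row) {
//     // process $row while the remaining rows are still on the server
// }
```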

14. Store IP addresses as UNSIGNED INT

Many programmers create a VARCHAR(15) field without realizing that they can store IP addresses as integer values. With an INT you go down to only 4 bytes of space and have a fixed-size field.

You have to make sure the column is an UNSIGNED INT, because IP addresses use the whole range of a 32-bit unsigned integer.

In your queries you can use INET_ATON() to convert an IP address to an integer and INET_NTOA() for the reverse. There are also similar functions in PHP called ip2long() and long2ip().

$r = "UPDATE users SET ip = INET_ATON('{$_SERVER['REMOTE_ADDR']}') WHERE user_id = $user_id";
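
If you prefer to do the conversion on the PHP side, a small wrapper around ip2long() might look like the sketch below. The function name ipToUnsignedInt() is made up for this example. It assumes a 64-bit PHP build, where ip2long() never returns a negative number, and it only covers IPv4; an IPv6 address does not fit into 4 bytes.

```php
<?php
/**
 * Convert a dotted IPv4 address to the unsigned integer that
 * MySQL's INET_ATON() would produce for the same address.
 * Throws if the input is not a valid IPv4 address.
 */
function ipToUnsignedInt(string $ip): int
{
    $long = ip2long($ip);
    if ($long === false) {
        throw new InvalidArgumentException("Not a valid IPv4 address: $ip");
    }
    return $long; // non-negative on 64-bit builds (assumption stated above)
}

echo ipToUnsignedInt('192.168.0.1'), "\n"; // 3232235521
echo long2ip(3232235521), "\n";            // 192.168.0.1
```

The resulting integer can then be bound to a prepared statement instead of interpolating $_SERVER['REMOTE_ADDR'] into the SQL string.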

15. Fixed-length (static) tables are faster

(Translator's note: "length" here refers to the row length, that is, the size of each record in the table, not the number of rows in the table.)

When every column in a table is fixed-length, the table is considered "static" or "fixed-length". Column types that are not fixed-length include VARCHAR, TEXT, and BLOB. If you include even one of these column types, the table is no longer fixed-length and has to be handled differently by the MySQL engine.

Fixed-length tables can improve performance because the MySQL engine can seek through the records faster. When it wants to read a specific row in the table, it can quickly calculate the row's position. If the row size is not fixed, it has to consult the primary key index every time it needs to do a seek.

They are also easier to cache and easier to rebuild after a crash. But they can also take more space. For instance, if you convert a VARCHAR(20) field to a CHAR(20) field, it always takes 20 bytes of space (with a single-byte character set) regardless of what is stored in it.

By using vertical partitioning, you can move the variable-length columns into a separate table, which brings us to the next point.

16. Vertical partitioning

Vertical partitioning is the act of splitting your table structure vertically for optimization reasons.

Example 1: You might have a users table that contains home addresses, which are not read often. You can split the table and store the address information in a separate table. This way your main users table shrinks in size. As you know, smaller tables perform faster.

Example 2: You have a "last_login" field in your table. It is updated every time a user logs in to the website, and every update on a table causes the query cache for that table to be flushed. You can put that field into another table to keep updates to your users table to a minimum.

But you also need to make sure that you do not constantly need to join these two tables after the partitioning, or you might actually suffer a performance decline.

17. Split big DELETE or INSERT queries

If you need to run a big DELETE or INSERT query on a live website, be careful not to disturb the web traffic. While a big query like that runs, it can lock your tables and bring your web application to a halt.

Apache runs many parallel processes/threads. It therefore works most efficiently when scripts finish executing as soon as possible, so that the server does not end up with too many open connections and processes at once, which consume resources, especially memory.

If you lock your tables for an extended period (30 seconds or more) on a high-traffic website, processes and queries pile up, which can take a long time to clear and may even crash your web server.

If you have a maintenance script that needs to delete a large number of rows, just use the LIMIT clause to do it in smaller batches and avoid this congestion.

while (1) {
    mysql_query("DELETE FROM logs WHERE log_date <= '2009-10-01' LIMIT 10000");
    if (mysql_affected_rows() == 0) {
        // done deleting
        break;
    }
    // you can even pause a bit
    usleep(50000);
}

18. Smaller columns are faster

For database engines, disk is perhaps the most significant bottleneck. Keeping things small and compact usually helps performance by reducing the amount of disk transfer.

The MySQL documentation has a list of storage requirements for all data types.

If a table is expected to have very few rows, there is no reason to make the primary key an INT instead of MEDIUMINT, SMALLINT, or in some cases even TINYINT. (Translator's note: for dates, use DATE instead of DATETIME if you do not need the time component.)

Make sure you leave reasonable room to grow, or you might end up like Slashdot: it changed the primary key of its comments table to INT to handle growth but did not change the matching column type in the parent table. An ALTER statement fixed the problem, but some functionality had to be taken offline for at least three hours.
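
To make the trade-off concrete, here is a small sketch that picks the smallest unsigned integer type for the largest value you expect, growth included. The function name smallestUnsignedIntType() is made up for this example; the upper bounds and byte sizes are the ones listed in the MySQL storage requirements. It assumes a 64-bit PHP build, because 4294967295 does not fit into a 32-bit PHP integer.

```php
<?php
/**
 * Pick the smallest MySQL unsigned integer type that can hold $maxValue.
 * Returns the type name and its storage size in bytes.
 */
function smallestUnsignedIntType(int $maxValue): array
{
    if ($maxValue < 0) {
        throw new InvalidArgumentException('Unsigned types cannot hold negative values');
    }
    // Upper bounds of MySQL's unsigned integer types, smallest first.
    $types = [
        ['TINYINT UNSIGNED',   1, 255],
        ['SMALLINT UNSIGNED',  2, 65535],
        ['MEDIUMINT UNSIGNED', 3, 16777215],
        ['INT UNSIGNED',       4, 4294967295],
    ];
    foreach ($types as [$name, $bytes, $max]) {
        if ($maxValue <= $max) {
            return ['type' => $name, 'bytes' => $bytes];
        }
    }
    return ['type' => 'BIGINT UNSIGNED', 'bytes' => 8];
}

print_r(smallestUnsignedIntType(200));     // TINYINT UNSIGNED, 1 byte
print_r(smallestUnsignedIntType(5000000)); // MEDIUMINT UNSIGNED, 3 bytes
```

Pass in the value you expect years from now, not today's row count, and remember to give every column that references the key the same type.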

19. Choose the right storage engine

MySQL has two main storage engines: MyISAM and InnoDB, each with its own pros and cons.

MyISAM is good for read-heavy applications, but it does not scale well when there are a lot of writes. Even if you update just one field of one row, the whole table gets locked, and no other process can even read from it until that query finishes. MyISAM is very fast at calculating SELECT COUNT(*) types of queries.

InnoDB tends to be a more complicated storage engine and can be slower than MyISAM for most small applications. But it supports row-level locking, which scales better. It also supports some more advanced features such as transactions.

    • MyISAM Storage Engine

    • InnoDB Storage Engine

20. Use an object-relational mapper

By using an ORM (object-relational mapper), you can gain certain performance benefits. Everything an ORM does can also be coded manually, but that can mean too much extra work and requires a high level of expertise.

ORMs are great for "lazy loading": they fetch values only when they are needed. But you need to be careful with them, or you can end up creating many mini-queries that degrade database performance.

ORMs can also batch your queries into transactions, which run much faster than sending individual queries to the database.
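
The mini-query problem is easy to picture without any ORM: loading 100 users one id at a time means 100 round trips, while a single IN (...) query needs one. The sketch below shows the batched variant with plain PDO. The function name loadUsernames() and the table layout (a "user" table with id and username columns) are assumptions for illustration; real ORMs usually offer their own eager-loading options for this.

```php
<?php
/**
 * Load many users with one IN (...) query instead of one query per id.
 * Returns an array mapping id => username.
 */
function loadUsernames(PDO $pdo, array $ids): array
{
    if ($ids === []) {
        return []; // "IN ()" would be a syntax error
    }
    // One placeholder per id: "?,?,?"
    $placeholders = implode(',', array_fill(0, count($ids), '?'));
    $stmt = $pdo->prepare(
        "SELECT id, username FROM user WHERE id IN ($placeholders)"
    );
    $stmt->execute(array_values($ids));

    // FETCH_KEY_PAIR uses the first column as key, the second as value.
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}

// Possible use:
// $names = loadUsernames($pdo, [1, 2, 3]);
```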

My favorite ORM for PHP at the moment is Doctrine. I wrote an article on how to install Doctrine with CodeIgniter.

21. Be careful with persistent connections

Persistent connections are meant to reduce the overhead of re-creating connections to MySQL. When a persistent connection is created, it stays open even after the script finishes running. Since Apache reuses its child processes, the next time a process runs a new script, it reuses the same MySQL connection.

    • PHP: mysql_pconnect()

It sounds great in theory. But from my personal experience (and that of many others), this feature turns out to be more trouble than it is worth. You can run into serious problems with connection limits, memory, and so on.

Apache runs in a highly parallel fashion and creates many child processes. This is the main reason persistent connections do not work well in this environment. Before you consider using mysql_pconnect(), consult your system administrator.

Original source: Burak Guzel. Translation source: Open Source China.
