It is well known that reading data from memory is hundreds of times faster than reading it from the hard disk. For this reason, most application systems make as much use of caching (a storage area in memory) as they can to improve operational efficiency, and the MySQL database is no exception. Here, drawing on work experience, the author explores MySQL cache management with you: how to configure the MySQL database cache properly and improve the cache hit rate.
When will the application get the data from the cache?
When a database server reads data, it can either retrieve it from the data files on the hard disk or read it from the database cache. What the database administrator needs to figure out is under what circumstances the system reads data from the cache instead of from the data files on disk.
Simply put, the data cache is a storage area in memory that holds the text of users' SQL statements together with the corresponding query results. Typically, the next time a user runs a query, if the SQL text is the same and the relevant records have not been updated since the last query, the database takes the result directly from the cache. It follows from this principle that at least the following conditions must be met before the data in the cache can be used directly.
The first is that the SQL text must be the same. When two users issue the same SQL statement one after the other (leaving other conditions aside), the server returns the result to the second user from the cache without parsing and executing the statement again. Note that the SQL text must be exactly the same each time. Suppose the two queries use different query conditions: the first query has no WHERE clause, and the user, finding that it returns too much data, then adds a WHERE clause to filter the results. Even if the final result happens to be the same, the system still reads the data from the data files, not from the data cache. Likewise, the field names after SELECT must be the same. If the two queries use different field names or a different number of fields, the system treats them as different SQL statements and parses and executes the query again.
Second, from the perspective of the data cache, case matters. The cache compares the text of an incoming query before the query is parsed, so the match has to be exact, byte for byte. If two queries differ only in the case of a field name, for instance uppercase the first time and lowercase the second, they are treated as different SQL statements and the second one does not hit the cache. The same applies to the case of keywords.
The third is that, between the two queries, neither the data records nor the table structure may have changed. If the table structure changes, for example when a field is added, the system automatically empties all cached data that uses the table. Note that "change" here is meant in the broad sense and includes any change to the data in the table. To give a simple example, a user first queries the shipping data for 2010. After that query, another user inserts a shipping record for January 2011 into the table. Then a third user needs the 2010 shipping information and uses an SQL statement identical to the first query. Does the database use the data in the cache in this case? The answer is no. When the intermediate user inserted a record, the system automatically emptied all cached results associated with the table. By the time of the second query there is no cached information for this table, and the statement has to be parsed and executed again.
The fourth is the effect of the default character set on the cache hit rate. Typically, if the default character sets used by the client and the server are different, the system treats the queries as different even if the query statements are identical and the records and table structure are unchanged between the two queries. This is easy to overlook and deserves special attention.
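The conditions above can be pictured with a small Python sketch. It is only a toy model written for this article, not MySQL's actual implementation: the class name ToyQueryCache, the table name shipments, and the character set names are assumptions chosen for illustration. A cached result is found only when both the SQL text and the character set match, and any write to a table drops every cached result that read from it.

```python
class ToyQueryCache:
    """Toy model of the lookup rules described above (not MySQL's real code)."""

    def __init__(self):
        # (sql_text, charset) -> (tables the query read from, cached result)
        self._entries = {}

    def get(self, sql_text, charset):
        """Return the cached result, or None when there is no exact match."""
        entry = self._entries.get((sql_text, charset))
        return None if entry is None else entry[1]

    def put(self, sql_text, charset, tables, result):
        """Store a result under the exact SQL text and character set."""
        self._entries[(sql_text, charset)] = (frozenset(tables), result)

    def invalidate_table(self, table):
        """Drop every cached result that read from the changed table."""
        self._entries = {
            key: value
            for key, value in self._entries.items()
            if table not in value[0]
        }


cache = ToyQueryCache()
sql = "SELECT id, qty FROM shipments WHERE ship_year = 2010"
cache.put(sql, "utf8", ["shipments"], [(1, 10)])

same_text = cache.get(sql, "utf8")                                  # found
no_where = cache.get("SELECT id, qty FROM shipments", "utf8")       # not found: text differs
other_charset = cache.get(sql, "gbk")                               # not found: charset differs

cache.invalidate_table("shipments")      # e.g. a row for January 2011 was inserted
after_insert = cache.get(sql, "utf8")    # not found: the table changed
```

In this sketch only the first lookup returns the stored rows; the other three return None, which corresponds to the server parsing and executing the statement again.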
Recommendations for improving the cache hit rate.
The analysis above shows that the conditions for using cached data are rather strict. These conditions are in fact reasonable: their main purpose is to ensure data consistency. With a good understanding of them, the database administrator can now consider how to increase the cache hit rate. The author has the following suggestions.
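Before tuning anything, it helps to know the current hit rate. A minimal Python sketch of one commonly used estimate follows. It assumes the two counters have already been read from the server's Qcache_hits and Com_select status variables (for example with SHOW GLOBAL STATUS, on MySQL versions that still include the query cache); the function name is made up for this example.

```python
def query_cache_hit_rate(qcache_hits, com_select):
    """Estimate the share of SELECT statements answered from the query cache.

    qcache_hits: value of the Qcache_hits status counter (cache hits).
    com_select:  value of the Com_select status counter (SELECTs that were
                 actually executed, i.e. not served from the cache).
    """
    total = qcache_hits + com_select
    if total == 0:
        return 0.0  # no SELECT traffic yet, so there is nothing to measure
    return qcache_hits / total


# Example counters (made-up numbers): 300 hits and 700 executed SELECTs.
rate = query_cache_hit_rate(300, 700)
```

With these made-up counters the estimate is 0.3, meaning roughly three out of ten SELECT statements were served from the cache.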
The first is to configure the client and the server to use the same character set. If the client (or a third-party tool) uses a different character set from the server, the cache is not used in any case. This matters especially in China, where Chinese character sets are needed. Take particular care that the client's default character set is the same as the server's default character set. Note that this means the same, not merely compatible. Sometimes the client displays data correctly even with a different character set, mainly because some character sets, though different, are compatible with each other. For cache management, compatibility alone is not enough; the character sets must be identical.
The second is to fix the query statements on the client side. Suppose finance staff and procurement staff both query the November shipment data from the system. They have different responsibilities, so the fields they need are different. The client can let each user lay out the form as needed. However, the author suggests that the SQL statement sent to the back end should preferably be the same. The data passes through three parties: the back-end database, the client, and the user. In the author's view, the client should always use the same SQL statement when talking to the back-end database, and then present the data to each user in that user's own format, including the order of the fields. This covers differences in display format only, not differences in query conditions. When users differ only in how they want the data displayed, using the same SQL statement improves the query efficiency of the application system.
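A minimal Python sketch of this idea is shown below. Everything in it is hypothetical and chosen only for illustration: the shipments table, its columns, the date range, and the two roles. The point is that the statement sent to the database never varies, while the per-user layout is applied on the client after the rows come back.

```python
# One fixed statement sent to the database for every user
# (hypothetical table, columns, and date range).
SHIPMENT_SQL = (
    "SELECT ship_date, customer, qty, amount FROM shipments "
    "WHERE ship_date >= '2010-11-01' AND ship_date < '2010-12-01'"
)

# Column order of the rows returned by SHIPMENT_SQL.
COLUMNS = ("ship_date", "customer", "qty", "amount")

# Per-role display layouts, applied on the client only.
LAYOUTS = {
    "finance": ("customer", "amount"),
    "procurement": ("ship_date", "qty"),
}


def format_for(role, rows):
    """Select and reorder columns for one role without changing the SQL."""
    indexes = [COLUMNS.index(name) for name in LAYOUTS[role]]
    return [tuple(row[i] for i in indexes) for row in rows]


# Rows as they might come back from SHIPMENT_SQL (made-up data).
rows = [("2010-11-03", "Acme", 5, 120.0)]
finance_view = format_for("finance", rows)
procurement_view = format_for("procurement", rows)
```

Both views are built from the same result set, so both users issue byte-identical SQL and the second one can be served from the cache.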
The third is to improve the cache's memory configuration. Typically, when the server starts, the operating system negotiates the size of the cache space with the database software. When the cache space is not enough, the oldest cached record is overwritten by the newest one. Clearly, increasing the cache space can increase the hit rate. It is like target shooting: the more targets there are, the better the chance of hitting one. However, the greater the number of concurrent users, the less noticeable the effect of this setting.
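The effect of cache space on the hit rate can be illustrated with a short simulation in Python. This is a simplified sketch under stated assumptions: the cache is modeled as holding a fixed number of entries rather than bytes, eviction is least-recently-used, and the workload of three repeating queries is invented for the example.

```python
from collections import OrderedDict


def simulate_hit_rate(queries, capacity):
    """Replay a stream of query texts against an LRU cache of `capacity` entries."""
    cache = OrderedDict()
    hits = 0
    for query in queries:
        if query in cache:
            hits += 1
            cache.move_to_end(query)       # mark as most recently used
        else:
            cache[query] = True            # store the new result
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(queries) if queries else 0.0


# Three distinct queries repeated in a cycle, 60 lookups in total.
workload = ["q1", "q2", "q3"] * 20
small = simulate_hit_rate(workload, 2)  # cache too small for the working set
large = simulate_hit_rate(workload, 3)  # cache fits all three queries
```

With room for only two entries, every lookup in this cyclic workload misses; with room for three, only the first three lookups miss. Real workloads are less extreme, but the direction of the effect is the same.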
The fourth is to split the table, which can increase the cache hit rate. The analysis of the conditions above showed that as soon as a record is inserted into the queried table, the system empties the cached results for that table. Take shipping records as an example. Shipping records are updated every day, and users often need to look up the previous year's shipping records at the beginning of the week. If the data in this table is updated every hour, the information in the cache is constantly being emptied, and the cache hit rate is obviously not very high. In this case, the author suggests splitting the table. If the system allows it, the 2010 shipping records are stored on their own in a separate shipping table; that is, a separate table is used for each year. Records for 2011 then do not affect the 2010 table. If users repeatedly query the 2010 shipping information with the same SQL statement (not with different query conditions), they get the benefit of the caching mechanism and the application system's query performance improves.
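A minimal Python sketch of the one-table-per-year idea follows. The naming scheme shipments_<year>, the helper names, and the dictionary standing in for the cache are all assumptions made for this example. It shows only the scoping effect: an insert clears cached results for the table it writes to and leaves other years alone.

```python
from datetime import date


def shipment_table(ship_date):
    """Map a shipping date to its per-year table (hypothetical naming scheme)."""
    return f"shipments_{ship_date.year}"


# Stand-in for the cache: cached results grouped by the table they read from.
cached = {
    "shipments_2010": {"SELECT * FROM shipments_2010": "rows for 2010"},
}


def record_insert(ship_date):
    """Route an insert to its yearly table and clear only that table's cached results."""
    table = shipment_table(ship_date)
    cached.pop(table, None)
    return table


# A shipment dated January 2011 goes to shipments_2011,
# so the cached 2010 results are left untouched.
target = record_insert(date(2011, 1, 15))
still_cached = "shipments_2010" in cached
```

Only an insert dated in 2010 would clear the 2010 entry, which is why a table that no longer receives writes keeps its cached results.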
The impact of multiple applications on caching.
Typically, the MySQL database cache is allocated automatically based on the size of the server's memory. It is best if MySQL is the only application on the server. In practice, however, several applications are often deployed on the same server to reduce the cost of IT investment. Because these other applications also need memory for their own caches, the cache space available to MySQL may shrink. In this case, the database administrator needs to work with the system engineer to set the cache space for each application manually according to its performance requirements. This avoids cache conflicts between the different applications on the same server.