Load Balancing using the Cache Server

Source: Internet
Author: User

According to surveys by industry analysts, more than 90% of enterprise databases are used mainly for queries, and in some enterprises the proportion is even higher. In other words, update operations make up only a small fraction of what users do against the database; most operations are reads. In a forum, for example, most users only view posts rather than write them, a typical case where queries vastly outnumber updates. In such workloads, query performance often becomes the database bottleneck, and how to improve it effectively is a problem every database expert must consider. SQL Server offers a ready-made solution: the database administrator can use cache servers to improve database performance. Taking SQL Server 2008 as an example, this article explains how to use cache servers to achieve load balancing and improve the query efficiency of the database.

An additional layer sits between the database server and the web application server: the database cache server. In SQL Server deployments, these cache servers provide load balancing at the database level to improve query performance. What are the characteristics of this solution? How does it relieve the query bottleneck? What should you pay attention to when deploying it? These questions are answered one by one below.

I. Separating data queries from data updates

As shown in the figure, when a user views a post, a connection is opened and the web application server queries the relevant records from the backend. Note that because viewing a post involves no update operations, the web application server reads only from the cache servers, whose records are kept synchronized with the database server. Before reading, the web application server first determines which cache server is idle, connects to that idle server, and reads the data from it. When many users access the forum, the cache servers can therefore balance the load among themselves.

What happens in the backend if a user who has read a post then submits a comment? When the web application server issues an update operation, it connects directly to the database server rather than to a cache server and sends the update statement there. After the database server applies the update, it synchronizes the change out to the cache servers. Queries and updates are thus handled by the web application server along two separate paths. It is like traffic on a road: motor vehicles use the motorway and non-motorized vehicles use their own lane, so the fast traffic is never slowed down by the slow. Separating update operations from query operations in the same way lets query traffic flow in one direction, which greatly improves query efficiency and makes load balancing more effective. In short, when an application's queries greatly outnumber its updates, read-only data can be cached across multiple servers and client connections spread evenly among them, distributing the read workload and achieving load balancing.
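The routing rule described above can be sketched as a small query router: updates always go to the primary database server, while reads go to whichever cache server is currently least busy. This is a minimal illustration only, not SQL Server's actual mechanism; the server names and the least-connections policy are assumptions made for the example.

```python
# Minimal sketch of read/write splitting: update statements go to the
# primary database server, queries go to the least-busy cache server.
class QueryRouter:
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "MERGE")

    def __init__(self, primary, cache_servers):
        self.primary = primary
        # Track open connections per cache server to pick the idlest one.
        self.connections = {name: 0 for name in cache_servers}

    def route(self, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS:
            return self.primary          # writes always hit the primary
        # Reads go to the cache server with the fewest open connections.
        target = min(self.connections, key=self.connections.get)
        self.connections[target] += 1
        return target

router = QueryRouter("db-primary", ["cache-1", "cache-2"])
print(router.route("SELECT * FROM posts"))        # one of the cache servers
print(router.route("UPDATE posts SET views = 1")) # db-primary
```

A real deployment would of course route on connection strings rather than by parsing SQL verbs, but the division of labor is the same: the primary handles the few writes, the cache servers absorb the many reads.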

II. Notes on adopting this solution

When deploying this solution, database administrators still need to pay attention to a few points. As shown in the following figure, the administrator should adjust the design to the actual situation of the enterprise in order to get full value from the solution.

First, consider the synchronization frequency between the cache servers and the database server. Synchronization is a double-edged sword: if it runs too often, it degrades the performance of both the database server and the cache servers; if it runs too rarely, the cache servers fall behind and users may not see the latest data for some time. In general, the latency should be as short as possible, meaning updates on the database server should reach the cache servers promptly. Ideally, a cache server would be updated at the same moment as the database server, but this comes at a significant cost to the performance of both servers, so administrators rarely implement the solution that way. Instead, synchronization is scheduled to run at intervals, for example every 10, 60, or 300 seconds, or longer. There is no universal standard for the interval; the administrator must choose it based on the enterprise's tolerance for stale data, setting it as long as users will accept so that excessive synchronization does not erode the value of the solution. For most users a delay of around 60 seconds is acceptable: in a forum, seeing one's own post appear a minute after submitting it is not a problem. A minute is short for a person, but the database server can accomplish a great deal in that time, so modestly lengthening the synchronization interval can substantially improve its performance. That trade-off is often worthwhile.
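The trade-off above can be made concrete with a toy model of periodic synchronization: the cache only sees primary-side updates at sync points, so the worst-case staleness of a read equals the synchronization interval. This is purely illustrative; the 60-second interval, the key names, and the time-stepped API are assumptions of the sketch, not SQL Server behavior.

```python
# Toy model of periodic cache synchronization. A write lands on the
# primary immediately but only becomes visible on the cache at the
# next sync point, so reads can be stale by up to SYNC_INTERVAL.
SYNC_INTERVAL = 60  # seconds between synchronization runs (assumed)

primary = {}   # state on the database server
cache = {}     # state on the cache server, refreshed at sync points

def write(key, value, t):
    primary[key] = (value, t)

def maybe_sync(t):
    # Synchronization runs only when a sync point is reached.
    if t % SYNC_INTERVAL == 0:
        cache.update(primary)

def read(key, t):
    maybe_sync(t)
    return cache.get(key)

write("post:1", "hello", t=5)   # written just after a sync point
print(read("post:1", t=30))     # None: cache not refreshed yet
print(read("post:1", t=60))     # ('hello', 5): visible after next sync
```

Halving the interval halves the worst-case staleness but doubles how often the two servers pay the synchronization cost, which is exactly the balance the administrator has to strike.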

Second, establish a direct, fast network connection between the database server and the cache servers. Synchronizing data for a large number of users can generate heavy network traffic, and the bottleneck of a synchronization run is often not the database server or the cache server itself but the connection between them: limited bandwidth and throughput reduce the efficiency of the synchronization. The connection should therefore be as direct as possible. For example, avoid placing unnecessary network devices in the path, and avoid configuring firewalls or similar security policies between the two servers, since these can significantly slow the synchronization. It is also best that no other application services compete for the bandwidth. Put simply, if possible, install an additional NIC in the database server and connect the two hosts directly, dedicating that link to the data required for synchronization. This is a good approach: the transmission is more direct, no other devices or services compete for the bandwidth, and the link is also better protected against attacks. As long as the two servers are close enough together, this arrangement minimizes the time a synchronization run takes, so that other users see updated data as soon as possible.

III. Selecting an appropriate replication solution for synchronization

How is data synchronized between the database server and the cache servers? SQL Server offers the database administrator three options: snapshot replication, merge replication, and transactional replication. Each replication model has its own characteristics, and all three can keep the database server and the cache servers in sync; because their internal mechanisms differ, however, they differ in performance and other respects even though the end result is the same. The principles and features of each model are basic knowledge and are not elaborated here. For a load-balancing scheme built on cache servers, transactional replication is usually the best choice. Compared with the other models, it preserves transactional consistency, sustains high throughput on the database server, keeps synchronization overhead low, and keeps system latency short. It also accommodates the filtering requirements some enterprises have for tables and records: transactional replication can filter both columns and rows, whereas the other models meet these requirements only partially. Transactional replication is therefore generally the more appropriate synchronization scheme, but the administrator should still test the candidate replication models against the enterprise's actual needs before settling on one.
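As an illustration of why transactional replication fits this scheme, the sketch below replays an ordered log of committed transactions on a replica and applies a row filter along the way, preserving transaction order and consistency. It is a conceptual model only, not SQL Server's replication engine; the log format and the filter predicate are assumptions made for the example.

```python
# Conceptual model of transactional replication: committed changes are
# shipped as an ordered log of transactions and replayed in order on
# the replica, with optional row filtering applied during replay.
def replay(log, replica, row_filter=lambda row: True):
    for txn in log:                      # each txn is a list of changes
        for op, key, row in txn:
            if not row_filter(row):
                continue                 # row filtering, as in replication
            if op == "upsert":
                replica[key] = row
            elif op == "delete":
                replica.pop(key, None)
    return replica

log = [
    [("upsert", 1, {"author": "alice", "text": "hi"})],
    [("upsert", 2, {"author": "bob", "text": "noise"})],
    [("upsert", 3, {"author": "carol", "text": "hey"}),
     ("delete", 1, None)],
]

# Filter out one author's rows; deletes (row is None) always pass.
replica = replay(log, {},
                 row_filter=lambda row: row is None or row["author"] != "bob")
print(replica)   # only carol's row remains: bob filtered, post 1 deleted
```

Because changes arrive as whole transactions in commit order, the replica is always in a state the primary once passed through, which is what "transactional consistency" buys over snapshot or merge replication in this scenario.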

