SQL questions and answers: Row overflow, differential backup, and more

Q: I recently upgraded an application to run on SQL Server 2005 and enabled the feature that allows rows longer than 8,060 bytes, so that users can create long data fields without SQL Server returning errors. Now that the application is in production, some scan queries that ran normally before the schema change are showing performance problems. I have checked the fragmentation of the various indexes and everything looks normal. Why are these queries slow on SQL Server 2005?

A: The row-overflow feature you are using is handy for allowing the occasional row to exceed 8,060 bytes, but it is not well suited to tables where most rows are that long, and, as you have discovered, it can significantly degrade query performance.

The reason is that when a row grows too large to fit on its page, one of the variable-length columns in the row is pushed "off-row". That is, the column value is moved from the data or index page to a text page elsewhere in the data file, and the value in the row is replaced with a pointer to the value's new location.

This is the same mechanism used to store regular LOB (large object) columns such as XML, text, image, or varchar(max). Note that if the table schema contains multiple variable-length columns, there is no guarantee that the same column will be the one pushed off-row as successive rows grow too long.

This mechanism can cause performance problems. A query that retrieves a variable-length column that has been pushed off-row suddenly needs an extra I/O to read the text page holding the off-row value. If many rows have overflowed, retrieving the same variable-length column from many rows can cause a significant performance problem, its severity determined by how many rows have been pushed off-row.

In your situation, a range-scan or table-scan query whose select list contains the variable-length columns suffers exactly this degradation from row overflow. It is irrelevant whether the index is fragmented: once a variable-length column has been pushed off-row, the efficient sequential scan is essentially broken, because random I/Os are required to read the text pages holding the off-row values.
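As a quick diagnostic (a sketch only; the table dbo.OverflowDemo and its columns are hypothetical), you can check whether a table has any ROW_OVERFLOW_DATA allocation units, which indicates that values have been pushed off-row:

-- A two-column row totalling ~10,000 bytes exceeds the 8,060-byte limit,
-- so one of the variable-length columns is pushed off-row on insert.
CREATE TABLE dbo.OverflowDemo (c1 VARCHAR(5000), c2 VARCHAR(5000));
INSERT INTO dbo.OverflowDemo VALUES (REPLICATE ('a', 5000), REPLICATE ('b', 5000));

-- Any used pages in a ROW_OVERFLOW_DATA allocation unit confirm row overflow.
SELECT au.type_desc, au.total_pages, au.used_pages
FROM sys.allocation_units AS au
JOIN sys.partitions AS p ON au.container_id = p.hobt_id
WHERE p.[object_id] = OBJECT_ID (N'dbo.OverflowDemo')
  AND au.type_desc = N'ROW_OVERFLOW_DATA';
GO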

Row overflow remains useful for the occasional over-long row, but if query performance is critical it should not be relied on heavily in your design.

Q: We just set up database mirroring between two failover clusters as a way to obtain geographic redundancy at a lower cost than storage area network (SAN) replication. Because the data centers are in the same city, we can use synchronous mirroring. The problem is that when a failover occurs on the local cluster, the mirrored database also fails over to the remote cluster, which is not what we want. How can we prevent this? We only want a mirroring failover when the entire local cluster is unavailable.

A: To improve availability, you have configured the mirroring session with a witness server so that an automatic failover occurs when the principal server becomes unavailable. The rationale is that if the entire local cluster fails, the database mirror fails over to the second cluster and the application can continue to run.

Your problem occurs when a cluster failover takes longer than the default timeout of the database-mirroring session. The witness server and the mirror server (the SQL Server instance active on the second cluster) agree that the principal is unavailable, and the mirrored database therefore fails over to the second cluster.

The easiest way to prevent this is to remove the witness server, so that the database mirror cannot fail over automatically during a local cluster failover. Of course, this approach reduces availability, because a genuine failure then requires a manual mirroring failover.
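If you take this route, removing the witness from the session is a one-line change on the principal (a sketch, assuming the database is named mydbname as in the code below):

ALTER DATABASE mydbname SET WITNESS OFF;
GO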

The second method is to change the default timeout of the database-mirroring session: the number of once-per-second "ping" messages that must go unanswered before the principal server is declared unavailable. This setting is called the partner timeout, and its default value is 10 (seconds). You can find the current timeout value for a database with the following code:

SELECT mirroring_connection_timeout
FROM master.sys.database_mirroring
WHERE database_id = DB_ID ('mydbname');
GO

Use the following code to change the timeout value:

ALTER DATABASE mydbname
SET PARTNER TIMEOUT <timeoutvalue>;
GO

In this case, the partner timeout you set must be larger than the time a normal cluster failover takes on the local cluster. It may be difficult to determine exactly how long a cluster failover, including recovery of the mirrored database, will take, but you should be able to establish an upper bound. The drawback of this method is that the timeout may then be measured in minutes, which may be too long in a real disaster scenario.
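For example (a sketch only; the 90-second figure is an illustrative assumption, not a recommendation), if local cluster failovers reliably complete within a minute or so, you might set:

ALTER DATABASE mydbname
SET PARTNER TIMEOUT 90;
GO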

Q: I was asked about the backup strategy I use, which includes full backups and log backups, and someone suggested adding differential backups to shorten restore time. I take a full backup every week and a log backup every hour. I tried adding a daily differential backup, but noticed something odd: the differential backup at the end of each week is about the same size as the weekly full backup. I thought differential backups, like log backups, were incremental! Am I wrong?
A: This comes from a misunderstanding of the nature of differential backups. Unlike log backups, differential backups are not incremental. A differential backup contains all the extents of the data files that have changed since the last full backup (this applies to database, filegroup, and file-level backups).

Whenever anything changes in an extent (a logical group of eight consecutive data-file pages), that extent is marked in a special bitmap page called the differential map; there is one differential map for roughly every 4 GB of data file. When a differential backup runs, the backup subsystem scans all the differential maps and copies every marked extent, but it does not reset the maps. This means that the more extents that change between successive differential backups, the larger later differential backups become. The differential maps are reset only when a full backup is taken.

If the application workload is heavy enough that a large proportion of the database changes within the week, the differential backup taken just before the next full backup will approach the size of the weekly full backup itself. That explains what you are seeing.

Nevertheless, differential backups do provide a way to shorten restore time after a disaster. With a weekly full backup and hourly log backups, the fastest possible restore sequence is:

1. Back up the tail of the log (that is, all log generated since the most recent log backup).

2. Restore the most recent full database backup.

3. Restore, in sequence, every log backup taken since that full backup.

4. Restore the tail-of-the-log backup.

That can mean restoring a very large number of log backups, especially if disaster strikes just before the next full backup is due (the worst case here is restoring 24 + 24 + 24 + 24 + 24 + 24 + 23 log backups!). With a daily differential backup added to the strategy, the restore sequence becomes:

1. Back up the tail of the log (that is, all log generated since the most recent log backup).

2. Restore the most recent full database backup.

3. Restore the most recent differential backup.

4. Restore, in sequence, every log backup taken since that differential backup.

5. Restore the tail-of-the-log backup.

This way you avoid restoring long chains of log backups, because restoring a single differential backup has essentially the same effect as restoring all of the log backups it covers.
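A minimal sketch of that restore sequence, assuming a database named mydb, hypothetical backup file paths, and a single log backup taken since the differential:

-- 1. Back up the tail of the log (NO_TRUNCATE works even if the data files are damaged).
BACKUP LOG mydb TO DISK = N'D:\Backups\mydb_tail.trn' WITH NO_TRUNCATE;

-- 2. Restore the most recent full backup, leaving the database in the restoring state.
RESTORE DATABASE mydb FROM DISK = N'D:\Backups\mydb_full.bak' WITH NORECOVERY;

-- 3. Restore the most recent differential backup.
RESTORE DATABASE mydb FROM DISK = N'D:\Backups\mydb_diff.bak' WITH NORECOVERY;

-- 4. Restore, in sequence, each log backup taken since the differential.
RESTORE LOG mydb FROM DISK = N'D:\Backups\mydb_log_01.trn' WITH NORECOVERY;

-- 5. Restore the tail-of-the-log backup and bring the database online.
RESTORE LOG mydb FROM DISK = N'D:\Backups\mydb_tail.trn' WITH RECOVERY;
GO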

With a daily differential backup, even if disaster strikes on the last day of the week the worst case is only 23 log backups to restore. So no, a differential backup is not an incremental backup. One drawback of differential backups is the extra space they occupy, but that is usually a worthwhile trade for the shorter restore time.
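The corresponding backup jobs might look like the sketch below; the database name mydb and file paths are placeholders, and in practice each backup would go to its own uniquely named file:

-- Weekly full backup; the only operation that resets the differential maps.
BACKUP DATABASE mydb TO DISK = N'D:\Backups\mydb_full.bak' WITH INIT;

-- Daily differential backup: every extent changed since the last full backup.
BACKUP DATABASE mydb TO DISK = N'D:\Backups\mydb_diff.bak' WITH DIFFERENTIAL, INIT;

-- Hourly log backup.
BACKUP LOG mydb TO DISK = N'D:\Backups\mydb_log.trn' WITH INIT;
GO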

Q: I have a failover cluster with two nodes, each running one SQL Server 2005 instance. Following the usual advice, I configured each instance to use only 50 percent of the available memory. Now I am running into problems because the workloads on both instances need more memory to maintain the same performance level. If I remove or raise the memory limits, though, I expect a problem when one instance fails over and both instances end up running on a single node. What do you suggest?

A: I will answer for the two-node, two-instance case, but the discussion also applies to other multi-instance failover cluster configurations (N nodes running N-1 SQL Server instances).

Many people run into this: both instances carry workloads that need more than 50 percent of a node's memory, but no one has considered what happens to those workloads when both instances end up on one node. Without special configuration, the memory is likely to be divided between the instances out of proportion to their needs, so one workload performs acceptably while the other does not.

For SQL Server 2000, the recommendation was to limit each instance to at most 50 percent of the cluster node's memory. This is because the SQL Server 2000 memory manager does not respond to memory pressure: if an instance has taken 80 percent of a node's memory, it will not give any back. That means that after a failover, the other instance would have only 20 percent of the memory available. Capping both instances at 50 percent guarantees each instance 50 percent of the memory after a failover; the problem with this approach is that each workload is also limited to 50 percent of the memory all the time.

In SQL Server 2005 (and SQL Server 2008), the memory manager does respond to memory pressure, so the 50 percent cap is no longer necessary. Without any limits, however, two instances running on one cluster node may fight over memory until it is divided out of proportion to their needs.

The answer is to set a minimum memory amount for each instance, so that neither can be forced to give up too much. For the two-node, two-instance case, the most common setting is a minimum of 40 percent of the node's memory for each instance. Each instance can then take as much memory as it likes while running on its own node; after a failover, each instance is guaranteed enough memory to maintain a predictable workload performance level, with the remaining memory shared between them. Although this means both workloads may slow to that guaranteed level when a failover occurs, most of the time, with each instance on a different cluster node, neither is limited at all.
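A minimal sketch of that configuration, run on each instance. The 32 GB node size is a hypothetical assumption, making 40 percent roughly 13107 MB:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Guarantee this instance roughly 40% of a hypothetical 32 GB node (value in MB).
EXEC sp_configure 'min server memory (MB)', 13107;
RECONFIGURE;
GO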

Source: Microsoft TechNet Chinese site