MSSQL Best Practices · Using filegroups to isolate backup solutions for hot and cold data

Source: Internet
Author: User
Tags: filegroup, mssql, readable

Summary

In the SQL Server backup topic series, the first four installments covered the three common types of database backup, how to develop a backup strategy, how to find the backup chain, and the relationship between the three database recovery models and backups. In this installment, we share how to use SQL Server filegroup technology to implement a hot/cold data isolation backup scheme.

Scene Introduction

Suppose a company has a critically important database (over 10 TB) that faces the following scenario:

The database stores nearly 10 years of user payment information (payments), which is business-critical.

Data is archived into yearly tables. The historical yearly tables are read-only (historical payment records are never modified); only the current year's table is both read and written.

A full backup of the database takes too long (more than 20 hours), and a database restore takes even longer (more than 30 hours).

How can we optimize the design of this database and its backup/restore scheme to make backup and restore more efficient?

File Group Introduction

A detailed description of filegroups is not the focus of this article, but since they are the core technology discussed here, it is worth briefly introducing the advantages, creation, and use of filegroups in SQL Server.

Advantages of using Filegroups

SQL Server supports storing table and index data in non-primary filegroups, which provides the following benefits when a database has multiple filegroups:

    • I/O pressure is spread across filegroups; if the files of different filegroups are placed on different disks, disk pressure is distributed as well.
    • DBCC CHECKFILEGROUP operations can be run per filegroup, so the same database can be checked in parallel by multiple sessions, reducing maintenance time for very large databases.
    • Backup and restore operations can be performed at the filegroup level, allowing finer-grained control of backup and restore policies.
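For example, the per-filegroup consistency check mentioned above can be sketched as follows (a minimal sketch; the filegroup name FG2010 is borrowed from the example later in this section, so adjust it to your own database):

```sql
-- Run a consistency check against a single filegroup instead of the whole database.
-- Each filegroup of the same database can be checked from a separate session,
-- allowing the checks to run in parallel.
USE [TestFG]
GO
DBCC CHECKFILEGROUP (N'FG2010') WITH NO_INFOMSGS;
GO
```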

Create a filegroup when you create a database

We can create a filegroup directly when we create the database, with the following code:

USE master
GO
EXEC sys.xp_create_subdir 'C:\SQLServer\Data\'
EXEC sys.xp_create_subdir 'C:\SQLServer\Logs\'

CREATE DATABASE [TestFG]
ON PRIMARY
( NAME = N'TestFG', FILENAME = N'C:\SQLServer\Data\TestFG.mdf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FG2010]
( NAME = N'FG2010', FILENAME = N'C:\SQLServer\Data\FG2010.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FG2011]
( NAME = N'FG2011', FILENAME = N'C:\SQLServer\Data\FG2011.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FG2012]
( NAME = N'FG2012', FILENAME = N'C:\SQLServer\Data\FG2012.ndf', SIZE = 5MB, FILEGROWTH = 50MB )
LOG ON
( NAME = N'TestFG_log', FILENAME = N'C:\SQLServer\Logs\TestFG_log.ldf', SIZE = 5MB, FILEGROWTH = 50MB )
GO

Note: To ensure balanced I/O across the files of a filegroup, keep the initial size and autogrowth parameters of all files identical, so that SQL Server's proportional fill (round-robin) allocation algorithm works correctly.
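As a quick sanity check (a sketch added here, not part of the original steps), the following query lists each data file's size and growth settings per filegroup so you can verify that they match:

```sql
-- List the size and growth settings of every data file, grouped by filegroup,
-- to verify that all files in a filegroup share identical settings.
USE [TestFG]
GO
SELECT fg.name               AS filegroup_name,
       df.name               AS file_name,
       df.size * 8 / 1024    AS size_mb,     -- size is stored in 8 KB pages
       df.growth * 8 / 1024  AS growth_mb,   -- valid when is_percent_growth = 0
       df.is_percent_growth
FROM sys.database_files AS df
    INNER JOIN sys.filegroups AS fg
    ON df.data_space_id = fg.data_space_id
ORDER BY fg.name, df.name;
```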

Create a filegroup separately

If the database already exists, we can also add filegroups to it with the following code:

-- Add filegroup FG2013
USE master
GO
ALTER DATABASE [TestFG] ADD FILEGROUP [FG2013];

-- Add a data file to FG2013
ALTER DATABASE [TestFG]
ADD FILE (
    NAME = FG2013,
    SIZE = 5MB,
    FILEGROWTH = 50MB,
    FILENAME = N'C:\SQLServer\Data\FG2013.ndf'
) TO FILEGROUP [FG2013]
GO

USE [TestFG]
GO
SELECT * FROM sys.filegroups

The final filegroup information is shown below:

Working with Filegroups

After the filegroups have been created, we can place tables and indexes into the corresponding filegroups. For example, the following code puts the clustered index on the primary filegroup and the table and nonclustered index data on the FG2010 filegroup:

USE [TestFG]
GO
CREATE TABLE [dbo].[Orders_2010](
    [OrderID] [int] IDENTITY(1,1) NOT NULL,
    [OrderDate] [datetime] NULL,
    CONSTRAINT [PK_Orders_2010] PRIMARY KEY CLUSTERED
    (
        [OrderID] ASC
    ) ON [PRIMARY]
) ON [FG2010]
GO

CREATE NONCLUSTERED INDEX IX_OrderDate
ON [dbo].[Orders_2010] (OrderDate)
ON [FG2010];

Program Design

Now that the basics of filegroups have been introduced, we will design the hot/cold data isolation backup scheme for the scenario described above using SQL Server filegroup technology.

Design analysis

Because the payment database is so large (over 10 TB), a single full backup takes more than 20 hours. Following a conventional full backup strategy would produce oversized backup files and long backup windows, and the I/O consumed by the backup could even affect normal business. However, although the database itself is very large, only the current year's table keeps changing (hot data), while the historical yearly tables are never modified (cold data), so the volume of data that actually changes is quite small relative to the whole database. Therefore, we place the historical yearly tables on read-only filegroups and the current yearly table on a read-write filegroup. The backup system then only needs to back up the primary filegroup and the filegroup holding the current yearly table (of course, a full backup of the entire database is still required the first time). This greatly reduces the I/O capacity consumed by backups and achieves an isolated backup of cold and hot data, while also spreading file I/O pressure. It thus optimizes both the database design and the backup system, killing several birds with one stone.

From the analysis above, we can draw a design diagram to display the scheme visually:

Design Diagram Description

The following is a detailed description of the design diagram, to give a more intuitive and in-depth understanding of the design. The entire database consists of 13 files, including:

    • 1 primary filegroup: holds the database system tables, views, and other object information; this filegroup is read-write.
    • 10 user-defined read-only filegroups: store the historical yearly tables and their index data, one filegroup per year.
    • 1 user-defined read-write filegroup: stores the current yearly table and its index data; since that table must be readable and writable, the filegroup must be read-write.
    • 1 database transaction log file: holds the database transaction log, which we back up regularly.

Solution implementation

After the design is complete, the next step is to implement the scheme. The implementation includes:

    • Create a database
    • Create the yearly tables
    • Filegroup settings
    • Hot and cold backup implementation

Create a database

While creating the database, we create the primary filegroup and the 2008 through 2017 filegroups. A special reminder here: make sure the initial size and growth settings of the files in all filegroups are identical. The code is as follows:

USE master
GO
EXEC sys.xp_create_subdir 'C:\DATA\Payment\Data\'
EXEC sys.xp_create_subdir 'C:\DATA\Payment\Log\'

CREATE DATABASE [Payment]
ON PRIMARY
( NAME = N'Payment', FILENAME = N'C:\DATA\Payment\Data\Payment.mdf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2008]
( NAME = N'FGPayment2008', FILENAME = N'C:\DATA\Payment\Data\Payment_2008.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2009]
( NAME = N'FGPayment2009', FILENAME = N'C:\DATA\Payment\Data\Payment_2009.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2010]
( NAME = N'FGPayment2010', FILENAME = N'C:\DATA\Payment\Data\Payment_2010.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2011]
( NAME = N'FGPayment2011', FILENAME = N'C:\DATA\Payment\Data\Payment_2011.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2012]
( NAME = N'FGPayment2012', FILENAME = N'C:\DATA\Payment\Data\Payment_2012.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2013]
( NAME = N'FGPayment2013', FILENAME = N'C:\DATA\Payment\Data\Payment_2013.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2014]
( NAME = N'FGPayment2014', FILENAME = N'C:\DATA\Payment\Data\Payment_2014.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2015]
( NAME = N'FGPayment2015', FILENAME = N'C:\DATA\Payment\Data\Payment_2015.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2016]
( NAME = N'FGPayment2016', FILENAME = N'C:\DATA\Payment\Data\Payment_2016.ndf', SIZE = 5MB, FILEGROWTH = 50MB ),
FILEGROUP [FGPayment2017]
( NAME = N'FGPayment2017', FILENAME = N'C:\DATA\Payment\Data\Payment_2017.ndf', SIZE = 5MB, FILEGROWTH = 50MB )
LOG ON
( NAME = N'Payment_log', FILENAME = N'C:\DATA\Payment\Log\Payment_log.ldf', SIZE = 5MB, FILEGROWTH = 50MB )
GO

Considering that we will add a new filegroup to the database each year, the 2018 filegroup is created separately, as follows:

-- Add filegroup FGPayment2018
USE master
GO
ALTER DATABASE [Payment] ADD FILEGROUP [FGPayment2018];

-- Add a data file to FGPayment2018
ALTER DATABASE [Payment]
ADD FILE (
    NAME = FGPayment2018,
    SIZE = 5MB,
    FILEGROWTH = 50MB,
    FILENAME = N'C:\DATA\Payment\Data\Payment_2018.ndf'
) TO FILEGROUP [FGPayment2018]
GO

Finally, confirm the database filegroup information again with the following code:

USE [Payment]
GO
SELECT file_name = mf.name,
       filegroup_name = fg.name,
       mf.physical_name,
       mf.size,
       mf.growth
FROM sys.master_files AS mf
    INNER JOIN sys.filegroups AS fg
    ON mf.data_space_id = fg.data_space_id
WHERE mf.database_id = db_id('Payment')
ORDER BY mf.type;

The results appear as shown:

Create the yearly tables

Once the database and the corresponding filegroups have been created, we create the yearly tables and insert some test data, as follows:

USE [Payment]
GO
CREATE TABLE [dbo].[Payment_2008](
    [Payment_ID] [bigint] IDENTITY(12008,100) NOT NULL,
    [OrderID] [bigint] NOT NULL,
    CONSTRAINT [PK_Payment_2008] PRIMARY KEY CLUSTERED
    (
        [Payment_ID] ASC
    ) ON [FGPayment2008]
) ON [FGPayment2008]
GO
CREATE NONCLUSTERED INDEX IX_OrderID
ON [dbo].[Payment_2008] ([OrderID])
ON [FGPayment2008];

CREATE TABLE [dbo].[Payment_2009](
    [Payment_ID] [bigint] IDENTITY(12009,100) NOT NULL,
    [OrderID] [bigint] NOT NULL,
    CONSTRAINT [PK_Payment_2009] PRIMARY KEY CLUSTERED
    (
        [Payment_ID] ASC
    ) ON [FGPayment2009]
) ON [FGPayment2009]
GO
CREATE NONCLUSTERED INDEX IX_OrderID
ON [dbo].[Payment_2009] ([OrderID])
ON [FGPayment2009];

-- Table creation for 2010-2017 is omitted here; please follow the pattern above.

CREATE TABLE [dbo].[Payment_2018](
    [Payment_ID] [bigint] IDENTITY(12018,100) NOT NULL,
    [OrderID] [bigint] NOT NULL,
    CONSTRAINT [PK_Payment_2018] PRIMARY KEY CLUSTERED
    (
        [Payment_ID] ASC
    ) ON [FGPayment2018]
) ON [FGPayment2018]
GO
CREATE NONCLUSTERED INDEX IX_OrderID
ON [dbo].[Payment_2018] ([OrderID])
ON [FGPayment2018];

Two points deserve a special reminder here:

    • Due to space limits, the table creation code for 2010-2017 is omitted; please add it yourself following the pattern above
    • The IDENTITY seed of the Payment_ID column differs for each yearly table, so that Payment_ID values do not collide when payment information from all years is queried together

Next, we examine the filegroup distribution of all the yearly tables as follows:

USE [Payment]
GO
SELECT table_name = tb.[name],
       index_name = ix.[name],
       located_filegroup_name = fg.[name]
FROM sys.indexes AS ix
    INNER JOIN sys.filegroups AS fg
    ON ix.data_space_id = fg.data_space_id
    INNER JOIN sys.tables AS tb
    ON ix.[object_id] = tb.[object_id]
WHERE ix.data_space_id = fg.data_space_id
GO

The query results (truncated below) show that all yearly tables and indexes are distributed across the corresponding filegroups as expected.

Finally, for testing purposes, we insert some data into the corresponding yearly tables:

USE [Payment]
GO
SET NOCOUNT ON
INSERT INTO [Payment_2008] SELECT 2008;
INSERT INTO [Payment_2009] SELECT 2009;
-- Inserts for 2010-2017 are omitted; please add them yourself
INSERT INTO [Payment_2018] SELECT 2018;

Filegroup settings

After the yearly tables have been created and the test data initialized, we set the read/write property of each filegroup. The code is as follows:

USE master
GO
ALTER DATABASE [Payment] MODIFY FILEGROUP [FGPayment2008] READ_ONLY;
ALTER DATABASE [Payment] MODIFY FILEGROUP [FGPayment2009] READ_ONLY;
-- The READ_ONLY settings for the 2010-2017 filegroups are omitted; please add them yourself
ALTER DATABASE [Payment] MODIFY FILEGROUP [FGPayment2018] READ_WRITE;

Finally, we check the filegroup read/write properties with the following query:

USE [Payment]
GO
SELECT name, is_default, is_read_only FROM sys.filegroups
GO

As follows:

Hot and cold backup implementation

With all filegroups created and their read/write properties configured, we need to take full and differential backups of the read-write filegroups and transaction log backups of the database. To facilitate testing, we insert one record between each pair of backup operations. The general idea of the backup operation is:

    • First, take a one-time full backup of the entire database
    • Second, take periodic full backups of the read-write filegroups
    • Next, take periodic differential backups of the read-write filegroups
    • Finally, take periodic transaction log backups of the entire database

-- Take a one-time full backup of the Payment database
USE [master];
GO
BACKUP DATABASE [Payment]
TO DISK = N'C:\DATA\Payment\BACKUP\Payment_20180316_full.bak'
WITH COMPRESSION, STATS = 5;
GO

-- For testing, insert one record
USE [Payment];
GO
INSERT INTO [dbo].[Payment_2018] SELECT 201801;
GO

-- Take a full backup of each read-write filegroup (FGPayment2018 as an example)
BACKUP DATABASE [Payment]
FILEGROUP = 'FGPayment2018'
TO DISK = 'C:\DATA\Payment\BACKUP\Payment_FGPayment2018_20180316_full.bak'
WITH COMPRESSION, STATS = 5;
GO

-- For testing, insert one record
INSERT INTO [dbo].[Payment_2018] SELECT 201802;
GO

-- Take a differential backup of each read-write filegroup (FGPayment2018 as an example)
BACKUP DATABASE [Payment]
FILEGROUP = N'FGPayment2018'
TO DISK = N'C:\DATA\Payment\BACKUP\Payment_FGPayment2018_20180316_diff.bak'
WITH DIFFERENTIAL, COMPRESSION, STATS = 5;
GO

-- For testing, insert one record
INSERT INTO [dbo].[Payment_2018] SELECT 201803;
GO

-- Take a transaction log backup of the Payment database
BACKUP LOG [Payment]
TO DISK = 'C:\DATA\Payment\BACKUP\Payment_20180316_log.trn';
GO

The advantage of this backup scheme is that we only need full and differential backups of the read-write filegroup (FGPayment2018). System objects live in the primary filegroup and change little, but in a real scenario the primary filegroup must also be backed up. The remaining read-only filegroups need no recurring backup, because their data no longer changes. In this way, we have implemented a hot/cold data isolation backup scheme. The next question is: if a disaster strikes the Payment database and causes data loss, how do we restore the database from the backup sets? We restore them as follows:
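As an aside beyond the original scheme: SQL Server also provides a partial backup shorthand, READ_WRITE_FILEGROUPS, which backs up the primary filegroup plus every read-write filegroup in a single statement, so you do not have to list them individually. A sketch (the backup file path is illustrative):

```sql
-- Partial backup: primary filegroup + all read-write filegroups in one statement.
-- Read-only filegroups are skipped automatically.
BACKUP DATABASE [Payment]
READ_WRITE_FILEGROUPS
TO DISK = N'C:\DATA\Payment\BACKUP\Payment_20180316_partial.bak'
WITH COMPRESSION, STATS = 5;
GO
```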

    • First, restore the one-time full backup of the entire database
    • Second, restore the last full backup of the read-write filegroup
    • Next, restore the last differential backup of the read-write filegroup
    • Finally, restore all transaction log backups of the entire database

-- Restore the full database backup
USE master
GO
RESTORE DATABASE [Payment_Dev]
FROM DISK = N'C:\DATA\Payment\BACKUP\Payment_20180316_full.bak'
WITH
    MOVE 'Payment' TO 'C:\DATA\Payment_Dev\Data\Payment_dev.mdf',
    MOVE 'FGPayment2008' TO 'C:\DATA\Payment_Dev\Data\FGPayment2008_dev.ndf',
    MOVE 'FGPayment2009' TO 'C:\DATA\Payment_Dev\Data\FGPayment2009_dev.ndf',
    MOVE 'FGPayment2010' TO 'C:\DATA\Payment_Dev\Data\FGPayment2010_dev.ndf',
    MOVE 'FGPayment2011' TO 'C:\DATA\Payment_Dev\Data\FGPayment2011_dev.ndf',
    MOVE 'FGPayment2012' TO 'C:\DATA\Payment_Dev\Data\FGPayment2012_dev.ndf',
    MOVE 'FGPayment2013' TO 'C:\DATA\Payment_Dev\Data\FGPayment2013_dev.ndf',
    MOVE 'FGPayment2014' TO 'C:\DATA\Payment_Dev\Data\FGPayment2014_dev.ndf',
    MOVE 'FGPayment2015' TO 'C:\DATA\Payment_Dev\Data\FGPayment2015_dev.ndf',
    MOVE 'FGPayment2016' TO 'C:\DATA\Payment_Dev\Data\FGPayment2016_dev.ndf',
    MOVE 'FGPayment2017' TO 'C:\DATA\Payment_Dev\Data\FGPayment2017_dev.ndf',
    MOVE 'FGPayment2018' TO 'C:\DATA\Payment_Dev\Data\FGPayment2018_dev.ndf',
    MOVE 'Payment_log' TO 'C:\DATA\Payment_Dev\Log\Payment_dev_log.ldf',
    NORECOVERY, STATS = 5;
GO

-- Restore the read-write filegroup full backup
RESTORE DATABASE [Payment_Dev]
FILEGROUP = N'FGPayment2018'
FROM DISK = N'C:\DATA\Payment\BACKUP\Payment_FGPayment2018_20180316_full.bak'
WITH NORECOVERY, STATS = 5;
GO

-- Restore the read-write filegroup differential backup
RESTORE DATABASE [Payment_Dev]
FILEGROUP = N'FGPayment2018'
FROM DISK = N'C:\DATA\Payment\BACKUP\Payment_FGPayment2018_20180316_diff.bak'
WITH NORECOVERY, STATS = 5;
GO

-- Restore the Payment database transaction log backup
RESTORE LOG [Payment_Dev]
FROM DISK = N'C:\DATA\Payment\BACKUP\Payment_20180316_log.trn'
WITH NORECOVERY;
GO

-- Bring the database online to check the data
RESTORE DATABASE [Payment_Dev] WITH RECOVERY;
GO
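It is also worth noting (an addition beyond the original article, available in Enterprise Edition) that SQL Server supports online piecemeal restore: the primary and read-write filegroups can be brought online first, so hot data is available while the read-only filegroups are still being restored. A sketch, assuming the same backup files as above (MOVE clauses omitted for brevity):

```sql
-- Piecemeal restore sketch: bring PRIMARY and the hot filegroup online first.
USE master
GO
RESTORE DATABASE [Payment_Dev]
    FILEGROUP = 'PRIMARY', FILEGROUP = 'FGPayment2018'
FROM DISK = N'C:\DATA\Payment\BACKUP\Payment_20180316_full.bak'
WITH PARTIAL, NORECOVERY;
-- Next: restore the filegroup full/differential backups and transaction logs,
-- then recover. The read-only filegroups can be restored afterwards while the
-- database is already online serving hot data.
```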

Finally, check the result of the restore; according to the test data we inserted, there should be four records in the Payment_2018 table.

USE [Payment_Dev]
GO
SELECT * FROM [dbo].[Payment_2018] WITH(NOLOCK)

The execution results show four records, as we expected:

Final summary

This installment has shown how SQL Server filegroup technology can be used to implement a hot/cold data isolation backup scheme. It greatly improves the efficiency of database backup and restore, while also balancing I/O load across files and improving the performance of the database as a whole.
