In my previous post, I described how to install and configure the RBS FILESTREAM provider in a SharePoint 2010 environment so that files from SharePoint are stored in the disk file system. Once that is set up, when a user uploads a file to SharePoint, the binary content of the file is written to the specified disk folder through the RBS FILESTREAM provider. RBS can greatly improve SharePoint's capacity for storing files, and it keeps the SharePoint content databases from ballooning as the number of files grows.
However, when a user deletes a file from a SharePoint site (and the file has been completely removed from the SharePoint Recycle Bin), the RBS FILESTREAM provider does not "really" remove the corresponding file from the disk file system. To improve performance, the provider only records that the file was "deleted". Actually removing the physical file from the disk file system takes some extra steps: the garbage collection work.
The RBS FILESTREAM provider ships with a command-line maintenance tool that performs this "garbage collection." It can also run consistency checks, data maintenance, and other tasks. Today, though, we are going to focus on how to use the tool for garbage collection, so that those junk physical files are completely removed from the disk file system.
When the RBS FILESTREAM provider is installed on the machine, there is a "maintainer" folder in the default installation directory (Program Files\Microsoft SQL Remote Blob Storage 10.50). It contains two files: Microsoft.Data.SqlRemoteBlobs.Maintainer.exe and Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config. The former is the command-line maintenance tool; the latter is its configuration file.
Before using the maintenance tool, you need to open that configuration file and specify the connection string for the RBS-enabled SharePoint content database. Each content database needs its own connection string. The first time you open the configuration file, you will find that the connection string is stored encrypted by default. Personally, I think this is not really necessary: the maintenance tool has to run directly on the SharePoint server, which already requires server administrator permissions, so encrypting a connection string in a configuration file on that server seems overly cautious. Of course, if you connect to the database with mixed-mode (SQL Server) authentication, encrypting the connection string is reasonable. The way to encrypt it is with the aspnet_regiis.exe command-line tool. In this article, however, I will just show how to save the connection string in clear text.
In the configuration file on my machine, only one connection string is defined. Its name is "wss_content_connstr", and the connection string is "Data Source=sp2010;Initial Catalog=wss_content;Integrated Security=True". Adjust these values to match your own environment. If you have multiple SharePoint content databases with RBS enabled, add one connection string for each of them and give each connection string a different name.
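For readers who cannot see the screenshot, a cleartext configuration file might look roughly like the sketch below. It assumes the standard .NET connectionStrings layout; the file shipped with the tool may contain other sections, which should be left in place. The name and connection string are the values quoted above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <connectionStrings>
    <!-- One <add> entry per RBS-enabled content database, each with a unique name. -->
    <add name="wss_content_connstr"
         connectionString="Data Source=sp2010;Initial Catalog=wss_content;Integrated Security=True" />
  </connectionStrings>
</configuration>
```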
The maintenance tool can then be run from the command line. Enter the following command and run it:
Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName wss_content_connstr -Operation GarbageCollection -GarbageCollectionPhases rdo
The connection string name in the command (wss_content_connstr here) must be identical to the name specified in the configuration file. When the maintenance tool runs, it prints some text output, including how many garbage files it collected.
If you run the above command in your own test environment, you may find that it does not delete a single garbage file, and if you run it again, it may even tell you that it "refuses" to run because too little time has passed since the last run. This is because, once RBS is enabled, each SharePoint content database has three time interval parameters associated with the maintenance tool: "delete_scan_period", "orphan_scan_period", and "garbage_collection_time_window". They specify settings such as the minimum allowed interval between scans, the interval for purging garbage files, and so on. Together, these three parameters control how the maintenance tool scans for and purges garbage files.
In general, there is no need to modify these three parameters. In a test environment, however, you can change them to see the effect of garbage collection. Open SQL Server Management Studio, select a SharePoint content database, and then execute:
exec mssqlrbs.rbs_sp_set_config_value 'delete_scan_period', 'time 00:00:00'
exec mssqlrbs.rbs_sp_set_config_value 'orphan_scan_period', 'time 00:00:00'
exec mssqlrbs.rbs_sp_set_config_value 'garbage_collection_time_window', 'time 00:00:00'
These three SQL statements set the three interval parameters to 0. If you run the maintenance tool again and it finds garbage files, it reports them in the output it prints to the screen.
So far, everything described above is garbage collection at the level of the RBS FILESTREAM provider. But because the provider is built on the FILESTREAM feature of SQL Server 2008, and FILESTREAM has its own way of managing junk files, there is a second level to consider. RBS removes junk files through its own garbage collection, but that does not affect the FILESTREAM level. Even after RBS has completed a garbage collection and deleted a file, the file may still remain in the disk file system until the FILESTREAM component performs a garbage collection of its own.
The easiest way to force FILESTREAM to run its garbage collection is to execute the following SQL statement on the database:
CHECKPOINT
Finally, because the RBS FILESTREAM provider maintenance tool is a command-line tool, you can use a Windows scheduled task to run it regularly. The FILESTREAM SQL statement can be executed periodically through a SQL Server Agent job.
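As one way to package both steps for a scheduled task, here is a minimal Python sketch. It is not part of the RBS tooling. It assumes the tool is installed on drive C: under the default directory mentioned earlier, that sqlcmd is on the PATH, and that Windows authentication is used. The function names are made up for illustration.

```python
import subprocess

# Assumed install location; adjust to wherever the RBS client is installed.
MAINTAINER_EXE = (
    r"C:\Program Files\Microsoft SQL Remote Blob Storage 10.50"
    r"\Maintainer\Microsoft.Data.SqlRemoteBlobs.Maintainer.exe"
)


def build_maintainer_command(connection_string_name, phases="rdo"):
    """Return the argument list for one RBS garbage collection pass."""
    return [
        MAINTAINER_EXE,
        "-ConnectionStringName", connection_string_name,
        "-Operation", "GarbageCollection",
        "-GarbageCollectionPhases", phases,
    ]


def build_checkpoint_command(server, database):
    """Return a sqlcmd argument list that issues CHECKPOINT (-E = Windows auth)."""
    return ["sqlcmd", "-S", server, "-d", database, "-E", "-Q", "CHECKPOINT"]


def run_garbage_collection(connection_string_name, server, database):
    """Run the RBS maintainer first, then trigger a checkpoint for FILESTREAM."""
    subprocess.run(build_maintainer_command(connection_string_name), check=True)
    subprocess.run(build_checkpoint_command(server, database), check=True)
```

A scheduled task could then run a one-line script that calls run_garbage_collection("wss_content_connstr", "sp2010", "wss_content"), with the values replaced by those of your own environment. With check=True, a failure in the maintenance tool stops the script before the checkpoint is issued.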