Data is explosive growth in the Internet era. We often need to store structured data and unstructured data (such as documents, presentations, videos, audios, and images) together. There are usually several solutions:
1. Store structured data in the database, store unstructured data in the file system, and then there is a field in the database to record the path of the file system, although this method is cost-effective, however, it introduces extra complexity because you need to manually ensure the integrity of cross-link and non-link system management transactions.
2. Structured Data and unstructured data are stored in the database. For many years, the database has supported the storage of Non-relational data, such as binary large objects or BLOB. SQL Server is called the varbinary data type, the advantage is that the database features (such as transaction support, backup and recovery support, integrated security support, and full-text search support) can be fully utilized, but the cost is higher, more disk space is required, because it is stored in the same row of data, and the storage and retrieval time is longer, it also has a negative impact on the overall performance of the application. (If you use select * from xxx to write a query, it will be slow)
3. The FILESTREAM data type introduced in SQL Server 2008 stores unstructured data, such as documents, presentations, videos, audios, images, and databases. It stores a pointer on the file system. In SQL Server 2008, the new FILESTREAM feature is implemented on the existing varbinary (max) data type, you can store real data on the server's file system, but you can manage and access the data in the database context. The transaction integrity is automatically guaranteed.
4. The FileTable of SQL Server 2012 is further enhanced. It allows applications to integrate their storage and data management components by introducing FileTable to allow non-transactional access, provides integrated full-text search and semantic search for unstructured data and metadata.
The following describes the two new features in detail.
Learn about FileStream in SQL Server 2008
The FILESTREAM data type is implemented as the varbinary (max) column. The data is stored in the NTFS file system, and the database stores pointers. In this case, storage is no longer limited by the size of BLOB 2 GB, but restricted by the file size of the NTFS file system. FileStream is disabled by default, so you need to specify the FILESTREAM attribute for the varbinary (max) column. In this way, SQL Server will not store BLOB to the SQL Server database, but to the NTFS file system.
Storing BLOB data in the NTFS file system brings some benefits:
The same performance as directly operating the NTFS file system's data streams
FILESTREAM data does not use the SQL Server Buffer Pool. Therefore, query and processing of the SQL Server Buffer Pool is not affected by FILESTREAM data.
There is no limit on the size of BLOB 2 GB.
Transaction consistency.
Security Model integrated by SQLServer.
During backup and recovery, filestream blob data is included.
Full-text search is supported.
Using SELECT, INSERT, UPDATE, and DELETE statements to operate tables with FILESTREAM data, the performance is not slow.
To use this new feature, you must first enable it at the instance level (during installation, or by modifying the properties of the SQL Server instance, or by using sp_configure ), create or modify a database. There is a file group with the FileStream attribute. Then, create a strap with the varbinary (max) data type column and specify the FileStream attribute.
SQL Server 2012 FileTable
FileTable is further enhanced based on the use of FILESTREAM. It allows direct, transactional, and access to large data in FILESTREAM columns. The FileTable table can also be configured to allow non-transactional access to files without prior SQLServer authorization.
FileTable is a special type of table. Its structure is fixed. Unlike normal user tables that can define their own fields, it is like a file system in a folder. A FileTable contains FileStream data with several file-level attributes (file_id, name, path, creation date, modification date, and last access time), file and directory hierarchies. This means that each row of data in the FileTable represents a file or directory on a file system.
FileTable represents an indows shared directory in the file system, which contains files and directory data. You can access it through non-transactional files (Windows API-based applications can access files, without SQL server access permissions ). For Windows applications, this looks like a file and directory in a normal network shared location. Applications can use Windows APIs to manage files and directories in the shared location of the network.
Windows API operations are non-transactional and irrelevant to databases. However, FileTable is implemented based on FileStream, so SQL transactions are supported. FileTable can also be queried and updated using the normal Transact-SQL command. They also integrate SQL Server management tools and functions, such as backup and recovery.
We need to configure FileTable and FILESTREAM separately. This means that we can continue to use only the FileStream function without enabling non-transactional access or creating FileTable.
Start using FileTable of SQL Server 2012
Enable FileStream
Copy codeThe Code is as follows:
USE master GO
EXEC sp_configure 'filestream access level', 2
Go
RECONFIGURE GO
-- You can use this statement to see current
-- Config value and running value
EXEC sp_configure filestream_access_level;
GO
Press Ctrl + C to copy the code to create the LearnFileTable database.
Copy codeThe Code is as follows:
USE master
GO
If exists (SELECT name FROM sys. databases WHERE name = n' learnfiletable ')
Drop database LearnFileTable
GO
Create database LearnFileTable
-- Details of primary file group
ON PRIMARY
(NAME = LearnFileTable_Primary,
FILENAME = n'd: \ FileTable \ LearnFileTable_Data.mdf ',
SIZE = 10 MB,
MAXSIZE = 50 MB,
FILEGROWTH = 5 MB ),
-- Details of additional filegroup to be used to store data
FILEGROUP rule roup
(NAME = LearnFileTable_Data,
FILENAME = n'd: \ FileTable \ LearnFileTable_Data.ndf ',
SIZE = 10 MB,
MAXSIZE = 50 MB,
FILEGROWTH = 5 MB ),
-- Details of special filegroup to be used to store FILESTREAM data
FILEGROUP fs0000roup CONTAINS FILESTREAM
(NAME = FileStream,
-- FILENAME refers to the path and not to the actual file name. It
-- Creates a folder which contains a filestream. hdr file and
-- Also a folder $ FSLOG folder as depicted in image below
FILENAME = n'd: \ FileTable \ FSData ')
-- Details of log file
LOG ON
(Name = LearnFileTable_Log,
FILENAME = 'd: \ FileTable \ LearnFileTable_Log.ldf ',
SIZE = 5 MB,
MAXSIZE = 25 MB,
FILEGROWTH = 5 MB
)
With filestream (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = n' learnfiletable ')
-- Other option for NON_TRANSACTED_ACCESS is READ_ONLY or OFF
GO
Press Ctrl + C to copy the code to check FileStream/FileTable
Copy codeThe Code is as follows:
-- Check the Filestream/FileTable Options
SELECT DB_NAME (database_id), non_transacted_access, non_transacted_access_desc, directory_name FROM sys. database_filestream_options
WHERE DB_NAME (database_id) = 'learnfiletable'
Create a FileTable
Copy codeThe Code is as follows:
USE LearnFileTable
GO
Create table MyFirstFileTable AS FileTable
WITH
(
FileTable_Directory = 'myfirstfiletable ',
FileTable_Collate_Filename = database_default
);
GO
Press Ctrl + C to copy the Code. If FileStream is not enabled before FileTable is created, an error is returned.
Msg 1969, Level 16, State 1, Line 1 Default FILESTREAM filegroup is not available in database '<database_name>' after it is created, we can query it and there are no records
USE LearnFileTable SELECT * FROM [dbo]. [MyFirstFileTable]
Select FileTable in Enterprise Manager, right-click "Explorer FileTable Directory" link as shown below:
You will see the network shared directory. manually add several files to this directory. We will return to the SQLServer Enterprise Manager to see what happened:
Run the query statement again.
USE LearnFileTable SELECT * FROM [dbo]. [MyFirstFileTable]
Summary
In this article, I talked about using SQL Server's FileStream and FileTable functions to store unstructured data. The FILESTREAM function stores unstructured data in the file system and stores file pointers in the database. The FileTable further extends this function to allow non-transactional access (access to files, without prior authorization, share location ). In other words, with this function, we can manage unstructured data through the file system, instead of managing it on SQL Server, but still can access these files through transactions on SQL Server.