A Programmer's Perspective on NTFS 2000: Streams and Hard Links

Dino Esposito
March 2000

Abstract: This article discusses NTFS 2000, the new file system in Microsoft Windows 2000. (19 printed pages)

Contents

1. Introduction
2. NTFS 2000 Overview
3. Multiple File Streams
4. Basic Principles of Streams
5. Stream Backup and Enumeration
6. Hard Links
7. Enjoy NTFS Features
8. Summary

 

Introduction

Since 1994, the myth of a fully object-oriented version of Microsoft® Windows NT® has been circulating. Cairo, the code name of this legendary version of the operating system, never materialized outside the Redmond labs. Cairo itself is gone, but some of its basic ideas have surfaced publicly from time to time.

The basic idea behind Cairo was that files and folders should be objects and collections of objects. Folder content would not have to be tied to the underlying file system storage mechanism. You could access and copy objects as separate, independent items. File and folder objects would expose a programmable API through methods and properties, which could be either standard or defined by the owner or author.

What we have today is a file system that registers files and folders in some internal structures. When files and folders are moved around the disk, they are simply copied. There is a fixed set of functions for files and folders, and they are too few to meet the needs of modern applications. As a workaround, several techniques have been provided over the past few years to attach additional information to files and folders. Shell and namespace extensions, desktop.ini files, FileSystemObject, and the Shell Automation Object Model are examples. However, all of these are just small, partial solutions. They cannot be the basis for an organic redesign of the Windows file system. Because backward compatibility is a serious issue, Windows still uses a legacy file system built on the file allocation table (FAT), whose origins can be traced back to Microsoft MS-DOS® version 2.0! Even with recent improvements, such as support for high-capacity hard disks, FAT remains an inadequate way to store file and folder information.

In years of practical experience, the most significant limitation I have run into is that the programmer alone must manage and keep track of any additional information a file needs. Recently, someone asked me to retrieve the actual creation date of a Word document. You may think this is a simple task, because the creation date is an attribute that can easily be retrieved through a few API functions. This is only partially correct. Try copying the same Word file to a different machine, or even within the same folder, and then compare the creation dates of the two copies. Strangely, they differ! When you copy a file, you create a new file whose timestamp records when the copy was made. As you keep working with copies, you lose the valuable information about when the file was originally created.

Fortunately, Word documents retain this information in a SummaryInformation field. In my case, therefore, I was able to solve the problem and satisfy the customer. Had it been an Access or text file, my efforts would have been wasted.

With Windows NT, Microsoft introduced a new file system called NTFS. Its notable features include a B-tree structure that speeds up file retrieval in large folders, file-based security, logging, enhanced file system recoverability, and better use of disk space than FAT or FAT32. (By the way, Windows 2000 provides full support for, and access to, FAT32 volumes.)

Since they were introduced with Windows NT 3.1, NTFS volumes have had another feature that is often overlooked: they support multiple data streams in a single file. With Windows 2000, stream support is enhanced again, and some other handy features are added to help you work with files seamlessly. Let's take a look at the main features of NTFS 2000, the version of NTFS that ships with Windows 2000.

NTFS 2000 Overview

Although multiple data streams are not exclusive to NTFS 2000, the other features of NTFS 2000 volumes do require Windows 2000. They are:

    • File and directory encryption
    • Per-user, per-volume disk quotas
    • Reparse points and Hierarchical Storage Management
    • Mount points
    • Hard links
    • Change journal

During Windows 2000 installation, you are asked whether to convert the volume to NTFS 2000. However, the NTFS 2000 file system is required only when the machine acts as a domain controller. You can convert a FAT partition to NTFS at any time by using the command-line utility convert.exe:

 
CONVERT volume /FS:NTFS [/V]

The volume parameter specifies the drive letter followed by a colon. It can also be a mount point or a volume name. The /FS:NTFS option indicates that the volume must be converted to NTFS. Finally, use /V if you want to run the utility in verbose mode. When you run convert.exe, it sets up the conversion and asks you to restart. The conversion is carried out when the system restarts.

In addition to all the features listed above, a significant aspect of folder management in Windows 2000 is the comprehensive and slightly extended support it provides for desktop.ini files. In the rest of this article, I will focus mainly on streams and hard links. However, Table 1 provides an overview of the other key features of NTFS 2000.

Table 1. Main Functions of NTFS 2000

Multiple File Streams

In an NTFS file system, each file can have multiple data streams. It is worth noting that streams are not an NTFS 2000 feature; they have existed since Windows NT 3.1. When you read the content of a file on a non-NTFS volume (for example, a disk partition on Windows 98), you can access only one data stream. As a result, you perceive it as the true and "unique" content of the file. This main stream is unnamed, and it is the only one that non-NTFS file systems can handle. However, when you create a file on an NTFS volume, the situation is different. See Figure 1.

Figure 1. Structure of a multi-stream file

A multi-stream file is a set of single-stream files embedded in the same file system item. From the outside, it looks like a single, atomic unit. However, it contains a series of independent subunits that you can create, delete, and modify separately. There are quite a few common programming scenarios in which streams come in handy. However, if you plan to use them, remember that once you copy a multi-stream file to a non-NTFS storage device (such as a CD, floppy disk, or non-NTFS disk partition), all the additional streams are lost and cannot be recovered. Unfortunately, this compatibility problem makes streams less attractive in practice. Streams are an excellent tool for server-side applications that are designed to run only on NTFS volumes, where they can be used to build outstanding and creative solutions.

Basic Principles of Streams

When you copy a multi-stream file to a non-NTFS volume, only the main stream is copied. This means that the extra data is lost: it will not reappear even if you copy the file back to an NTFS disk. Now, assuming you work on an NTFS machine, let's see how to create a named stream. Sample Code 1 is a Windows Script Host (WSH) file written in Microsoft Visual Basic® Scripting Edition (VBScript) that demonstrates how to read and write the streams of an NTFS file.

To identify a named stream in a file, follow a special naming convention: append a colon to the file name, followed by the stream name. For example, to access the VersionInfo stream of the Test.txt file, use the following file name:

Test.txt:VersionInfo

You can use this file name with any Microsoft Win32® API function that manipulates files. To access the content of the VersionInfo stream, pass this name to CreateFile(), and then use ReadFile() and WriteFile() to read and write as usual. To check whether a specific stream exists in a file, compose the file stream name as shown above and try to open it with CreateFile():

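/* CreateFile returns INVALID_HANDLE_VALUE, not NULL, when the stream cannot be opened. */
HANDLE hFile = CreateFile(szFileStreamName, GENERIC_READ, 0, NULL, OPEN_EXISTING, 0, NULL);
if (hFile == INVALID_HANDLE_VALUE)
    MessageBox(hwnd, "Error", NULL, MB_OK);
else
    CloseHandle(hFile);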

You do not have to be a skilled C++ programmer to work with streams. You can also use streams from Visual Basic or even script code, as shown in Sample Code 1. The key factor that makes this transparency possible is that all the low-level Win32 API functions, especially CreateFile(), support stream-based file names on NTFS partitions. If you try to open a Test.txt:VersionInfo file on a non-NTFS partition, you get a "file not found" error. Note that what matters is the file system of the volume that contains the file, not the Windows platform or the disk partition type on which the calling application runs. In other words, you can successfully access named streams in a shared folder on an NTFS partition from a connected Windows 98 machine. In addition, the colon is not a valid character even in long file names. Therefore, when CreateFile() encounters a colon in a file name, it knows that the colon has a special meaning.
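
To make the naming convention concrete, here is a minimal, portable C sketch that splits a stream-qualified name into its file part and its stream part. The helper split_stream_name is a hypothetical name introduced for this illustration and is not part of the Win32 API. The sketch assumes that a leading drive specifier such as "C:" must be skipped, and it does not interpret anything that follows the stream name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Win32 API.
 * Splits "file:stream" in place and returns a pointer to the stream
 * name, or NULL when the path names no stream. */
char *split_stream_name(char *path)
{
    assert(path != NULL);

    /* Skip a leading drive specifier such as "C:" so that its colon
     * is not mistaken for a stream separator. */
    char *start = path;
    if (isalpha((unsigned char)path[0]) && path[1] == ':')
        start = path + 2;

    /* Only the last path component can carry a stream name. */
    char *base = start;
    for (char *p = start; *p != '\0'; ++p)
        if (*p == '\\' || *p == '/')
            base = p + 1;

    char *colon = strchr(base, ':');
    if (colon == NULL || colon[1] == '\0')
        return NULL;            /* no stream name present */

    *colon = '\0';              /* path now holds only the file name */
    return colon + 1;
}
```

Given a writable copy of "C:\docs\Test.txt:VersionInfo", the helper leaves "C:\docs\Test.txt" in the buffer and returns a pointer to "VersionInfo". When opening the stream, the complete name, colon included, is what you would pass to CreateFile().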

As Sample Code 1 shows, you can also use streams from VBScript, because the FileSystemObject object model ultimately relies on CreateFile() to open, write, create, and test files. In the sample code, I create a text file that contains no data: a zero-length main stream and any number of named streams. Run the demo program and create two streams, named, say, VersionInfo and VersionInfoEx. Nothing in the Windows shell lets you infer that a particular file contains multiple streams. Figure 2 shows how the Test.txt file appears in Windows Explorer.

Figure 2. A file can be zero-length and yet contain named streams.

The Size column shows only the size of the unnamed main stream, and even the Properties dialog box gives you no further information about streams. Only on NTFS volumes does the Windows 2000 Properties dialog box give you the opportunity to record summary information for any file, including text files. Click the Summary tab and enter, for example, an author name, as shown in Figure 3.

By the way, thanks to improvements in the Windows 2000 shell user interface, such a name can be displayed in an Author column. For more information, see MSDN Magazine at http://msdn.microsoft.com/msdnmag/.

Figure 3. Additional information associated with a .txt file on an NTFS volume

But wait a moment. Summary information is the typical data you set for a Word or Excel document, where it is undoubtedly part of the document itself. Can it be attached to a text file without altering the plain-text content? Of course. The shell does it through streams! After applying those changes, immediately try to copy the file to a non-NTFS partition. The dialog box shown in Figure 4 appears.

Figure 4. Windows 2000 warning about possible stream data loss.

It turns out that the Test.txt file contains a stream with document summary information. When you try to copy a file with additional information to a volume that does not support it, the system notices. On non-NTFS partitions, only the unnamed main stream is copied, and the rest is discarded. Therefore, stream-based files can hardly be exchanged unless the destination is also an NTFS volume.

Stream backup and enumeration

Is there a way, say one or two API functions, to enumerate all the streams that a given file contains? Yes, there is, but it is not that simple or intuitive. The Win32 backup API functions (such as BackupRead and BackupWrite) can be used to enumerate the streams in a file. However, they are a little quirky to use and look more like a workaround than a clean, final solution.

The idea is that when you back up a file or an entire folder, you need to package and store all the information it might contain. Therefore, when you need to enumerate the streams in a file, BackupRead() is your best friend. Let's look at the prototype of this function:

 
BOOL BackupRead(
    HANDLE hFile,
    LPBYTE lpBuffer,
    DWORD nNumberOfBytesToRead,
    LPDWORD lpNumberOfBytesRead,
    BOOL bAbort,
    BOOL bProcessSecurity,
    LPVOID *lpContext);

For our purposes, you can ignore aspects such as context and security. The hFile parameter must be a handle obtained by calling CreateFile(), and lpBuffer points to a WIN32_STREAM_ID data structure:

 
typedef struct _WIN32_STREAM_ID {
    DWORD dwStreamId;
    DWORD dwStreamAttributes;
    LARGE_INTEGER Size;
    DWORD dwStreamNameSize;
    WCHAR cStreamName[ANYSIZE_ARRAY];
} WIN32_STREAM_ID, *LPWIN32_STREAM_ID;

The first 20 bytes of this structure are the header of each stream. The stream name immediately follows the dwStreamNameSize field, and the stream content follows the name. Because the traditional file content can itself be regarded as a stream, albeit an unnamed one, to enumerate all the streams you simply loop until BackupRead returns FALSE. In fact, BackupRead is meant to read all the information associated with a given file or folder:

 
WIN32_STREAM_ID sid;
ZeroMemory(&sid, sizeof(WIN32_STREAM_ID));
/* sid is zeroed, so this evaluates to the 20-byte header size. */
DWORD dwStreamHeaderSize = (LPBYTE)&sid.cStreamName - (LPBYTE)&sid + sid.dwStreamNameSize;
bContinue = BackupRead(hFile, (LPBYTE)&sid, dwStreamHeaderSize, &dwRead, FALSE, FALSE, &lpContext);

The snippet above is the key code for reading the stream header. If the operation succeeds, you can go on to read the actual name of the stream:

 
WCHAR wszStreamName[MAX_PATH];
BackupRead(hFile, (LPBYTE)wszStreamName, sid.dwStreamNameSize, &dwRead, FALSE, FALSE, &lpContext);

Before moving on to the next stream, call BackupSeek() to move the backup pointer forward:

 
BackupSeek(hFile, sid.Size.LowPart, sid.Size.HighPart, &dw1, &dw2, &lpContext);
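
The three steps above (read the 20-byte header, read the name, skip the data) can be tried out without calling the backup API at all. The following is a minimal, portable C sketch that walks an in-memory buffer assumed to be laid out the way the text describes: header, name, data, repeated for each stream. The stream_header type and the next_stream, read_u32le, and read_u64le helpers are hypothetical names introduced for this illustration; they are not part of Win32, and the sketch assumes little-endian fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STREAM_HEADER_SIZE 20u   /* bytes that precede cStreamName */

/* Portable mirror of the fixed part of WIN32_STREAM_ID (illustrative). */
typedef struct {
    uint32_t id;          /* dwStreamId */
    uint32_t attributes;  /* dwStreamAttributes */
    uint64_t size;        /* Size: length of the stream data, in bytes */
    uint32_t name_size;   /* dwStreamNameSize: length of the name, in bytes */
} stream_header;

static uint32_t read_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t read_u64le(const unsigned char *p)
{
    return (uint64_t)read_u32le(p) | ((uint64_t)read_u32le(p + 4) << 32);
}

/* Parses the stream that starts at *pos and advances *pos past its
 * header, name, and data. Returns 1 on success, or 0 when no complete
 * stream is left in the buffer. */
int next_stream(const unsigned char *buf, size_t len, size_t *pos,
                stream_header *hdr, const unsigned char **name)
{
    assert(buf != NULL && pos != NULL && hdr != NULL && name != NULL);

    if (*pos > len || len - *pos < STREAM_HEADER_SIZE)
        return 0;                           /* no complete header left */

    const unsigned char *p = buf + *pos;
    hdr->id         = read_u32le(p);
    hdr->attributes = read_u32le(p + 4);
    hdr->size       = read_u64le(p + 8);
    hdr->name_size  = read_u32le(p + 16);

    size_t remaining = len - *pos - STREAM_HEADER_SIZE;
    if (hdr->name_size > remaining ||
        hdr->size > remaining - hdr->name_size)
        return 0;                           /* name or data is truncated */

    *name = p + STREAM_HEADER_SIZE;         /* name bytes follow the header */
    *pos += STREAM_HEADER_SIZE + hdr->name_size + (size_t)hdr->size;
    return 1;
}
```

Calling next_stream in a loop until it returns 0 visits every stream in the buffer, which mirrors looping until BackupRead returns FALSE. In a real program the same bookkeeping would be applied to successive BackupRead and BackupSeek calls rather than to a prefilled buffer.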

In most cases, you can treat a stream as a regular file. For example, to delete a stream, you can use DeleteFile(). To read or update the stream content, you just use ReadFile() and WriteFile(). There is no official, supported way to move or rename named streams. In the last part of this article, I will use this code to create a Windows shell extension specific to NTFS 2000 that adds a new property page to all files with stream information. In the meantime, let's take a quick look at another feature of NTFS.

Hard Links

Do you know shortcuts? They are those small .lnk files, mostly scattered across the desktop, that reference other items. Shortcuts are undoubtedly a useful feature, but they have some drawbacks. First, if you point several shortcuts in different folders at the same target, you actually have multiple copies of the same (fortunately small) file. More importantly, the target of a shortcut can change over time. It may be moved, deleted, or simply renamed. What happens to your shortcuts? Can they detect and track those changes so that they are updated correctly and automatically? Unfortunately, they cannot. The main reason is that shortcuts are an application-level feature. From the system's point of view, they are just user-defined files that require some extra work when you open them.

You may even decide to give shortcut behavior to other file classes. If it makes sense, you can create a shortcut class with its own extension, other than .lnk. The key to this is the IsShortcut registry entry. Suppose you want .xyz files to act as shortcuts. Under HKEY_CLASSES_ROOT, create an .xyz node to register the file class and point it to another node, xyzfile. Then add an empty REG_SZ entry named IsShortcut to the following key:

HKEY_CLASSES_ROOT\xyzfile

That is all it takes.

Other operating systems, notably POSIX systems and OS/2, have a similar feature at the system level. On OS/2 in particular, such links are called shadows. A hard link is a system-level shortcut to a given file. By creating a hard link to an existing file, you neither copy the file nor create a file-based reference to it (that is, a shortcut). Instead, you add information to its directory entry at the NTFS level. The physical file remains intact in its original location. In short, you can access the same content through two or more names!

Hard links spare you from keeping multiple, unnecessary copies of the same file, and they leave the system in charge of mapping the different path names to a single physical content. This greatly simplifies your work and saves valuable disk space. In addition, as system-level shortcuts, hard links always point to the correct target file, whether you rename it or move it. Because the links are stored in the file system, all changes are applied automatically and transparently. Note that hard links must be created within the same NTFS volume. For example, you cannot make a hard link on drive C: point to a file on drive D:.

To put it in familiar terms, you can think of hard links as aliases for a file. You can use any alias to access the file. The file is actually deleted only after all its aliases have been deleted. (The aliases serve the same purpose as a reference count.) Because hard links are aliases for the same content, keeping them synchronized is not an issue.

CreateHardLink() is the API function used to create hard links. Its prototype is as follows:

 
BOOL CreateHardLink(
    LPCTSTR lpFileName,
    LPCTSTR lpExistingFileName,
    LPSECURITY_ATTRIBUTES lpSecurityAttributes);

In the code that accompanied an old MIND article (see "Windows 2000 for Web Developers," MIND, March 1999), I provided a COM object that lets you create hard links from script code. Sample Code 2 shows a VBScript program that uses it to create a hard link to a given file. Although it is easy to find out how many hard links a file has, there is no tool to enumerate them all. The API function GetFileInformationByHandle() fills a BY_HANDLE_FILE_INFORMATION structure, whose nNumberOfLinks field tells you the count. Enumerating the names of all the linked files is a little harder. Basically, you must scan the entire volume and keep track of the unique ID of each file. When you encounter an ID you have already seen, you have found a hard link to that file. The unique ID of a file is assigned by the system and stored in the nFileIndexHigh and nFileIndexLow fields of BY_HANDLE_FILE_INFORMATION.
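
The matching step of that scan can be sketched in portable C. The file_record type and the same_file and collect_links functions below are hypothetical names introduced for this illustration; a real scanner would fill each record from the BY_HANDLE_FILE_INFORMATION data of every file it visits. Comparing the volume serial number (the dwVolumeSerialNumber field of the same structure) as well is an assumption added here so that records gathered from different volumes are never confused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per file seen during the volume scan (illustrative type). */
typedef struct {
    uint32_t volume_serial;   /* dwVolumeSerialNumber */
    uint32_t index_high;      /* nFileIndexHigh */
    uint32_t index_low;       /* nFileIndexLow */
    const char *path;         /* name under which the file was found */
} file_record;

/* Two records name the same physical file when every ID part matches. */
int same_file(const file_record *a, const file_record *b)
{
    return a->volume_serial == b->volume_serial &&
           a->index_high == b->index_high &&
           a->index_low == b->index_low;
}

/* Stores in out[] up to out_cap paths of the records that refer to the
 * same file as target, and returns the total number of matches found.
 * Pass out == NULL to count only. */
size_t collect_links(const file_record *recs, size_t count,
                     const file_record *target,
                     const char **out, size_t out_cap)
{
    assert(recs != NULL && target != NULL);

    size_t found = 0;
    for (size_t i = 0; i < count; ++i) {
        if (!same_file(&recs[i], target))
            continue;
        if (out != NULL && found < out_cap)
            out[found] = recs[i].path;
        ++found;
    }
    return found;
}
```

For a complete scan, the value returned for a file should equal its nNumberOfLinks. This linear search is the simplest possible choice; sorting the records by ID or using a hash table would scale better on a large volume.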

Enjoy NTFS Features

Streams are especially valuable for adding extra information to a file without altering or breaking its original format, and without apparently taking up disk space. Of course, a stream occupies its own space, but Windows Explorer does not seem to be aware of it. Because streams are invisible to Windows Explorer, the disk space actually available may have dropped to a dangerous level even though there appears to be plenty left. You can add extra (invisible) information to any file, including text and executable files.

Hard links, on the other hand, are an outstanding resource for gathering and sharing information. You have a single repository of the real information that can be accessed through different paths. Mind you, hard links are not a completely new concept in Windows NT technology. They have been there since the first release of Windows NT. However, Microsoft did not provide a public function for creating hard links until Windows 2000. Every file has at least one link, to itself, so GetFileInformationByHandle always returns a link count greater than zero. You cannot create hard links to directories, only to files.

Streams and hard links share a common problem: limited support from the shell. To remedy this, I wrote a shell extension that provides information about the streams and hard links of a given file. Figure 5 shows its look and feel.

Figure 5. The Streams tab displays information about streams and hard links.

The source code of the shell extension uses the BackupRead() API function to enumerate streams. Deleting the content of the selected stream takes only a call to DeleteFile(). The Edit Stream button runs the script code in Sample Code 1, which you can use to add or update a stream. Similarly, the Create Hard Link button creates additional links. All changes are reflected in the user interface only after a refresh. Note that if you delete a hard link (that is, delete a file), the total link count is not updated as long as the deleted file is still in the Recycle Bin.

Summary

In this article, I have briefly introduced NTFS 2000, focusing on key features such as streams and hard links. If you want to learn more about the new features of the Windows 2000 file system, I recommend "A File System for the 21st Century: Previewing the Windows NT 5.0 File System," written by Jeff Richter and Luis Cabrera for the November 1998 issue of MSJ (http://www.microsoft.com/msj/1198/ntfs/ntfs.htm). This article has left out some notable topics, especially sparse streams and reparse points. If they interest you, let us know and we will follow up.

Sample Code 1

Sample Code 2
