C#: the fastest way to copy large files

As everyone knows, the copy function built into Microsoft's operating systems is crude: it is slow, it cannot resume interrupted transfers, it drags other applications down while it runs, and it hogs a large amount of the file cache. That is why so many "advanced copy" tools exist, and FastCopy is the best of them. FastCopy's copy speed essentially reaches the limit of the disk, and because it is open source you can see how it does it. Unfortunately, the project is still a VC6 project, the source comments are in Japanese, and the coding style is a mess. It confirms my saying: the highest state of open-source software is that the code is open but you cannot understand it, and by the time you do understand it, it is already obsolete.

To reach the fastest copy speed while keeping memory usage down, you need to understand what actually happens during a copy. Copying is nothing more than reading data from the source file and writing it to the destination file. The copy built into XP first opens the file handles, then repeatedly reads a block of data into a buffer and writes it out to disk. Open Windows Task Manager, switch to the Processes tab, choose View > Select Columns, and enable "I/O Read Bytes" and "I/O Write Bytes". While a file is being copied you can watch the whole read/write process in the explorer.exe process. You will see that XP's reads and writes proceed almost in lockstep; in other words, the buffer it uses is fairly small, and that is not necessarily efficient. On my 200 GB Seagate 7200.8 drive the copy speed is around 15 MB/s, while the drive's average read speed is about 40 MB/s and its average write speed is over 35 MB/s.
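
If you would rather sample those counters programmatically than watch Task Manager, the Win32 GetProcessIoCounters function exposes the same numbers. The following is only a minimal sketch: the process name ("explorer") and the one-second polling interval are illustrative choices, not anything prescribed by the original article.

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Threading;

class IoWatch
{
    // IO_COUNTERS structure as defined for GetProcessIoCounters.
    [StructLayout(LayoutKind.Sequential)]
    struct IO_COUNTERS
    {
        public ulong ReadOperationCount;
        public ulong WriteOperationCount;
        public ulong OtherOperationCount;
        public ulong ReadTransferCount;   // total bytes read by the process
        public ulong WriteTransferCount;  // total bytes written by the process
        public ulong OtherTransferCount;
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool GetProcessIoCounters(IntPtr hProcess, out IO_COUNTERS counters);

    static void Main()
    {
        // Watch explorer.exe, the process that performs shell copies (assumes it is running).
        Process p = Process.GetProcessesByName("explorer")[0];
        while (true)
        {
            IO_COUNTERS c;
            if (GetProcessIoCounters(p.Handle, out c))
                Console.WriteLine("Read bytes: {0}, Write bytes: {1}",
                    c.ReadTransferCount, c.WriteTransferCount);
            Thread.Sleep(1000);
        }
    }
}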

Vista made some optimizations to file copying. A few bugs make copying many small files very slow, but its approach to large files is different from XP's. Open Task Manager again and watch the same columns: Vista reads a large block of data at a time and only then writes it out to disk, and the memory usage of the explorer.exe process jumps by a correspondingly large amount the moment the copy starts, falling back to normal once the copy completes. Vista's own status display reports a copy speed of around 18 MB/s, which still does not reach the drive's limit.
Comparing how Vista and XP copy files leads to one conclusion: Vista does try to optimize disk copying, but neither XP's small-buffer copy nor Vista's large-cache copy reaches the disk's maximum speed.

While either operating system is copying, you can observe an interesting phenomenon. On XP, the "System Cache" value under "Physical Memory" on the Performance tab of Task Manager keeps climbing during the copy and stops growing only after it reaches a certain value. The system cache is mainly used to cache memory from running programs and the files they open, read, and write, in order to speed those operations up. By default, files opened with the Win32 CreateFile function are read and written through the system cache, so anything opened with CreateFile is cached first, and Explorer is no exception. That is why, when you have many background programs open and then copy a huge file, switching back to those programs becomes painfully slow while the disk keeps grinding: the file cache has claimed so much memory that the background programs' memory gets pushed out. Vista handles this situation better. This design does speed up many file operations, but for a one-shot operation like copying a file, filling the system cache is simply a waste of resources.
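
You can see the system cache at work with a trivial timing test: read the same file twice through an ordinary cached FileStream, and the second pass, served largely from the cache, finishes much faster. This is only a minimal sketch under the assumption that the file (the path below is a placeholder) is large enough to be meaningful yet small enough to fit in free RAM.

using System;
using System.Diagnostics;
using System.IO;

class CacheDemo
{
    static void Main()
    {
        string path = @"d:\source";              // placeholder: any reasonably large file
        byte[] buffer = new byte[1024 * 1024];   // 1 MB read chunk

        for (int pass = 1; pass <= 2; pass++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            long total = 0;
            using (FileStream fs = File.OpenRead(path))   // default FileStream = cached I/O
            {
                int n;
                while ((n = fs.Read(buffer, 0, buffer.Length)) > 0)
                    total += n;
            }
            sw.Stop();
            Console.WriteLine("Pass {0}: {1} MB in {2} ms",
                pass, total / (1024 * 1024), sw.ElapsedMilliseconds);
        }
    }
}

On the second pass the data is already sitting in the system cache, which is exactly the behavior that helps repeated reads but wastes memory during a one-time copy.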

I also noticed that when FastCopy copies a large file, something different happens: the "System Cache" value plummets, while the disk read/write speed essentially reaches its limit. On XP this even improves the responsiveness of background programs, because FastCopy does not go through the operating system's cached read/write path; instead it allocates its own 32 MB buffer (the size is configurable). The behavior on Vista is a bit odd: the "System Cache" value shrinks as well, but after the copy finishes the disk keeps reading until the cache grows back to its previous size. XP does not do this.

So how do we reach the maximum speed? Is a buffer needed at all, and if so, how large should it be? To find out, I ran a small experiment.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using Microsoft.Win32.SafeHandles;
using System.Runtime.InteropServices;

namespace csharp
{
    class Program
    {
        public const short FILE_ATTRIBUTE_NORMAL = 0x80;
        public const short INVALID_HANDLE_VALUE = -1;
        public const uint GENERIC_READ = 0x80000000;
        public const uint GENERIC_WRITE = 0x40000000;
        public const uint CREATE_NEW = 1;
        public const uint CREATE_ALWAYS = 2;
        public const uint OPEN_EXISTING = 3;
        public const uint FILE_FLAG_NO_BUFFERING = 0x20000000;
        public const uint FILE_FLAG_WRITE_THROUGH = 0x80000000;
        public const uint FILE_SHARE_READ = 0x00000001;
        public const uint FILE_SHARE_WRITE = 0x00000002;

        // Use interop to call the CreateFile function.
        // For more information about CreateFile,
        // see the unmanaged MSDN reference library.
        [DllImport("kernel32.dll", SetLastError = true)]
        static extern SafeFileHandle CreateFile(string lpFileName, uint dwDesiredAccess,
            uint dwShareMode, IntPtr lpSecurityAttributes, uint dwCreationDisposition,
            uint dwFlagsAndAttributes, IntPtr hTemplateFile);

        static void Main(string[] args)
        {
            // useBuffer = true  -> ordinary reads/writes through the system cache
            // useBuffer = false -> FILE_FLAG_NO_BUFFERING (+ write-through), bypassing the cache
            bool useBuffer = false;

            SafeFileHandle fr = CreateFile("d:\\source", GENERIC_READ, FILE_SHARE_READ,
                IntPtr.Zero, OPEN_EXISTING,
                useBuffer ? 0 : FILE_FLAG_NO_BUFFERING, IntPtr.Zero);
            SafeFileHandle fw = CreateFile("d:\\dest", GENERIC_WRITE, FILE_SHARE_READ,
                IntPtr.Zero, CREATE_ALWAYS,
                useBuffer ? 0 : (FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH), IntPtr.Zero);

            int bufferSize = 1024 * 1024 * 32;   // user-mode buffer; vary this for the tests below

            FileStream fsr = new FileStream(fr, FileAccess.Read);
            FileStream fsw = new FileStream(fw, FileAccess.Write);

            BinaryReader br = new BinaryReader(fsr);
            BinaryWriter bw = new BinaryWriter(fsw);

            byte[] buffer = new byte[bufferSize];
            Int64 len = fsr.Length;
            DateTime start = DateTime.Now;
            TimeSpan ts;
            while (fsr.Position < len)
            {
                int readCount = br.Read(buffer, 0, bufferSize);
                bw.Write(buffer, 0, readCount);
                ts = DateTime.Now.Subtract(start);
                double speed = (double)fsr.Position / ts.TotalMilliseconds * 1000 / (1024 * 1024);
                double progress = (double)fsr.Position / len * 100;
                Console.WriteLine("Speed: {0}, Progress: {1}", speed, progress);
            }
            br.Close();
            bw.Close();
            Console.WriteLine("End");
            Console.ReadLine();
        }
    }
}

The idea of the program is simple: open both files, read a block into a user-defined buffer, write it out, repeat, then close the files. .NET's FileStream uses cached reads and writes by default and offers no parameter for unbuffered I/O, but fortunately one FileStream constructor accepts an existing file handle, so we can create the handle ourselves. The MSDN documentation for SafeFileHandle shows how to call CreateFile from C#; all we have to add is the FILE_FLAG_NO_BUFFERING flag, and reads and writes will then bypass the system cache. Non-cached I/O does come with restrictions (for example, transfer sizes and file offsets must be multiples of the volume's sector size); see the relevant MSDN documentation for details.
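
If you want to respect those alignment rules explicitly, you can ask the volume for its sector size and round your buffer size up to a multiple of it. Below is a minimal sketch using the Win32 GetDiskFreeSpace function; the drive letter and the 32 MB request are placeholder values.

using System;
using System.Runtime.InteropServices;

class SectorInfo
{
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool GetDiskFreeSpace(string lpRootPathName,
        out uint lpSectorsPerCluster, out uint lpBytesPerSector,
        out uint lpNumberOfFreeClusters, out uint lpTotalNumberOfClusters);

    static void Main()
    {
        uint sectorsPerCluster, bytesPerSector, freeClusters, totalClusters;
        if (!GetDiskFreeSpace(@"d:\", out sectorsPerCluster, out bytesPerSector,
                out freeClusters, out totalClusters))
            throw new System.ComponentModel.Win32Exception(Marshal.GetLastWin32Error());

        // Round the requested buffer size up to the next multiple of the sector size,
        // so that it is a legal transfer size for FILE_FLAG_NO_BUFFERING I/O.
        uint requested = 1024 * 1024 * 32;
        uint aligned = (requested + bytesPerSector - 1) / bytesPerSector * bytesPerSector;
        Console.WriteLine("Sector size: {0} bytes, aligned buffer: {1} bytes",
            bytesPerSector, aligned);
    }
}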

Running the program to copy a large file on the D: partition of a Seagate 80 GB hard disk (8 MB on-board cache; its copy speed tops out at about 26 MB/s) gave the following results:

Buffer size    Copy speed without system cache (MB/s)    Copy speed with system cache (MB/s)
1 MB           11.99                                      n/a
2 MB           15.19                                      n/a
4 MB           20.42                                      n/a
8 MB           23.87                                      n/a
16 MB          25.09                                      n/a
32 MB          25.93                                      11.31
64 MB          n/a                                        15.89
128 MB         n/a                                        17.02

Setting bufferSize to 64 MB caused an exception in the non-cached run, so that cell has no data, and cached copies with buffers smaller than 32 MB were not tested any further.

The conclusion is clear: for the same buffer size, copying without the system cache is significantly faster than copying with it. When the system cache is bypassed, a user buffer equal to the disk's physical cache size (8 MB here) already reaches about 90% of the disk's top speed, and a buffer twice the physical cache size essentially reaches the disk's limit. This is exactly why FastCopy does not use the system cache.
