C# IO Operations (4): Copying large files (using file streams) and file encoding

Source: Internet
Author: User

Large file copy (file stream usage), file encoding

First, let's talk about copying large files with file streams. Because a computer's memory is limited, a program that copies files of several GB or more cannot load the whole file into memory at once, so it has to use a file stream, which processes the file one piece at a time. For comparison, the commonly used read methods of the File class load the entire file content into memory at once:

    string sPath = @"C:\Users\Chens-PC\Desktop\nginx.txt";
    // File.ReadAllBytes() reads the whole file into memory at once and returns it
    // as a byte array (the file is closed afterwards).
    byte[] bteData = File.ReadAllBytes(sPath);
    // File.ReadAllLines() reads the whole text into memory at once, line by line, and returns it
    // as a string array (the file is closed afterwards). An overload takes the encoding used for reading.
    string[] strData = File.ReadAllLines(sPath);
    // File.ReadAllText() reads the whole text into memory at once and returns it
    // as a single string (the file is closed afterwards). An overload takes the encoding used for reading.
    string sData = File.ReadAllText(sPath);

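Note: the overloads mentioned above take an encoding parameter. On .NET Framework, Encoding.Default means the operating system's current ANSI code page (on a Simplified Chinese system, generally the Simplified Chinese ANSI code page), while on .NET Core and later it is UTF-8. Text files newly created by .NET are encoded as UTF-8 by default.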
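To make the encoding overloads concrete, here is a minimal sketch that writes a text file with one encoding and reads it back with an explicitly chosen one. The class name EncodingReadDemo and the method WriteThenRead are hypothetical names used only for illustration; File.WriteAllText and File.ReadAllText with an Encoding argument are the real overloads.

```csharp
using System.IO;
using System.Text;

public static class EncodingReadDemo
{
    // Hypothetical helper: writes text with writeEncoding, then reads the file
    // back with readEncoding. Passing the same encoding for both is the safe case.
    public static string WriteThenRead(string path, string text,
                                       Encoding writeEncoding, Encoding readEncoding)
    {
        File.WriteAllText(path, text, writeEncoding);
        return File.ReadAllText(path, readEncoding);
    }
}
```

Calling WriteThenRead with Encoding.UTF8 for both parameters should return the original string unchanged, including any Chinese characters.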

Before copying large files, let's look at how to use a file stream (FileStream):

    #region Write a file through a file stream
    string sMsg = "wind blew a painting of childhood. You stayed with me there, and the bamboo forest was our bamboo, and the flowers you gave me looked up. see you smile so innocent";
    using (FileStream fsWrite = new FileStream("msg.txt", FileMode.Create, FileAccess.Write))
    {
        // Convert the string to bytes; this is where the encoding is chosen.
        byte[] bteData = System.Text.Encoding.UTF8.GetBytes(sMsg);
        fsWrite.Write(bteData, 0, bteData.Length);
        fsWrite.Flush();
    }
    Console.WriteLine("The file has been written!");
    #endregion

    #region Read a file through a file stream
    using (FileStream fsRead = new FileStream("msg.txt", FileMode.Open, FileAccess.Read))
    {
        byte[] bteData = new byte[fsRead.Length];
        fsRead.Read(bteData, 0, bteData.Length);
        // An encoding only has to be specified when the byte[] array is converted to a string.
        string s = System.Text.Encoding.UTF8.GetString(bteData);
        Console.WriteLine(s);
    }
    #endregion
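One detail worth noting about the read example: Stream.Read is allowed to return fewer bytes than requested, so a single call is not guaranteed to fill the buffer. The following is a small sketch of a loop that keeps reading until the buffer is full or the stream ends. StreamReadHelper and ReadFully are hypothetical names, not part of the .NET library.

```csharp
using System.IO;

public static class StreamReadHelper
{
    // Hypothetical helper: reads until the buffer is full or the stream ends.
    // Returns the number of bytes actually placed in the buffer.
    public static int ReadFully(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0)
            {
                break; // end of stream reached
            }
            total += read;
        }
        return total;
    }
}
```

For small local files a single Read call usually fills the buffer, but the loop avoids relying on that.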

1. Large file copy:

Idea: copying a large file involves both reading and writing, so we need to create two file streams (one for reading and one for writing). The code for this case is as follows (it also works in WinForms):

    #region Large file copy
    string sSource = @"F:\System\cn_windows_7_ultimate_with_sp1_x86_dvd_618763.iso";
    string sTarget = @"E:\C# exercise day\windows_7.iso";

    CopyBigFile(sSource, sTarget);
    Console.ReadKey();
    #endregion
    private static void CopyBigFile(string sSource, string sTarget)
    {
        using (FileStream fsRead = new FileStream(sSource, FileMode.Open, FileAccess.Read))
        {
            using (FileStream fsWrite = new FileStream(sTarget, FileMode.Create, FileAccess.Write))
            {
                // Define the buffer (12 MB); only this much is held in memory at a time.
                byte[] bteData = new byte[12 * 1024 * 1024];
                // r is the number of bytes actually read; 0 means end of file.
                int r = fsRead.Read(bteData, 0, bteData.Length);
                while (r > 0)
                {
                    // Write only the r bytes that were actually read.
                    fsWrite.Write(bteData, 0, r);
                    // Progress as a percentage of the source length.
                    double d = 100 * (fsWrite.Position / (double)fsRead.Length);
                    Console.WriteLine("{0} %", d);
                    r = fsRead.Read(bteData, 0, bteData.Length);
                }
                Console.WriteLine("Copying the large file succeeded");
            }
        }
    }
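When no progress output is needed, the read/write loop can be replaced by the built-in Stream.CopyTo method, which performs the same buffered copy internally. This is a swapped-in alternative rather than the article's own method; BuiltInCopy and CopyFile are hypothetical names, and the 1 MB default buffer size is an arbitrary assumption.

```csharp
using System.IO;

public static class BuiltInCopy
{
    // Hypothetical helper: copies source to target through a buffer of bufferSize bytes.
    // The whole file is never held in memory at once.
    public static void CopyFile(string source, string target, int bufferSize = 1024 * 1024)
    {
        using (FileStream fsRead = new FileStream(source, FileMode.Open, FileAccess.Read))
        using (FileStream fsWrite = new FileStream(target, FileMode.Create, FileAccess.Write))
        {
            fsRead.CopyTo(fsWrite, bufferSize);
        }
    }
}
```

The manual loop remains the better fit when the program has to report progress, as the CopyBigFile method does.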

Note:

How to quickly create a file stream (for efficiency, create the stream once outside the loop, not inside it):
    FileStream fsRead = File.OpenRead(sPath); // quickly creates a read stream
    FileStream fsWrite = File.OpenWrite(sPath); // quickly creates a write stream
    FileStream fs = File.Open(sPath, FileMode.OpenOrCreate); // creates a file stream with the given mode

The three methods above are just convenience wrappers provided by Microsoft around the FileStream constructor used in the large-file copy, to simplify the code.
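One behavior of these shortcuts is easy to miss: File.OpenWrite opens an existing file without truncating it, so writing less data than the file already holds leaves the old tail in place. The sketch below demonstrates this; OpenWriteDemo and OverwriteWithOpenWrite are hypothetical names used only for illustration.

```csharp
using System.IO;
using System.Text;

public static class OpenWriteDemo
{
    // Hypothetical helper: creates a file containing longText, then writes
    // shortText over it via File.OpenWrite and returns the resulting content.
    public static string OverwriteWithOpenWrite(string path, string longText, string shortText)
    {
        File.WriteAllBytes(path, Encoding.ASCII.GetBytes(longText));
        using (FileStream fs = File.OpenWrite(path))
        {
            byte[] data = Encoding.ASCII.GetBytes(shortText);
            fs.Write(data, 0, data.Length); // overwrites from position 0, no truncation
        }
        return Encoding.ASCII.GetString(File.ReadAllBytes(path));
    }
}
```

Overwriting "hello world" with "HI" this way leaves "HIllo world" in the file. When the target must be replaced entirely, new FileStream(path, FileMode.Create, FileAccess.Write) is the safer choice.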

Finally, let's discuss garbled characters when reading and writing files (file encoding):

Common text file encodings are as follows:

1. GB2312 encoding: compatible with the ASCII code table. English characters take 1 byte (a positive value when the byte is viewed as signed), and Chinese characters take 2 bytes (both negative when viewed as signed bytes).
2. GBK encoding: compatible with GB2312. Chinese characters take 2 bytes (the 1st byte is negative when viewed as signed; the 2nd byte may be positive or negative).
3. Unicode: international code table. In .NET, Encoding.Unicode is UTF-16, in which both English characters and common Chinese characters take 2 bytes.
4. UTF-8: international code table. English characters take 1 byte and common Chinese characters take 3 bytes.

Because each encoding stores data in a different format, garbled characters appear when a file is read with a different encoding from the one it was written with. The fix is to use the same encoding for reading and writing.
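The byte counts above can be checked directly, and a mismatch is easy to reproduce. This sketch only uses Encoding.UTF8 and Encoding.Unicode, which are always available; GB2312 and GBK are left out because their availability depends on the runtime. EncodingSizes and its methods are hypothetical names.

```csharp
using System.Text;

public static class EncodingSizes
{
    // Number of bytes the string occupies in UTF-8.
    public static int Utf8ByteCount(string s)
    {
        return Encoding.UTF8.GetByteCount(s);
    }

    // Number of bytes the string occupies in UTF-16 (Encoding.Unicode).
    public static int Utf16ByteCount(string s)
    {
        return Encoding.Unicode.GetByteCount(s);
    }

    // Encodes with UTF-8 but decodes with UTF-16, to show how a
    // read/write encoding mismatch produces garbled text.
    public static string DecodeWithWrongEncoding(string s)
    {
        return Encoding.Unicode.GetString(Encoding.UTF8.GetBytes(s));
    }
}
```

For example, "A" takes 1 byte in UTF-8 and 2 in UTF-16, while "中" takes 3 bytes in UTF-8 and 2 in UTF-16.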

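BOM header of file encoding:

A file's encoding can often be identified from the BOM (byte order mark) at the start of the file, which you can inspect by reading the raw bytes with File.ReadAllBytes(). When text is read, an extra question mark sometimes appears at the beginning; this is caused by the BOM. (This detection is not fully reliable: some files are saved with a BOM, while others are not.)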
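As a rough illustration of BOM inspection, the sketch below compares the first bytes of a buffer (for example, the result of File.ReadAllBytes) with the preambles that .NET reports for UTF-8 and UTF-16. BomDetector and DetectBom are hypothetical names. The check is deliberately simplistic: it ignores UTF-32, whose little-endian BOM begins with the same two bytes as UTF-16 LE, and it cannot say anything about files saved without a BOM.

```csharp
using System.Text;

public static class BomDetector
{
    // Hypothetical helper: returns a label for the BOM at the start of data, or "none".
    public static string DetectBom(byte[] data)
    {
        if (StartsWith(data, Encoding.UTF8.GetPreamble())) return "UTF-8";                // EF BB BF
        if (StartsWith(data, Encoding.Unicode.GetPreamble())) return "UTF-16 LE";         // FF FE
        if (StartsWith(data, Encoding.BigEndianUnicode.GetPreamble())) return "UTF-16 BE"; // FE FF
        return "none";
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (prefix.Length == 0 || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A result of "none" only means no BOM was found; the file may still be UTF-8 or any other encoding.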

2. StreamReader and StreamWriter:
Application scenario: when a large text file is read in chunks with FileStream and each chunk is decoded separately, a multi-byte character (a Chinese character takes 2 bytes in GBK and 3 bytes in UTF-8) may be split across two chunks, which produces garbled characters.
StreamReader reads text files line by line, using ReadLine() and the EndOfStream property. Note: specify the encoding!

ReadToEnd reads from the current position to the end of the stream. If the content is large, it occupies a lot of memory. Each read advances the current position, so a second call returns an empty string.

ReadLine reads one line. It returns null when the end of the stream has been reached.
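The split-character problem from the application scenario above can be reproduced in memory. In this sketch, DecodeChunksNaively decodes each chunk on its own, which breaks any character that straddles a chunk boundary, while DecodeWithStreamReader lets StreamReader handle the boundaries. Both method names and the class SplitCharacterDemo are hypothetical, and UTF-8 is assumed as the encoding.

```csharp
using System;
using System.IO;
using System.Text;

public static class SplitCharacterDemo
{
    // Decodes every chunk independently; a multi-byte character cut in half
    // at a chunk boundary is turned into replacement characters.
    public static string DecodeChunksNaively(byte[] data, int chunkSize)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            sb.Append(Encoding.UTF8.GetString(data, offset, count));
        }
        return sb.ToString();
    }

    // StreamReader keeps incomplete byte sequences until the rest arrives,
    // so characters are never split.
    public static string DecodeWithStreamReader(byte[] data)
    {
        using (var ms = new MemoryStream(data))
        using (var reader = new StreamReader(ms, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the 6 UTF-8 bytes of "中文", a chunk size of 4 cuts the second character in half and garbles the naive result, while a chunk size of 3 happens to align with the character boundaries.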


StreamWriter writes text files line by line.

WriteLine writes the string to the file, followed by a line terminator.
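Putting the two classes together, here is a minimal sketch that writes lines with StreamWriter and reads them back with StreamReader, specifying the same encoding on both sides. LineIo, WriteLines and ReadLines are hypothetical names, and UTF-8 is an assumed choice of encoding.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineIo
{
    // Writes each string as one line. The second argument (false) means
    // overwrite the file instead of appending to it.
    public static void WriteLines(string path, IEnumerable<string> lines)
    {
        using (var writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
        }
    }

    // Reads the file line by line until ReadLine returns null (end of stream).
    public static List<string> ReadLines(string path)
    {
        var result = new List<string>();
        using (var reader = new StreamReader(path, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                result.Add(line);
            }
        }
        return result;
    }
}
```

Only one line is held in memory at a time while reading, which is what makes this approach suitable for large text files.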

3. Compressed stream GZipStream:

1> Compression:

1. Create a read stream with File.OpenRead().
2. Create a write stream with File.OpenWrite().
3. Create a compression stream with new GZipStream(), passing the write stream as the parameter.
4. Read part of the data from the read stream each time and write it through the compression stream.
    // Requires: using System.IO; using System.IO.Compression;
    // GZipStream is a wrapper around another stream (here a FileStream).
    // Compress the text file 1.txt.
    // 1. Create the read stream.
    using (FileStream fsRead = File.OpenRead("1.txt"))
    {
        // 2. Create the write stream.
        using (FileStream fsWrite = File.OpenWrite("yasuo.txt"))
        {
            // 3. Create the compression stream.
            using (GZipStream zipStream = new GZipStream(fsWrite, CompressionMode.Compress))
            {
                // 4. Read 1024 bytes each time.
                byte[] byts = new byte[1024];
                int len = 0;
                while ((len = fsRead.Read(byts, 0, byts.Length)) > 0)
                {
                    // Write to the file through the compression stream.
                    zipStream.Write(byts, 0, len);
                }
            }
        }
    }
    Console.WriteLine("OK");
    Console.ReadKey();

2> Decompression:

1. Create a read stream with File.OpenRead().
2. Create a decompression stream with new GZipStream(), passing the read stream as the parameter.
3. Create a write stream with File.OpenWrite().
4. Read data through the decompression stream each time and write it to the write stream.
    using (FileStream fsRead = File.OpenRead("yasuo.txt"))
    {
        using (GZipStream gzipStream = new GZipStream(fsRead, CompressionMode.Decompress))
        {
            using (FileStream fsWrite = File.OpenWrite("jieya.txt"))
            {
                byte[] byts = new byte[1024];
                int len = 0;
                while ((len = gzipStream.Read(byts, 0, byts.Length)) > 0)
                {
                    fsWrite.Write(byts, 0, len);
                }
            }
        }
    }
    Console.WriteLine("ok");
    Console.ReadKey();
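The same compress and decompress steps can be tried without touching the disk by wrapping a MemoryStream instead of a FileStream. This is a sketch under that assumption; GzipRoundTrip, Compress and Decompress are hypothetical names.

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipRoundTrip
{
    // Compresses data by writing it through a GZipStream into a MemoryStream.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream flushes the remaining compressed data
            return output.ToArray();
        }
    }

    // Decompresses by reading through a GZipStream in 1024-byte pieces.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            byte[] buffer = new byte[1024];
            int len;
            while ((len = gzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, len);
            }
            return output.ToArray();
        }
    }
}
```

Note that the compressed bytes are only complete after the GZipStream has been disposed, which is why ToArray is called outside the inner using block.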

 

 
