20 The I/O Library
The I/O library provides two modes for file operations. The simple mode assumes a current input file and a current output file, and its operations act on those files. The full mode uses explicit file handles; it adopts an object-oriented style in which every operation is a method of a file handle. The simple mode is convenient for simple tasks, and we have used it throughout the earlier parts of the book. It is not enough for more advanced file manipulation, however; reading from several files at the same time, for example, calls for the full mode. All functions of the I/O library live in the table io.
20.1 Simple I/O mode
All operations in the simple mode work on two current files. The I/O library initializes the current input file to the standard input (stdin) and the current output file to the standard output (stdout). So when we execute io.read(), we read a line from the standard input. We can change the current files with the io.input and io.output functions. For example, io.input(filename) opens the given file (in read mode) and sets it as the current input file. From then on, all input comes from that file until io.input is called again. The io.output function does the same job for output. If something goes wrong, both functions raise an error. If you want to handle errors directly, you must use io.open from the full mode. Because writing is simpler than reading, we begin with the write operation. The io.write function takes any number of string arguments and writes them to the current output file. Numbers are converted to strings following the usual rules; if you want to control this conversion, use the format function from the string library:
    > io.write("sin (3) = ", math.sin(3), "\n")
    --> sin (3) = 0.1411200080598672
    > io.write(string.format("sin (3) = %.4f\n", math.sin(3)))
    --> sin (3) = 0.1411
You should avoid code like io.write(a..b..c); the call io.write(a, b, c) has the same effect and uses fewer resources, because it avoids the concatenations. As a rule, use print for quick-and-dirty programs or for debugging, and use write when you need full control over the output.
    > print("hello", "Lua"); print("Hi")
    --> hello   Lua
    --> Hi
    > io.write("hello", "Lua"); io.write("Hi", "\n")
    --> helloLuaHi
Unlike print, write does not add any extra characters to the output, such as tabs or newlines. In addition, write uses the current output file, whereas print always uses the standard output. Finally, print automatically applies tostring to its arguments, so it can also show tables, functions, and nil.
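To make these three differences concrete, here is a minimal sketch of a print-like function built on io.write. The name fprint is hypothetical, and the sketch assumes Lua 5.1 or later because it uses select; it is an illustration, not how the built-in print is implemented.

```lua
-- fprint is a hypothetical helper, not part of the standard library.
-- It behaves like print, but writes to the current output file
-- (whatever io.output was last set to) instead of always using stdout.
local function fprint(...)
  local n = select("#", ...)              -- number of arguments, counting nils
  for i = 1, n do
    if i > 1 then io.write("\t") end      -- print separates arguments with tabs
    -- io.write accepts only strings and numbers, so convert explicitly
    io.write(tostring((select(i, ...))))
  end
  io.write("\n")                          -- print ends the line; write does not
end
```

A call such as fprint("hello", 42, nil) writes the three values separated by tabs and followed by a newline to whatever file io.output currently designates.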
The read function reads strings from the current input file. Its arguments control what is read:
"*all" | reads the whole file
"*line" | reads the next line
"*number" | reads a number
num | reads a string with up to num characters
The io.read("*all") function reads the whole current input file, starting at its current position. If the current position is at the end of the file, or if the file is empty, the call returns an empty string. Because Lua handles long strings efficiently, a simple way to write a filter in Lua is to read the whole file into a string, process the string (for example, with gsub), and then write the string to the output:
    t = io.read("*all")         -- read the whole file
    t = string.gsub(t, ...)     -- do the job
    io.write(t)                 -- write the file
The following code is a complete example of this kind of string processing. It encodes the contents of a file using the quoted-printable encoding of MIME (Multipurpose Internet Mail Extensions). In this encoding, non-ASCII characters are written as =XX, where XX is the character's code in hexadecimal. To keep the encoding consistent, the "=" character must be encoded as well. The pattern used in gsub captures every character with a code from 128 to 255, plus the equal sign.
    t = io.read("*all")
    t = string.gsub(t, "([\128-\255=])", function (c)
          return string.format("=%02X", string.byte(c))
        end)
    io.write(t)
On a Pentium 333MHz, this program takes 0.2 seconds to convert a file with 200K characters.
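The inverse transformation can follow the same gsub technique. The sketch below wraps the encoder in a function and adds a decoder; the names qp_encode and qp_decode are hypothetical, and the real quoted-printable encoding also has line-length and soft-line-break rules that this sketch ignores.

```lua
-- qp_encode and qp_decode are hypothetical names used for illustration.

-- Encode bytes 128..255 and "=" as "=XX" (XX = hexadecimal byte value).
local function qp_encode(s)
  -- the extra parentheses discard gsub's second result (the match count)
  return (string.gsub(s, "([\128-\255=])", function (c)
    return string.format("=%02X", string.byte(c))
  end))
end

-- Decode every "=XX" sequence back into the byte it stands for.
local function qp_decode(s)
  return (string.gsub(s, "=(%x%x)", function (hex)
    return string.char(tonumber(hex, 16))
  end))
end
```

Because every "=" in the input is itself encoded, decoding the encoded text gives back the original string.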
The io.read("*line") function returns the next line of the current input file, without the newline character. When the end of the file is reached, it returns nil (because there is no next line to return). This pattern is the default for read, so the call can be abbreviated to io.read(). We usually use this pattern only when the algorithm naturally handles the file line by line; otherwise, we prefer to read the whole file at once with *all, or to read it in blocks, as we will see later. As a simple example of this pattern, the following program copies the current input file to the current output file, numbering each line:
    local count = 1
    while true do
      local line = io.read()
      if line == nil then break end
      io.write(string.format("%6d  ", count), line, "\n")
      count = count + 1
    end
However, to iterate over a whole file line by line, it is better to use the io.lines iterator. For example, a complete program to sort the lines of a file looks like this:
    local lines = {}
    -- read the lines into table 'lines'
    for line in io.lines() do
      table.insert(lines, line)
    end
    -- sort
    table.sort(lines)
    -- write all the lines
    for i, l in ipairs(lines) do io.write(l, "\n") end
On a Pentium 333MHz, this program sorts a 4.5 MB file (32K lines) in 1.8 seconds, compared with 0.6 seconds for the system sort program, which is written in C and highly optimized.
The io.read("*number") function reads a number from the current input file. This is the only case in which read returns a number instead of a string. When you need to read many numbers from a file, avoiding the intermediate strings can improve performance significantly. The *number option skips any spaces before the number and accepts formats such as -3, +5.2, 1000, and -3.4e-23. If it cannot find a number at the current position (because of a bad format or the end of the file), it returns nil. You can call read with several options; for each argument, the function returns the corresponding result. Suppose you have a file with three numbers per line:
    6.0       -3.23     15e12
    4.3       234       1000001
    ...
To print the maximum of each line, you can read all three numbers of a line with a single call to read:
    while true do
      local n1, n2, n3 = io.read("*number", "*number", "*number")
      if not n1 then break end
      print(math.max(n1, n2, n3))
    end
In any case, you should always consider the alternative of reading the whole file with the "*all" option of io.read and then breaking it up with the gfind function:
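    local pat = "(%S+)%s+(%S+)%s+(%S+)%s+"
    -- string.gfind is the Lua 5.0 name; Lua 5.1 and later call it string.gmatch
    for n1, n2, n3 in string.gfind(io.read("*all"), pat) do
      -- the captures are strings, so convert them before comparing
      print(math.max(tonumber(n1), tonumber(n2), tonumber(n3)))
    end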
Besides the basic read patterns, you can call read with a number n as its argument. In this case, read tries to read n characters from the input file. If it cannot read any character (because it is already at the end of the file), it returns nil; otherwise, it returns a string with at most n characters. As an example, the following program is an efficient way (in Lua, of course) to copy the current input file to the current output file:
    local size = 2^13      -- good buffer size (8K)
    while true do
      local block = io.read(size)
      if not block then break end
      io.write(block)
    end
As a special case, io.read(0) works as a test for the end of the file: it returns an empty string if there is more to read, and nil otherwise.
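As a small illustration of block reads combined with the end-of-file test, the sketch below collects the current input file into blocks of at most a given size. The function name read_blocks is hypothetical.

```lua
-- read_blocks is a hypothetical helper: it reads the current input file
-- in blocks of at most `size` characters and returns them in a table.
local function read_blocks(size)
  local blocks = {}
  -- io.read(0) returns "" (a true value) while there is more to read,
  -- and nil once the end of the file has been reached
  while io.read(0) do
    blocks[#blocks + 1] = io.read(size)
  end
  return blocks
end
```

Only the last block can be shorter than the requested size.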
20.2 Full I/O mode
The full mode gives more complete control over input and output. Its central concept is the file handle, which is similar to a file stream (FILE*) in C: it represents an open file together with its current position. The function that opens a file is io.open. It mimics the fopen function in C: it takes the name of the file to open and a mode string. The mode string can be "r" (read mode), "w" (write mode, which also erases any previous contents of the file), or "a" (append mode). An optional "b" can be added to open the file in binary mode. Normally, open returns a new handle for the file. If an error occurs, it returns nil, plus an error message and an error code.
    print(io.open("non-existent file", "r"))
    --> nil     No such file or directory       2
    print(io.open("/etc/passwd", "w"))
    --> nil     Permission denied       13
The meaning of the error codes is system dependent. A typical idiom for checking errors is:
    local f = assert(io.open(filename, mode))
If open fails, the error message becomes the second argument to assert, which then shows the message. Once a file is open, you can read from it and write to it with the read and write methods. They are similar to the read and write functions of the io table, but you call them as methods of the file handle, using the colon syntax. For example, to open a file and read all of it, you can use the following code:
    local f = assert(io.open(filename, "r"))
    local t = f:read("*all")
    f:close()
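When a failure should not stop the program, the values returned by io.open can be tested directly instead of being passed to assert. The following is a minimal sketch of that approach; the function name read_file is hypothetical.

```lua
-- read_file is a hypothetical helper: it returns the whole contents of
-- a file, or nil plus an error message if the file cannot be opened.
local function read_file(path)
  local f, msg = io.open(path, "r")
  if not f then
    return nil, msg            -- let the caller decide what to do
  end
  local data = f:read("*all")
  f:close()
  return data
end
```

A caller can then write something like `local data, err = read_file(name)` and fall back to a default when data is nil.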
Like the standard streams in C, the I/O library provides three predefined handles: io.stdin, io.stdout, and io.stderr. You can therefore send a message directly to the error stream with code like this:
    io.stderr:write(message)
We can also mix the full mode with the simple mode. Calling io.input() without arguments returns the current input file handle, and calling io.input(handle) sets that handle as the current input file. (The same applies to io.output.) For example, to change the current input file temporarily, you can use the following code:
    local temp = io.input()   -- save current file
    io.input("newinput")      -- open a new current file
    ...                       -- do something with new input
    io.input():close()        -- close current file
    io.input(temp)            -- restore previous current file
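The save, switch, and restore steps can be wrapped in a single function. The sketch below is one possible way to do it; the name with_input is hypothetical, and the use of pcall to restore the previous input file even when the callback fails is an addition of this sketch, not something the text above requires.

```lua
-- with_input is a hypothetical helper: it runs fn with `filename` as the
-- current input file and restores the previous input file afterwards.
local function with_input(filename, fn)
  local saved = io.input()        -- save current input file
  io.input(filename)              -- raises an error if the file cannot be opened
  local ok, result = pcall(fn)    -- run the callback, catching any error
  io.input():close()              -- close the temporary input file
  io.input(saved)                 -- restore the previous input file
  if not ok then error(result, 0) end   -- re-raise the callback's error
  return result
end
```

For example, `with_input("newinput", function () return io.read("*line") end)` returns the first line of that file and leaves the current input file unchanged.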
20.2.1 A small tip for I/O optimization
In Lua it is usually faster to read a file as a whole than to read it line by line. Sometimes, however, we face large files (tens or hundreds of megabytes) that cannot reasonably be read all at once. To process such files efficiently, we can read them in reasonably large chunks (for example, 8 KB each). To avoid breaking lines in the middle, we simply ask for a chunk plus a line:
    local lines, rest = f:read(BUFSIZE, "*line")
The variable rest receives the remainder of any line that was broken by the chunk boundary. We then concatenate the chunk and this remainder, so that every resulting chunk ends with a complete line. The following code is a typical use of this technique. The program counts the number of characters, words, and lines in a file.
    local BUFSIZE = 2^13        -- 8K
    local f = io.input(arg[1])  -- open input file
    local cc, lc, wc = 0, 0, 0  -- char, line, and word counts
    while true do
      local lines, rest = f:read(BUFSIZE, "*line")
      if not lines then break end
      if rest then lines = lines .. rest .. '\n' end
      cc = cc + string.len(lines)
      -- count words in the chunk
      local _, t = string.gsub(lines, "%S+", "")
      wc = wc + t
      -- count newlines in the chunk
      _, t = string.gsub(lines, "\n", "\n")
      lc = lc + t
    end
    print(lc, wc, cc)
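The chunk-plus-line read can also be packaged as an iterator, so that any line-oriented processing can consume a large file chunk by chunk. This is a minimal sketch; the name line_chunks is hypothetical, and it assumes the file ends with a newline (otherwise the last chunk gains one, exactly as in the program above).

```lua
-- line_chunks is a hypothetical helper: it returns an iterator that yields
-- successive chunks of roughly `bufsize` bytes from the open file handle f,
-- each extended so that it ends at a line boundary.
local function line_chunks(f, bufsize)
  return function ()
    local chunk, rest = f:read(bufsize, "*line")
    if not chunk then return nil end      -- end of file: stop the iteration
    if rest then
      -- "*line" strips the newline, so put it back after the remainder
      chunk = chunk .. rest .. "\n"
    end
    return chunk
  end
end
```

It can be used in a generic for loop, as in `for chunk in line_chunks(f, 2^13) do ... end`.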
20.2.2 Binary files
The simple mode functions io.input and io.output always open files in text mode. On Unix there is no difference between binary and text files, but on some systems, notably Windows, binary files must be opened with an explicit flag. To handle such files, you must use io.open with the letter "b" in the mode string. Binary data in Lua are handled much like text. A string can contain any byte value, and almost all library functions can handle arbitrary bytes. (You can even do pattern matching over binary data, as long as the pattern does not contain a zero byte. To match a zero byte, use the class %z instead.) Typically, you read binary data either with the *all pattern, which reads the whole file, or with a number n, which reads n bytes. As a simple example, the following program converts a text file from DOS format to Unix format; that is, it replaces each carriage return followed by a newline with a single newline. It does not use the standard files (stdin and stdout), because they are open in text mode. Instead, it takes the names of the input and output files from the program arguments.
    local inp = assert(io.open(arg[1], "rb"))
    local out = assert(io.open(arg[2], "wb"))
    local data = inp:read("*all")
    data = string.gsub(data, "\r\n", "\n")
    out:write(data)
    assert(out:close())
You can use the following command line to invoke the program.
    > lua prog.lua file.dos file.unix
As a second example, the following program prints all the strings found in a binary file. The program assumes that a string is a sequence of six or more valid characters terminated by a zero byte. (In this program, the valid characters are the alphanumeric, punctuation, and space characters, as defined by the variable validchars.) We use concatenation and string.rep to build a pattern that matches six or more valid characters; the %z at the end of the pattern matches the zero byte that ends the string.
    local f = assert(io.open(arg[1], "rb"))
    local data = f:read("*all")
    local validchars = "[%w%p%s]"
    local pattern = string.rep(validchars, 6) .. "+%z"
    -- string.gfind is the Lua 5.0 name; Lua 5.1 and later call it string.gmatch
    for w in string.gfind(data, pattern) do
      print(w)
    end
As a last example, the following program makes a dump of a binary file, similar to the display of a hex editor. The first argument of the program is the input file name, and the output goes to the standard output. The program reads the file in chunks of 10 bytes. For each chunk, it writes the hexadecimal representation of each byte and then writes the chunk as text, replacing control characters with dots.
    local f = assert(io.open(arg[1], "rb"))
    local block = 10
    while true do
      local bytes = f:read(block)
      if not bytes then break end
      for b in string.gfind(bytes, ".") do
        io.write(string.format("%02X ", string.byte(b)))
      end
      io.write(string.rep("   ", block - string.len(bytes) + 1))
      io.write(string.gsub(bytes, "%c", "."), "\n")
    end
If you save the program in a file named vip, you can run it on itself with the following command:
    prompt> lua vip vip
On a Unix system, it produces output like this:
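    6C 6F 63 61 6C 20 66 20 3D 20    local f = 
    61 73 73 65 72 74 28 69 6F 2E    assert(io.
    6F 70 65 6E 28 61 72 67 5B 31    open(arg[1
    5D 2C 20 22 72 62 22 29 29 0A    ], "rb")).
    ...
    22 25 63 22 2C 20 22 2E 22 29    "%c", ".")
    2C 20 22 5C 6E 22 29 0A 65 6E    , "\n").en
    64 0A                            d.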
20.3 Other operations on files
The tmpfile function returns a handle for a temporary file, opened in read/write mode. The following function uses the seek method to get the size of a file without changing its current position:
    function fsize (file)
      local current = file:seek()      -- get current position
      local size = file:seek("end")    -- get file size
      file:seek("set", current)        -- restore position
      return size
    end
When an error occurs, these functions return nil plus an error message.
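The same save, seek, and restore idea works for other tasks. As a sketch, the function below returns the last n bytes of an open file. The name tail_bytes is hypothetical, and the sketch assumes a file opened in binary mode, so that positions correspond to byte counts.

```lua
-- tail_bytes is a hypothetical helper: it returns the last n bytes of an
-- open file (or the whole file if it is shorter) and then restores the
-- position the handle had before the call.
local function tail_bytes(file, n)
  local current = file:seek()               -- remember current position
  local size = file:seek("end")             -- move to the end; returns the size
  file:seek("set", math.max(size - n, 0))   -- step back n bytes, not past the start
  local data = file:read(n)
  file:seek("set", current)                 -- restore position
  return data or ""                         -- read returns nil when nothing is left
end
```

Clamping the offset with math.max avoids asking seek for a position before the start of the file.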