Classic usage of Perl

Source: Internet
Author: User


Open a file using the open() function

A common way to open a file is:

The code is as follows:
    open(FH, "<$filename")
        or die "Couldn't open $filename for reading: $!";

The open() function usually takes two parameters. The first is the filehandle, which refers to the opened file. The second is a combination of the open mode and the file name. If the file is opened successfully, open() returns true; otherwise it returns false. We use "or" to test this condition.
The mode in the code above is the less-than character (<), which means reading. If the file does not exist, open() returns false. In this mode you can read from the filehandle but cannot write to it.
The greater-than character (>) means writing. If the file does not exist, it is created. If the file exists, it is truncated and its previous data is lost. You can write to the filehandle but cannot read from it.

The code is as follows:
    # If the file does not exist, create it.
    open(FH, ">$filename")
        or die "Couldn't open $filename for writing: $!";

Append mode (>>) also creates the file if it does not exist. If the file exists, this mode does not clear the original data.
As in ">" (write) mode, you can only write to the filehandle, and everything you write is added to the end of the file. Attempting to read from it is an error.


The code is as follows:
    open(FH, ">>$filename")
        or die "Couldn't open $filename for appending: $!";
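The three modes described so far can be tried together in a short round trip. This is only a sketch: it uses a scratch file from the core File::Temp module, and it uses three-argument open() with lexical filehandles, a more recent idiom than the two-argument form shown in this article.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration (not part of the original article).
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh);

# ">" truncates the file, or creates it if it is missing.
open(my $out, '>', $filename)
    or die "Couldn't open $filename for writing: $!";
print $out "first\n";
close($out) or die "Couldn't close $filename: $!";

# ">>" keeps the existing data and adds to the end.
open(my $app, '>>', $filename)
    or die "Couldn't open $filename for appending: $!";
print $app "second\n";
close($app) or die "Couldn't close $filename: $!";

# "<" reads the file back.
open(my $in, '<', $filename)
    or die "Couldn't open $filename for reading: $!";
my @lines = <$in>;
close($in);

print @lines;    # "first" and "second", each on its own line
```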

In "+<" mode, you can both read from and write to the file. You can move around inside the file with the seek() function and find your current position with the tell() function. The file must already exist: "+<" does not create it, and open() returns false if it is missing. The existing data is not cleared.
If you want to clear the original file content, either call the truncate() function or use the "+>" mode.


The code is as follows:
    open(FH, "+>$filename")
        or die "Couldn't open $filename for reading and writing: $!";

Note the difference between "+<" and "+>". Both allow reading and writing. The former is non-destructive, while the latter is destructive: it truncates the file when it is opened.
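A small experiment makes the difference visible. The sketch below assumes a scratch file created with File::Temp and uses lexical filehandles; the variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print $tmp_fh "abcdef\n";
close($tmp_fh) or die "Couldn't close $filename: $!";

# "+<" keeps the existing data; the first three bytes are overwritten in place.
open(my $rw, '+<', $filename)
    or die "Couldn't open $filename for reading and writing: $!";
print $rw "XYZ";
my $pos = tell($rw);                 # position just after the write
seek($rw, 0, 0) or die "Couldn't seek in $filename: $!";
my $after_update = <$rw>;            # "XYZdef\n"
close($rw);

# "+>" truncates first, so the old data is gone before it can be read.
open(my $clobber, '+>', $filename)
    or die "Couldn't open $filename for reading and writing: $!";
my $after_clobber = <$clobber>;      # undef: the file is now empty
close($clobber);

print "position after write: $pos\n";
print "after +<: $after_update";
print "after +>: ", defined $after_clobber ? $after_clobber : "(empty)", "\n";
```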

Errors
How do errors occur? They can arise in many places: the directory may not exist, the file may not be writable, or your program may run out of filehandles.
You should check the result of system calls such as open() and sysopen() to see whether the call succeeded.
"or die()" is the usual way to help users identify errors. Keep these points in mind. First, say which system call failed ("open"). Second, include the file name, so the error is easier to locate and fix. Third, say how the file was being opened ("for writing", "for appending"). Fourth, include the operating system's error message (contained in $!). That way, when a file cannot be opened, the user of your program will generally know why. Sometimes the first and third points are merged:
    or die "unable to append to $filename: $!";

If you write the file name out in full in both the open() call and the error message, you risk changing one without the other, which leaves the error message misleading or wrong.

The code is as follows:
    # The following error message is misleading.
    open(FH, "</var/run/file.pid")
        or die "Can't open /var/log/file.pod for writing: $!";
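One way to avoid this kind of mismatch is to keep the path in a single variable and use it in both places. The helper name open_for_reading and the deliberately missing path below are assumptions made for this sketch, not part of the original article.

```perl
use strict;
use warnings;

# Hypothetical helper: the same $path is passed to open() and
# interpolated into the message, so the two cannot drift apart.
sub open_for_reading {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "Couldn't open $path for reading: $!";
    return $fh;
}

# A path assumed not to exist, used only to trigger the error message.
my $missing = "/nonexistent-dir-for-demo/file.pid";
my $ok      = eval { open_for_reading($missing); 1 };
my $error   = $@;

print "open failed: $error" unless $ok;
```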

Use sysopen() for more control
To control more precisely how a file is opened, you can use the sysopen() function:

The code is as follows:
    use Fcntl;
    sysopen(FH, $filename, O_RDWR | O_CREAT, 0666)
        or die "Can't open $filename for reading/writing/creating: $!";

The sysopen() function takes four parameters. The first is a filehandle, as with open(). The second is the file name, without any mode information. The third is the mode, built by combining constants from the Fcntl module with the bitwise OR operator (|). The fourth, optional parameter is the octal permission value used if the file is created (typically 0666 for data files and 0777 for programs). sysopen() returns true if the file was opened and false if it was not.
Unlike open(), sysopen() has no shorthand for modes. Instead, you combine constants, each with a single meaning, using bitwise OR. This lets you request a combination of several behaviors.
    O_RDONLY   Read-only
    O_WRONLY   Write-only
    O_RDWR     Reading and writing
    O_APPEND   Writes go to the end of the file
    O_TRUNC    Truncate the file if it existed
    O_CREAT    Create the file if it didn't exist
    O_EXCL     Error if the file already existed (used with O_CREAT)

Use sysopen() when you need to be careful. For example, to append to a file only if it already exists, without creating it when it is missing, you can write:
    sysopen(LOG, "/var/log/myprog.log", O_WRONLY | O_APPEND, 0666)
        or die "Can't open /var/log/myprog.log for appending: $!";
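The flags can be combined to express intentions that open() cannot. The sketch below works in a temporary directory from File::Temp, and the file name myprog.log is only a placeholder. It shows an append-without-create that fails on a missing file, and an exclusive create that fails on an existing one.

```perl
use strict;
use warnings;
use Fcntl;                          # exports the O_* constants
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/myprog.log";       # placeholder log file name

# Append without creating: fails because the file does not exist yet.
my $opened_missing = sysopen(my $log, $path, O_WRONLY | O_APPEND);

# Create the file, but only if it does not already exist.
sysopen(my $new, $path, O_WRONLY | O_CREAT | O_EXCL, 0666)
    or die "Can't create $path: $!";
print $new "started\n";
close($new) or die "Can't close $path: $!";

# A second exclusive create fails because the file now exists.
my $opened_twice = sysopen(my $dup, $path, O_WRONLY | O_CREAT | O_EXCL, 0666);

print "append to missing file: ", ($opened_missing ? "opened" : "refused"), "\n";
print "second exclusive create: ", ($opened_twice ? "opened" : "refused"), "\n";
```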

Read a single record
There is an easy way to read from a filehandle: the <FH> operator. In scalar context, it returns the next record in the file, or undef at end of file or on error. We can use it to read a line into a variable:
    $line = <FH>;
    die "Unexpected end-of-file" unless defined $line;
In a loop, we can write:

The code is as follows:
    while (defined($record = <FH>)) {    # long-winded
        # $record is set to each record in the file, one at a time
    }

Because this is needed so often, it is usually shortened by putting the record in $_ instead of $record:

The code is as follows:
    while (<FH>) {
        # $_ is set to each record in the file, one at a time
    }
Starting with Perl 5.004_04, we can also write:
    while ($record = <FH>) {
        # $record is set to each record in the file, one at a time
    }

The defined() test is added automatically. In versions earlier than Perl 5.004_04, this construct produces a warning. To find out which version of Perl you have, run:
    perl -v
Once we have read a record, we usually want to remove the record separator (by default, a newline):
    chomp($record);
Perl 4.0 had only chop(), which removes the last character of the string, whatever that character is. chomp() is less destructive: it removes the line separator only if one is present. To remove the line separator, use chomp() rather than chop().
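The difference matters most for the last line of a file that does not end with a newline. A minimal sketch, with made-up sample strings:

```perl
use strict;
use warnings;

my $with_newline    = "last line\n";
my $without_newline = "last line";

# chomp() removes the separator only when it is present.
chomp(my $chomped_1 = $with_newline);      # "last line"
chomp(my $chomped_2 = $without_newline);   # "last line" (unchanged)

# chop() always removes one character, whatever it is.
chop(my $chopped_1 = $with_newline);       # "last line"
chop(my $chopped_2 = $without_newline);    # "last lin" (a real character is lost)

print "chomp: [$chomped_1] [$chomped_2]\n";
print "chop:  [$chopped_1] [$chopped_2]\n";
```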
Read multiple records
If you call <FH> in list context, it returns all the remaining records in the file. At end of file, it returns an empty list:

The code is as follows:
    @records = <FH>;
    if (@records) {
        print "There were ", scalar(@records), " records read.\n";
    }

The following does the assignment and the test in one step:

The code is as follows:
    if (@records = <FH>) {
        print "There were ", scalar(@records), " records read.\n";
    }

chomp() can also be applied to arrays:
    @records = <FH>;
    chomp(@records);
You can chomp anything that can be assigned to, including an assignment itself, so this can be written in one step:
    chomp(@records = <FH>);

What is a record?
By default, a record is a line.
The definition of a record is controlled by the $/ variable, which holds the input record separator. Because the newline character is, by definition, what separates lines, its default value is the string "\n".
You can replace "\n" with any string you like. For example:
    $/ = ";";
    $record = <FH>;    # read the next semicolon-terminated record
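To see a custom separator at work without a file on disk, the sketch below reads from an in-memory filehandle (opening a reference to a string, available since Perl 5.8). The sample data is invented for the illustration.

```perl
use strict;
use warnings;

my $data = "alpha;beta;gamma";
open(my $fh, '<', \$data)
    or die "Couldn't open in-memory file for reading: $!";

my @records;
{
    local $/ = ";";                 # records now end at a semicolon
    while (my $record = <$fh>) {
        chomp($record);             # chomp removes the current $/, here ";"
        push @records, $record;
    }
}
close($fh);

print join(", ", @records), "\n";   # alpha, beta, gamma
```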
$/ can take two other interesting values: the empty string ("") and undef.
Read paragraphs
Setting $/ = "" tells Perl to read a paragraph at a time. A paragraph is a block of text terminated by two or more newlines. This is different from setting $/ to "\n\n", which treats exactly two newlines as the terminator. The difference shows when there are consecutive blank lines, as in "text\n\n\n\n". You can interpret this either as one paragraph ("text") or as two paragraphs ("text" followed by two newlines, then an empty paragraph followed by two newlines).
The second interpretation is not very useful when reading text. If the text you are reading looks like this, you do not want to have to filter out the "empty" paragraphs yourself:

The code is as follows:
    $/ = "\n\n";
    while (<FH>) {
        chomp;
        next unless length;    # skip an empty paragraph
        # ...
    }

Setting $/ to "" instead reads paragraphs terminated by two or more newlines, so no filtering is needed:
    $/ = "";
    while (<FH>) {
        chomp;
        # ...
    }
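The two settings can be compared on the same input. This sketch reads an invented string through an in-memory filehandle; the helper name read_records is an assumption made for the example.

```perl
use strict;
use warnings;

my $text = "one\n\n\n\ntwo\n";      # two paragraphs separated by extra blank lines

# Hypothetical helper: read every record from $text using the given separator.
sub read_records {
    my ($separator) = @_;
    open(my $fh, '<', \$text)
        or die "Couldn't open in-memory file for reading: $!";
    local $/ = $separator;
    my @records = <$fh>;
    chomp(@records);                # chomp honours the current $/
    close($fh);
    return @records;
}

my @paragraph_mode = read_records("");       # "one", "two"
my @two_newlines   = read_records("\n\n");   # "one", "", "two\n"

print scalar(@paragraph_mode), " records in paragraph mode\n";
print scalar(@two_newlines), " records with a two-newline separator\n";
```

With "\n\n" the extra blank lines show up as an empty record, which is the case the `next unless length` test guards against.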

Read the entire file
The other interesting value of $/ is undef. Setting it tells Perl that a read should return the rest of the file as a single string:

The code is as follows:
    undef $/;
    $file = <FH>;

Changing $/ affects every subsequent read, not just the next one. You should therefore normally restrict the change to a local scope. The following example reads the contents of the filehandle into a string:

The code is as follows:
    {
        local $/ = undef;
        $file = <FH>;
    }
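The same idea is often wrapped in a small subroutine, so that the change to $/ ends when the subroutine returns. The name slurp and the scratch file are assumptions of this sketch.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the whole file as one string.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "Couldn't open $path for reading: $!";
    local $/;                        # undef only until this sub returns
    my $content = <$fh>;
    close($fh);
    return $content;
}

# Scratch file with two lines, for demonstration only.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print $tmp_fh "line 1\nline 2\n";
close($tmp_fh) or die "Couldn't close $filename: $!";

my $file = slurp($filename);
print length($file), " characters read\n";
```

After the call, $/ is back to its default, so later line-by-line reads are not affected.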

Remember: Perl scalars can hold very long strings. A file cannot be read in whole if it is larger than your virtual memory, but below that limit you can read in as much data as you like.
Operations on files using regular expressions
Once a variable holds the entire file as a string, you can apply a regular expression to the whole file rather than to one block of it. Two regular expression options are useful here: /s and /m. Normally, regular expressions in Perl are used on single lines. Suppose you write:

The code is as follows:
    undef $/;
    $line = <FH>;
    if ($line =~ /(b.*grass)$/) {
        print "found $1\n";
    }

If our file contains:
    browngrass
    bluegrass
The output is:
    found bluegrass
It does not find "browngrass", because $ matches only at the end of the string (or before a newline at the end of the string). To make "^" and "$" match at the start and end of every line in a multi-line string, use the /m ("multiline") option:
    if ($line =~ /(b.*grass)$/m) { }
Now the program outputs:
    found browngrass
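The same comparison can be run on a plain string, with no file involved. A minimal sketch, using the two sample lines from the example:

```perl
use strict;
use warnings;

my $text = "browngrass\nbluegrass\n";

# Without /m, "$" matches only at the very end of the string.
my ($plain) = $text =~ /(b.*grass)$/;      # "bluegrass"

# With /m, "$" also matches before every newline.
my ($multi) = $text =~ /(b.*grass)$/m;     # "browngrass"

# With /m and /g in list context, every matching line is returned.
my @all = $text =~ /^(b.*grass)$/mg;       # both lines

print "without /m: $plain\n";
print "with /m:    $multi\n";
print "all lines:  @all\n";
```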
Similarly, a period matches every character except a newline:

The code is as follows:
    while (<FH>) {
        if (/19(.*)$/) {
            if ($1 < 20) {
                $year = 2000 + $1;
            } else {
                $year = 1900 + $1;
            }
        }
    }

If we read "1981\n" from the file, $_ contains "1981\n". The period in the regular expression matches "8" and "1", but not "\n". That is what we want here, because the newline is not part of the date.
For a string containing many lines, we may need to extract large blocks that span line separators. In that case, we can use the /s option, which makes the period match every character, including a newline:

The code is as follows:
    if (m{<b>(.*?)</b>}s) {
        print "Found bold text: $1\n";
    }

Here I used {} instead of slashes to delimit the regular expression, so slashes inside the pattern need no escaping. The leading "m" tells Perl that this is a match, and the trailing "s" is the option. You can combine the /s and /m options:

The code is as follows:
    if (m{^<FONT COLOR="red">(.*?)</FONT>}sm) {
        # ...
    }
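The effect of /s is easiest to see on a string whose marked-up text spans two lines. The sample markup below is invented for this sketch.

```perl
use strict;
use warnings;

my $html = "<b>bold\ntext</b>";

# Without /s, "." stops at the newline, so the match fails.
my ($without_s) = $html =~ m{<b>(.*?)</b>};

# With /s, "." also matches the newline.
my ($with_s) = $html =~ m{<b>(.*?)</b>}s;

print "without /s: ", (defined $without_s ? "matched" : "no match"), "\n";
print "with /s:    ", (defined $with_s ? "matched" : "no match"), "\n";
```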

Summary
There are two ways to open a file: open() is quick and simple, while sysopen() is more powerful and more complex. The <FH> operator reads records, and the $/ variable controls what a record is. If you read many lines into a single string, do not forget the /s and /m regular expression options.
