Opening a file with the open() function
The common ways to open files are:
open(FH, "< $filename")
    or die "Couldn't open $filename for reading: $!";
The open() function usually takes two arguments. The first is a filehandle, which refers to the opened file; the second combines the open mode and the filename. open() returns true if the file was opened successfully and false otherwise. We use "or" to test that result.
The mode in the code above is indicated by the less-than sign (<). If the file does not exist, open() returns false. In this mode you can read from the filehandle but not write to it.
The greater-than sign (>) indicates write mode. If the file does not exist, it is created. If the file exists, it is truncated and its previous contents are lost. You can write to the filehandle, but you cannot read from it.
# Create the file if it doesn't exist
open(FH, "> $filename")
    or die "Couldn't open $filename for writing: $!";
Append mode (two greater-than signs) also creates the file if it does not exist, but if the file exists, this mode does not erase the original data.
As with ">" ("write") mode, you can only write to the filehandle, and everything you write is added to the end of the file. An attempt to read will fail at run time.
open(FH, ">> $filename")
    or die "Couldn't open $filename for appending: $!";
With "+<" mode, you can both read and write the file. You can move around within the file with the seek() function and find out where you are with the tell() function. If the file does not exist, open() fails rather than creating it. If the file already exists, the original data is not erased.
If you intend to clear the original file contents, either call the truncate() function yourself or use the "+>" mode.
open(FH, "+> $filename")
    or die "Couldn't open $filename for reading and writing: $!";
Notice the difference between "+<" and "+>": both allow reading and writing, but the former is non-destructive, while the latter truncates the file first.
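The modes described above can be compared side by side. The following is a minimal sketch, not a definitive implementation: it assumes a scratch file created with the core File::Temp module, and it uses lexical filehandles with the three-argument form of open(), neither of which appears in the text above. The file name modes.txt is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch file; the directory is removed when the program exits.
my $dir      = tempdir(CLEANUP => 1);
my $filename = "$dir/modes.txt";

# ">" creates the file (or truncates an existing one).
open(my $out, '>', $filename)
    or die "Couldn't open $filename for writing: $!";
print $out "first\n";
close($out) or die "Couldn't close $filename: $!";

# ">>" keeps the existing data and adds to the end.
open(my $app, '>>', $filename)
    or die "Couldn't open $filename for appending: $!";
print $app "second\n";
close($app) or die "Couldn't close $filename: $!";

# "+<" opens for reading and writing without truncating.
open(my $rw, '+<', $filename)
    or die "Couldn't open $filename for reading and writing: $!";
my @kept = <$rw>;                 # both lines are still there
seek($rw, 0, 0) or die "Couldn't seek in $filename: $!";
my $pos   = tell($rw);            # position right after the seek
my $first = <$rw>;                # reads the first line again
close($rw) or die "Couldn't close $filename: $!";

# "+>" also allows reading and writing, but truncates first.
open(my $clobber, '+>', $filename)
    or die "Couldn't open $filename for reading and writing: $!";
my @after = <$clobber>;           # nothing left to read
close($clobber) or die "Couldn't close $filename: $!";

print scalar(@kept), " lines before '+>', ", scalar(@after), " lines after\n";
```

Reading the file back after each step makes the destructive and non-destructive modes easy to tell apart.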
Errors
How do errors come about? Errors can occur in many places: the directory does not exist, the file is not writable, your program runs out of filehandles, and so on.
You should check the results of system calls (such as open() and sysopen()) to see whether the call succeeded.
To help users diagnose errors, always use the "or die()" idiom, and remember these rules. First, say which system call failed ("open"). Second, include the filename, so the problem is easy to locate when correcting the error. Third, say how the file was being opened ("for writing", "for appending"). Fourth, include the operating system's error message (held in $!). That way, when a file cannot be opened, the person using your program will generally know why. Sometimes we combine the first and the third:
    or die "unable to append to $filename: $!";
If you write out the filename literally in both the open() call and the error message, you risk changing one without the other, leaving the error message outdated or wrong.
# The error message below is misleading
open(FH, "</var/run/file.pid")
    or die "Can't open /var/log/file.pod for writing: $!";
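One way to avoid a stale message is to keep the filename in a single variable and use it in both places. The sketch below is only an illustration: the path is hypothetical and is chosen so that open() fails, and eval is used here just to trap the die so that the message can be inspected.

```perl
use strict;
use warnings;

# Hypothetical path that should not exist, so open() fails on purpose.
my $filename = "/nonexistent-dir-for-example/file.pid";

# eval traps the die so the message can be examined instead of ending the program.
my $ok = eval {
    open(my $fh, '<', $filename)
        or die "Couldn't open $filename for reading: $!\n";
    1;
};
my $error = $ok ? '' : $@;

# The message names the call, the file, the mode, and the system error.
print "open failed as expected: $error" unless $ok;
```

Because the filename appears only once in the source, the open() call and its error message cannot drift apart.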
Using sysopen() for more control
For finer control over how a file is opened, you can use the sysopen() function:
use Fcntl;
sysopen(FH, $filename, O_RDWR|O_CREAT, 0666)
    or die "Can't open $filename for reading/writing/creating: $!";
The sysopen() function takes four arguments. The first is a filehandle, as with open(). The second is the filename, without any mode information. The third is the mode, built by combining constants from the Fcntl module with the bitwise OR operator. The fourth argument (optional) is an octal permissions value (0666 for a data file, 0777 for a program). sysopen() returns true if the file could be opened and false if the open failed.
Unlike open(), sysopen() has no shorthand for the mode. Instead, you combine constants, each with a single meaning, using bitwise OR, which lets you request a combination of several behaviors.
O_RDONLY   Read-only
O_WRONLY   Write-only
O_RDWR     Reading and writing
O_APPEND   Writes go to the end of the file
O_TRUNC    Truncate the file if it existed
O_CREAT    Create the file if it didn't exist
O_EXCL     Error if the file already existed (used with O_CREAT)
Use sysopen() when you need to be careful. For example, if you intend to append to a file but do not want to create it when it does not exist, you can write:
sysopen(LOG, "/var/log/myprog.log", O_WRONLY|O_APPEND, 0666)
    or die "Can't open /var/log/myprog.log for appending: $!";
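The effect of leaving out O_CREAT, and of adding O_EXCL, can be checked with a short experiment. This is a sketch under stated assumptions: the log file lives in a temporary directory made with the core File::Temp module, the name myprog.log is only a placeholder, and lexical filehandles are used instead of the bareword handles in the text.

```perl
use strict;
use warnings;
use Fcntl;                        # exports O_WRONLY, O_APPEND, O_CREAT, O_EXCL, ...
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $log = "$dir/myprog.log";      # hypothetical log file

# Without O_CREAT, opening a missing file for appending fails.
my $opened_missing = sysopen(my $fh1, $log, O_WRONLY|O_APPEND) ? 1 : 0;

# O_CREAT|O_EXCL creates the file, but only if it does not exist yet.
sysopen(my $fh2, $log, O_WRONLY|O_CREAT|O_EXCL, 0666)
    or die "Can't create $log: $!";
print $fh2 "started\n";
close($fh2) or die "Can't close $log: $!";

# A second exclusive create fails because the file now exists.
my $created_twice = sysopen(my $fh3, $log, O_WRONLY|O_CREAT|O_EXCL, 0666) ? 1 : 0;

# Now that the file exists, the append-only open succeeds.
sysopen(my $fh4, $log, O_WRONLY|O_APPEND)
    or die "Can't open $log for appending: $!";
print $fh4 "appended\n";
close($fh4) or die "Can't close $log: $!";

print "missing file opened: $opened_missing, created twice: $created_twice\n";
```

Combining O_CREAT with O_EXCL is a common way to make sure a program never overwrites a file that something else has already created.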
Reading a single record
There is an easy way to read from a filehandle: use the <FH> operator. In scalar context, it returns the next record in the file, or undef if there are no more records. We can use it to read a line into a variable:
$line = <FH>;
die "unexpected end-of-file" unless defined $line;
In a loop, we can write:
while (defined($record = <FH>)) {    # long-winded
    # $record is set to each record in the file, one at a time
}
Because this is done so often, it is usually abbreviated by putting the records in $_ instead of $record:
while (<FH>) {
    # $_ is set to each record in the file, one at a time
}
In Perl 5.004_04 and later, we can write:
while ($record = <FH>) {
    # $record is set to each record in the file, one at a time
}
The defined() test is added automatically. Versions of Perl before 5.004_04 give a warning for this construct. To find out which version of Perl you are using, enter at the command line:
perl -v
Once we have read a record, we usually want to remove the record separator (by default, the newline character):
chomp($record);
Perl 4 had only the chop() function, which removes the last character of a string, whatever that character is. chomp() is less destructive: it removes the record separator only if one is present. To remove the line separator, use chomp() rather than chop().
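The difference between the two functions shows up on a string that has no trailing newline. A small sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;

# chomp removes the trailing record separator only if it is there.
my ($chomp_nl, $chomp_plain) = ("record\n", "record");
chomp($chomp_nl);
chomp($chomp_plain);

# chop removes the last character, whatever it is.
my ($chop_nl, $chop_plain) = ("record\n", "record");
chop($chop_nl);
chop($chop_plain);

print "chomp: '$chomp_nl' '$chomp_plain'\n";   # both are 'record'
print "chop:  '$chop_nl' '$chop_plain'\n";     # 'record' and 'recor'
```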
Reading multiple records
If you call <FH> in list context, it returns all the remaining records in the file. If you are already at the end of the file, it returns an empty list:
@records = <FH>;
if (@records) {
    print "There were ", scalar(@records), " records read.\n";
}
The following does the assignment and the test in a single step:
if (@records = <FH>) {
    print "There were ", scalar(@records), " records read.\n";
}
chomp() can also be applied to an array:
@records = <FH>;
chomp(@records);
chomp() works on any lvalue expression, so this can be written in a single step:
chomp(@records = <FH>);
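Scalar and list context can be tried out without a file on disk. The sketch below assumes an in-memory filehandle (opened on a reference to a string, available since Perl 5.8) and a lexical handle; the sample data is made up.

```perl
use strict;
use warnings;

# Sample data, read through an in-memory filehandle.
my $data = "alpha\nbeta\ngamma\n";
open(my $fh, '<', \$data)
    or die "Couldn't open in-memory file for reading: $!";

my $first = <$fh>;            # scalar context: one record
chomp(my @rest = <$fh>);      # list context: everything that is left
my @none = <$fh>;             # at end-of-file: empty list
close($fh);

print "first record: $first";
print "remaining records: @rest\n";
print "records at end-of-file: ", scalar(@none), "\n";
```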
What is a record?
The default definition of a record is "a line."
The definition of a record is controlled by the $/ variable, which holds the input record separator. Because the newline character is (by definition!) what separates lines, its default value is the string "\n".
For example, you can replace "\n" with any string you like:
$/ = ";";
$record = <FH>;    # read the next semicolon-delimited record
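A semicolon separator can be exercised on a short string. This sketch assumes an in-memory filehandle and confines the change to $/ to a block with local; both are choices made for the example. Note that chomp() removes whatever $/ currently holds, which here is the semicolon.

```perl
use strict;
use warnings;

my $data = "red;green;blue";
open(my $fh, '<', \$data)
    or die "Couldn't open in-memory file for reading: $!";

my @records;
{
    local $/ = ";";                    # records now end at a semicolon
    while (defined(my $record = <$fh>)) {
        chomp($record);                # strips the trailing ";" if present
        push @records, $record;
    }
}
close($fh);

print join(", ", @records), "\n";      # red, green, blue
```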
$/ can take two other interesting values: the empty string ("") and undef.
Reading paragraphs
Setting $/ = "" tells Perl to read paragraphs, where a paragraph is a block of text terminated by two or more newlines. This differs from setting $/ to "\n\n", which treats exactly two newlines as the terminator. The difference matters when there are consecutive blank lines: "text\n\n\n\n", for example, can be interpreted either as one paragraph ("text") or as two paragraphs ("text" followed by two newlines, then an empty paragraph followed by two newlines).
The second interpretation is not useful when reading text. With $/ set to "\n\n", you have to filter out the "empty" paragraphs yourself:
$/ = "\n\n";
while (<FH>) {
    chomp;
    next unless length;    # skip empty paragraphs
    # ...
}
Setting $/ to "" instead reads paragraphs terminated by two or more newlines, so no filtering is needed:
$/ = "";
while (<FH>) {
    chomp;
    # ...
}
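The two settings can be compared on the same input. In this sketch the helper read_records() is hypothetical, the text is read through an in-memory filehandle, and $/ is changed with local so that the setting does not leak out of the helper.

```perl
use strict;
use warnings;

# Hypothetical helper: read all records from $text using the given separator.
sub read_records {
    my ($text, $separator) = @_;
    open(my $fh, '<', \$text)
        or die "Couldn't open in-memory file for reading: $!";
    local $/ = $separator;
    my @records = <$fh>;
    chomp(@records);          # chomp honors the current value of $/
    close($fh);
    return @records;
}

# Two paragraphs separated by several blank lines.
my $text = "one\n\n\n\ntwo\n";

my @fixed = read_records($text, "\n\n");   # includes an empty record
my @para  = read_records($text, "");       # paragraph mode, no empty record

print scalar(@fixed), " records with \"\\n\\n\", ",
      scalar(@para), " records in paragraph mode\n";
```

With "\n\n" the run of blank lines produces an empty record in the middle; in paragraph mode the same input yields only the two paragraphs.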
Reading the entire file
The other interesting value for $/ is undef. With this setting, a read returns the rest of the file as a single string.
Because changing $/ affects every subsequent read, not just the next one, you normally want to confine the change to a local scope. The following example reads the contents of a filehandle into a string:
{
    local $/ = undef;
    $file = <FH>;
}
Remember: Perl variables can hold very long strings. Your file cannot be larger than your virtual memory limit, but within that limit you can still read in a great deal of data.
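The localized change can be wrapped in a small helper so that the rest of the program keeps the default separator. The function name slurp_rest() is hypothetical, and the data is read from an in-memory filehandle only to keep the sketch self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything left on the filehandle as one string.
sub slurp_rest {
    my ($fh) = @_;
    local $/;                 # undef inside this sub only
    return scalar <$fh>;
}

my $data = "line one\nline two\nline three\n";
open(my $fh, '<', \$data)
    or die "Couldn't open in-memory file for reading: $!";

my $first = <$fh>;            # normal line-at-a-time read
my $rest  = slurp_rest($fh);  # the remaining two lines in one string
close($fh);

# Outside the helper, $/ still holds its default value.
my $separator_is_default = ($/ eq "\n") ? 1 : 0;

print "first: $first";
print "rest has ", length($rest), " characters\n";
```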
Manipulating a file with regular expressions
Once you have a variable that contains the whole file as one string, you can apply a regular expression to the entire file rather than to one chunk of it. Two regular expression options are useful here: /s and /m. By default, Perl's regular expressions assume they are dealing with a single line, so you can write:
undef $/;
$line = <FH>;
if ($line =~ /(b.*grass)$/) {
    print "found $1\n";
}
If the file contains:
browngrass
bluegrass
The output is:
found bluegrass
It does not find "browngrass" because $ looks for a match only at the end of the string (or just before a newline that ends the string). To make "^" and "$" match at line boundaries within a string that contains many lines, use the /m ("multiline") option:
if ($line =~ /(b.*grass)$/m) { print "found $1\n" }
Now the program will output the following:
found browngrass
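The effect of /m can be confirmed on a string literal instead of a file. This is a small sketch; the third match, which adds /g to collect every line, goes slightly beyond what the text describes.

```perl
use strict;
use warnings;

my $text = "browngrass\nbluegrass\n";

# Without /m, $ matches only at the very end (before the final newline).
my ($plain) = $text =~ /(b.*grass)$/;

# With /m, $ also matches before each embedded newline.
my ($multi) = $text =~ /(b.*grass)$/m;

# With /m and /g in list context, every matching line is returned.
my @all = $text =~ /^(b.*grass)$/mg;

print "without /m: $plain\n";     # bluegrass
print "with /m:    $multi\n";     # browngrass
print "all lines:  @all\n";       # browngrass bluegrass
```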
Similarly, a period matches any character except a newline:
while (<FH>) {
    if (/19(.*)$/) {
        if ($1 < 20) {
            $year = 2000 + $1;
        } else {
            $year = 1900 + $1;
        }
    }
}
If we read "1981\n" from the file, $_ will contain "1981\n". The period in the regular expression matches "8" and "1" but does not match "\n". This is what we want, because the newline is not part of the date.
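The same match can be checked on a single string. A minimal sketch, using a string literal in place of a line read from a file:

```perl
use strict;
use warnings;

my $line = "1981\n";

# The period stops before the newline, so only the two digits are captured.
my ($digits) = $line =~ /19(.*)$/;

# Same rule as in the loop above: small values are treated as 20xx.
my $year = $digits < 20 ? 2000 + $digits : 1900 + $digits;

print "captured '$digits', year $year\n";   # captured '81', year 1981
```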
For a string that contains many lines, we might want to extract a chunk that spans line boundaries. In this case, we can use the /s option, which makes the period match any character, including a newline.
if (m{<b>(.*?)</b>}s) {
    print "Found bold text: $1\n";
}
Here, I use {} instead of slashes to delimit the regular expression, so I must tell Perl that I am matching by putting an "m" at the start; the "s" after the closing brace is the option. You can combine the /s and /m options:
if (m{^<font color="red">(.*?)</font>}sm) {
    # ...
}
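Both options can be tried on short strings. The HTML fragments below are made up for illustration; the sketch shows that a pattern spanning a newline fails without /s, and that "^" matches after an embedded newline only with /m.

```perl
use strict;
use warnings;

# Illustrative fragment with a newline inside the tag.
my $html = "intro <b>bold\ntext</b> outro\n";

my ($without_s) = $html =~ m{<b>(.*?)</b>};    # no match: "." stops at "\n"
my ($with_s)    = $html =~ m{<b>(.*?)</b>}s;   # "." now matches "\n" too

# Illustrative fragment where the tag starts on the second line.
my $page = "header\n<font color=\"red\">warning\nhere</font>\n";

my ($only_s) = $page =~ m{^<font color="red">(.*?)</font>}s;    # "^" is start of string only
my ($both)   = $page =~ m{^<font color="red">(.*?)</font>}sm;   # "^" matches at each line start

print "without /s: ", defined $without_s ? "match" : "no match", "\n";
print "with /s: ",    defined $with_s    ? "match" : "no match", "\n";
print "with /s only: ", defined $only_s  ? "match" : "no match", "\n";
print "with /sm: ",     defined $both    ? "match" : "no match", "\n";
```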
Summary
There are two ways to open a file: open() is quick and simple, and sysopen() is more powerful but more complex. The <FH> operator reads records, and the $/ variable controls what a record is. If you read many lines into a single string, don't forget the /s and /m regular expression options.