Regular Expressions in Perl

Source: Internet
Author: User
Tags: perl, interpreter

Three forms of regular expressions
Common patterns in regular expressions
Eight principles of regular expressions

 

Regular expressions are a distinctive feature of the Perl language and also one of the harder parts of Perl programming. If you master them, though, you can easily use them for string-processing tasks, and CGI programming becomes more comfortable as well. Below are some basic syntax rules for writing regular expressions.

1) Three forms of regular expressions
First, note that regular expressions take three forms in a Perl program:
Match: m/<regexp>/ (can be abbreviated to /<regexp>/, omitting the m)
Substitute (replace): s/<pattern>/<replacement>/
Transliterate (convert): tr/<pattern>/<replacement>/
All three forms are normally used together with =~ or !~ ("=~" means "matches" and is read as "does" in the statement; "!~" means "does not match" and is read as "doesn't"), with the scalar variable to be processed on the left. If no variable and no =~ or !~ operator is given, the contents of the $_ variable are processed by default. Example:
$str = "I love Perl";
$str =~ m/Perl/;        # returns true (1) if the string "Perl" is found in $str, otherwise false (the empty string)
$str =~ s/Perl/Bash/;   # replaces "Perl" in $str with "Bash"; returns 1 if the substitution happened, otherwise false (the empty string)
$str !~ tr/A-Z/a-z/;    # converts every uppercase letter in $str to lowercase; because of !~, the result is false if any character was converted and true (1) otherwise
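The return values described above can be checked with a short script. This is only a sketch: the variable names ($found, $missing, $count, $changed) are made up for illustration, and the second pattern (Python) is simply a string that does not occur in the sample.

```perl
use strict;
use warnings;

my $str = "I love Perl";

# m// in scalar context: true (1) on a match, false (the empty string) otherwise.
my $found   = ($str =~ m/Perl/)   ? "yes" : "no";
my $missing = ($str =~ m/Python/) ? "yes" : "no";

# s/// returns the number of substitutions it made (false if none).
my $count = ($str =~ s/Perl/Bash/);

# tr/// returns the number of characters it transliterated.
my $changed = ($str =~ tr/A-Z/a-z/);

print "found=$found missing=$missing count=$count changed=$changed str=$str\n";
```

Because s/// and tr/// return counts rather than a plain yes/no, they can be used directly in a condition or stored for later reporting.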
There are also forms that rely on $_:
foreach (@array) { s/a/b/; }             # each iteration places one element of @array in $_, and the substitution is applied to $_
while (<FILE>) { print if (m/error/); }  # slightly more complex: prints every line of the file that contains the string "error"
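The default use of $_ can be tried without a real file. In this sketch the arrays @words and @log are invented sample data, and @log stands in for the lines that while (<FILE>) would read from a file handle.

```perl
use strict;
use warnings;

my @words = ("banana", "cat", "dog");

# foreach aliases $_ to each element, so s/// edits the array in place.
foreach (@words) {
    s/a/b/;    # no =~ given, so $_ is the target; only the first "a" changes
}

# Stand-in for lines read from a file handle.
my @log = ("ok: started\n", "error: disk full\n", "ok: done\n", "fatal error\n");
my @hits;
foreach (@log) {
    push @hits, $_ if m/error/;    # m// also tests $_ by default
}

print "@words\n";    # bbnana cbt dog
print @hits;         # the two lines that contain "error"
```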
If parentheses () appear in a Perl regular expression, the text matched by the pattern inside each pair is automatically assigned by the Perl interpreter to the variables $1, $2, ... after a successful match or substitution. See the following examples:
$string = "I love Perl";
$string =~ s/(love)/<$1>/;                # $1 = "love" here, and the substitution changes $string to "I <love> Perl"
$string = "I love Perl";
$string =~ s/(I)(.*)(Perl)/<$3>$2<$1>/;   # here $1 = "I", $2 = " love ", $3 = "Perl", and $string becomes "<Perl> love <I>"
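Capture variables are also useful outside of substitutions. The following sketch uses a made-up date string and hypothetical variable names to show two common ways of reading the captured text.

```perl
use strict;
use warnings;

my $date = "2024-03-15";    # sample input, assumed for illustration
my ($year, $month, $day);

# Test the match before using $1, $2, ...: after a failed match they
# still hold the values from the last successful one.
if ($date =~ m/(\d+)-(\d+)-(\d+)/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# In list context, m// returns the captured substrings directly.
my ($first, $second) = ("I love Perl" =~ m/(\w+) (\w+)/);

print "$day/$month/$year $first $second\n";
```

The list-context form avoids touching $1 and $2 at all, which can be easier to read when there are several groups.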
The substitution s/<pattern>/<replacement>/ can take the modifier e or g (or both) at the end. Their meanings are as follows:
s/<pattern>/<replacement>/g replaces every match of <pattern> in the string with <replacement>, instead of replacing only the first occurrence.
s/<pattern>/<replacement>/e treats the <replacement> part as a Perl expression to evaluate, and uses its result as the replacement text. This modifier is rarely used.
For example:
$string = "i:love:perl";
$string =~ s/:/*/;            # $string = "i*love:perl"
$string = "i:love:perl";
$string =~ s/:/*/g;           # $string = "i*love*perl"
$string =~ tr/*/ /;           # $string = "i love perl"
$string = "www22cgi44";
$string =~ s/(\d+)/$1*2/ge;   # (\d+) matches one or more digits in $string; each run of digits is multiplied by 2 (g is needed so that both runs are processed), so $string ends up as "www44cgi88"
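The effect of each modifier is easiest to see side by side. The sketch below applies the same substitution to the same sample string three ways; the variable names are invented for illustration.

```perl
use strict;
use warnings;

my $first_only = "www22cgi44";
$first_only =~ s/(\d+)/$1*2/e;     # e alone: only the first run of digits is doubled

my $every = "www22cgi44";
$every =~ s/(\d+)/$1*2/ge;         # g and e: every run of digits is doubled

my $no_eval = "www22cgi44";
$no_eval =~ s/(\d+)/$1*2/g;        # g alone: "$1*2" is inserted as plain text

print "$first_only $every $no_eval\n";
# prints: www44cgi44 www44cgi88 www22*2cgi44*2
```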
The following is a complete example:
#!/usr/bin/perl
print "Please enter a string!\n";
$string = <STDIN>;    # <STDIN> is standard input; this reads one line typed by the user
chomp($string);       # remove the trailing newline \n from $string
if ($string =~ /Perl/) {
    print("The input string contains the string Perl!\n");
}
If the input contains the string Perl, the message above is printed.

 

2) Common patterns in regular expressions
Below are some common patterns in regular expressions.
Pattern       Meaning
.             Matches any single character except a newline
x?            Matches x zero or one time
x*            Matches x zero or more times, as many times as possible
x+            Matches x one or more times, as many times as possible
.*            Matches any character zero or more times
.+            Matches any character one or more times
x{m}          Matches x exactly m times
x{m,n}        Matches x at least m and at most n times
x{m,}         Matches x m or more times
[]            Matches any one of the characters inside []
[^]           Matches any one character not listed inside []
[0-9]         Matches any digit
[a-z]         Matches any lowercase letter
[^0-9]        Matches any non-digit character
[^a-z]        Matches any character that is not a lowercase letter
^             Matches at the beginning of the string
$             Matches at the end of the string
\d            Matches one digit, the same as [0-9]
\d+           Matches one or more digits, the same as [0-9]+
\D            Matches one non-digit, the same as [^0-9]
\D+           Matches one or more non-digits, the same as [^0-9]+
\w            Matches one letter, digit, or underscore, the same as [a-zA-Z0-9_]
\w+           The same as [a-zA-Z0-9_]+
\W            Matches one character that is not a letter, digit, or underscore, the same as [^a-zA-Z0-9_]
\W+           The same as [^a-zA-Z0-9_]+
\s            Matches one whitespace character, the same as [ \n\t\r\f]
\s+           The same as [ \n\t\r\f]+
\S            Matches one non-whitespace character, the same as [^ \n\t\r\f]
\S+           The same as [^ \n\t\r\f]+
\b            Matches at a word boundary (between a \w character and a \W character or the string edge)
\B            Matches anywhere that is not a word boundary
a|b|c         Matches a, b, or c
abc           Matches the string abc
(pattern)     Remembers the matched text, which is very useful. The text matched by the first () is stored in $1 (or \1 inside the pattern), the text matched by the second () in $2 (or \2), and so on.
/pattern/i    The i modifier ignores case, so uppercase and lowercase letters are treated as equal when matching.
\             To match a character that has a special meaning in a pattern, such as "*", put \ in front of it to cancel the special meaning.
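A few of these entries are easy to get wrong, so here is a small sketch that exercises them. The sample strings and variable names are invented for illustration, and the .+? form (a quantifier followed by ?, which makes it match as few characters as possible) is a standard Perl feature that the table above does not list.

```perl
use strict;
use warnings;

my $text = "<b>bold</b> and <i>italic</i>";

my ($greedy) = $text =~ m/<(.+)>/;     # .+ takes as much as it can
my ($lazy)   = $text =~ m/<(.+?)>/;    # .+? stops at the first possible >

# \w covers letters, digits, and the underscore.
my ($ident) = "my_var2 = 10" =~ m/(\w+)/;

# \b anchors at a word boundary, so "cat" inside "concatenate" is skipped.
my $whole_word = ("concatenate the cat" =~ m/\bcat\b/) ? 1 : 0;
my $no_word    = ("concatenate" =~ m/\bcat\b/)         ? 1 : 0;

print "greedy=[$greedy]\nlazy=[$lazy]\nident=[$ident]\n";
print "whole_word=$whole_word no_word=$no_word\n";
```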
The following are some examples:
Example       Meaning
/Perl/        Finds strings that contain Perl
/^Perl/       Finds strings that start with Perl
/Perl$/       Finds strings that end with Perl
/c|g|i/       Finds strings that contain c, g, or i
/cg{2,4}i/    Finds c followed by 2 to 4 g's, followed by i
/cg{2,}i/     Finds c followed by 2 or more g's, followed by i
/cg{2}i/      Finds c followed by exactly 2 g's, followed by i
/cg*i/        Finds c followed by 0 or more g's, followed by i; the same as /cg{0,}i/
/cg+i/        Finds c followed by 1 or more g's, followed by i; the same as /cg{1,}i/
/cg?i/        Finds c followed by 0 or 1 g, followed by i; the same as /cg{0,1}i/
/c.i/         Finds c followed by any one character, followed by i
/c..i/        Finds c followed by any two characters, followed by i
/[cgi]/       Finds any one of the three characters c, g, i
/[^cgi]/      Finds any one character other than c, g, i
/\d/          Finds a digit; /\d+/ finds a string of one or more digits
/\D/          Finds a non-digit character; /\D+/ finds a string of one or more non-digit characters
/\*/          Finds the character *. Because * has a special meaning in a regular expression, it must be preceded by \ to cancel that meaning.
/abc/i        Finds strings that match abc, regardless of uppercase or lowercase
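One way to check examples like these is to pair each pattern with one string that should match and one that should not. The sketch below does that for several of the patterns; the test strings and the @cases and $failures names are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Each entry: pattern, a string that should match, a string that should not.
my @cases = (
    [ qr/^Perl/,    "Perl rocks",  "I love Perl" ],
    [ qr/Perl$/,    "I love Perl", "Perl rocks"  ],
    [ qr/cg{2,4}i/, "cgggi",       "cgi"         ],
    [ qr/cg*i/,     "ci",          "cxi"         ],
    [ qr/cg+i/,     "cgi",         "ci"          ],
    [ qr/cg?i/,     "ci",          "cggi"        ],
    [ qr/c..i/,     "cabi",        "cai"         ],
    [ qr/\*/,       "2*3",         "2+3"         ],
    [ qr/abc/i,     "xABCx",       "ab c"        ],
);

my $failures = 0;
for my $case (@cases) {
    my ($re, $yes, $no) = @$case;
    $failures++ unless $yes =~ $re;    # expected match did not match
    $failures++ if     $no  =~ $re;    # unexpected match
}
print "failures: $failures\n";
```

Storing the patterns with qr// keeps each one compiled and lets the loop treat them as ordinary data.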

 

3) Eight principles of regular expressions
If you have used the sed, awk, and grep commands on UNIX, regular expressions in Perl will not be unfamiliar. Because Perl has this feature built in, it is very capable at processing strings. Regular expressions appear frequently in Perl programs, and CGI programming is no exception.
Regular expressions are a difficult point for Perl beginners, but once you have mastered the syntax you have almost unlimited pattern-matching power, and much of the work of learning Perl programming is mastering regular expressions. The following describes eight principles for using them.
Regular expressions can be a powerful ally in the battle against data, and it often is a battle. Keep the following eight principles in mind:
· Principle 1: Regular expressions have three different forms: matching (m//), substitution (s///, optionally with modifiers such as e and g), and transliteration (tr///).
· Principle 2: Regular expressions match only scalar values ($scalar =~ m/a/; works; @array =~ m/a/ treats @array as a scalar, so it will probably not do what you intend).
· Principle 3: A regular expression matches the earliest possible match of a given pattern. By default it matches or replaces only once ($a = 'string string2'; $a =~ s/string//; leaves $a = ' string2').
· Principle 4: Regular expressions can handle any and all characters that a double-quoted string can handle, including variable interpolation ($a =~ m/$varb/ expands $varb before matching. If $varb = 'a' and $a = 'as', then $a =~ s/$varb//; is equivalent to $a =~ s/a//; and the result is $a = "s").
· Principle 5: Evaluating a regular expression produces two things: a result status and backreferences. $a =~ m/pattern/ indicates whether the substring pattern appears in $a; $a =~ s/(word1)(word2)/$2$1/ swaps the two words.
· Principle 6: The core power of regular expressions lies in wildcards and multiple-match operators (quantifiers) and in how they work. $a =~ m/\w+/ matches one or more word characters; $a =~ m/\d*/ matches zero or more digits.
· Principle 7: To match more than one alternative, Perl uses "|" for added flexibility. m/(cat|dog)/ means "match the string cat or the string dog".
· Principle 8: Perl's (?...) syntax provides extended features for regular expressions. (Please read the related material afterward.)
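Several of these principles can be tried in a few lines. The following is only a sketch with invented sample data: grep is one common way to apply a pattern to each element of an array (Principle 2), \Q...\E is the standard way to make an interpolated variable match literally (Principle 4), and (?:...) is one of the (?...) extensions, a group that does not capture (Principle 8).

```perl
use strict;
use warnings;

my @lines = ("cat food", "dog bowl", "bird seed");

# Principles 2 and 7: test each element with grep instead of binding
# the array itself; the pattern uses alternation.
my @pets = grep { m/(cat|dog)/ } @lines;

# Principle 4: variables interpolate into the pattern; \Q...\E quotes
# metacharacters so the variable's text is matched literally.
my $needle   = "1+1";
my $literal  = ("1+1=2" =~ m/\Q$needle\E/) ? 1 : 0;
my $as_regex = ("1+1=2" =~ m/$needle/)     ? 1 : 0;   # "+" acts as a quantifier here

# Principle 8: (?:...) groups without capturing, so the only capture
# returned is the word after the adjective.
my ($animal) = "big dog" =~ m/(?:big|small) (\w+)/;

print "@pets | literal=$literal as_regex=$as_regex | $animal\n";
```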
Want to learn all of these principles? I suggest starting with simple cases and experimenting continually. In fact, once you know that $a =~ m/error/ looks for the substring error in $a, you already have more processing power than you would have in a lower-level language such as C.
