Regular Expressions, with Python as the Example

Source: Internet
Author: User

When reprinting, please cite the original address and author.

The purpose of a regular expression is to process string content quickly. It is mainly used to find substrings that match a given pattern, usually in combination with other operations such as replacing or splitting.
When using regular expressions, you need to understand how your language's engine behaves. In Python, quantifiers are greedy by default: unless told otherwise, they match as much of the string as possible.
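To make the greedy default concrete, here is a minimal sketch. The sample string and variable names are made up for illustration; only the behaviour of `*` versus `*?` is the point.

```python
import re

# Hypothetical sample text, chosen only to show the difference.
text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs on to the LAST '>' in the string.
greedy = re.findall(r"<.*>", text)

# Non-greedy: .*? stops at the first '>' that completes a match.
lazy = re.findall(r"<.*?>", text)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```

Appending `?` to a quantifier (`*?`, `+?`, `??`, `{n,m}?`) is how a single quantifier is switched to non-greedy matching.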

 

0x00 basic syntax

1 .: match any character except a line break.
2 \s: match any whitespace character.
3 \d: match a digit.
4 \D: match any non-digit; for ASCII text this is equivalent to [^0-9].
5 \b: match a word boundary.
6 \B: match a position that is not a word boundary.
7 ^: match the start of the string.
8 $: match the end of the string.
9 \w: match any word character, including the underscore. Similar but not equivalent to [A-Za-z0-9_]: with Unicode matching it can also match Chinese characters, depending on the environment.
10 \W: match any non-word character; for ASCII text this is equivalent to [^A-Za-z0-9_].
11 \: escape. The character that follows is not interpreted as a special character.
12 *: match the preceding character 0 or more times; equivalent to {0,}.
13 +: match the preceding character 1 or more times; equivalent to {1,}.
14 ?: match the preceding character 0 or 1 time; equivalent to {0,1}.
15 (x): match x and record (capture) the matched value.
16 x|y: match either expression x or expression y.
17 {n}: match the preceding character exactly n times.
18 {n,}: match the preceding character at least n times.
19 {n,m}: match the preceding character between n and m times.
20 [xyz]: character set; matches any single character listed in the brackets. A hyphen '-' indicates a character range, for example [a-z].
21 [^xyz]: negated character set; matches any single character not listed. It can also contain character ranges.
22 \f: match a form feed.
23 \n: match a line feed.
24 \r: match a carriage return.
25 \s: match a single whitespace character, including space, tab, form feed and line feed; for ASCII text this is equivalent to [ \f\n\r\t\v].
26 \S: match a single non-whitespace character; for ASCII text this is equivalent to [^ \f\n\r\t\v].
27 \t: match a tab.
28 \v: match a vertical tab.
29 \: mark the next character as a special character, a literal character, a back reference, or an octal escape.
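A few of the tokens above can be tried out directly. This is only a small sketch on invented sample strings, not a full tour of the syntax.

```python
import re

sample = "cat concat 42 a_b"  # invented sample text

digits = re.findall(r"\d+", sample)           # runs of digits
words = re.findall(r"\w+", sample)            # runs of word characters (underscore included)
whole_word = re.search(r"\bcat\b", sample)    # 'cat' standing alone as a word
inside_word = re.search(r"\Bcat\b", sample)   # 'cat' at the end of a longer word

optional_u = re.findall(r"colou?r", "color colour")  # ? makes the 'u' optional
either = re.findall(r"cat|dog", "cat dog bird")      # alternation with |
in_set = re.findall(r"[a-c]", "abcdef")              # character range
not_in_set = re.findall(r"[^a-c]", "abcdef")         # negated set

print(digits)               # ['42']
print(words)                # ['cat', 'concat', '42', 'a_b']
print(whole_word.start())   # 0
print(inside_word.start())  # 7
print(optional_u)           # ['color', 'colour']
print(either)               # ['cat', 'dog']
print(in_set, not_in_set)   # ['a', 'b', 'c'] ['d', 'e', 'f']
```

Writing patterns as raw strings (the `r"..."` prefix) keeps Python from interpreting backslashes such as `\b` before the regular expression engine sees them.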

 

0x01 Python regular expression module
import re
Imports Python's regular expression module.

re.compile(str)
Compiles a regular expression given as a string into a regular expression object. The functions of the re module are then available directly as methods of that object.
Returns a pattern object.
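As a short sketch of how a compiled pattern is used (the pattern and input strings are invented for illustration):

```python
import re

# Compile once, then reuse the object's methods.
word_pattern = re.compile(r"[a-z]+")

found = word_pattern.findall("abc 123 def")
first = word_pattern.search("123 def")

print(found)          # ['abc', 'def']
print(first.group())  # def
```

Compiling is mostly worthwhile when the same expression is applied many times, since the pattern object can simply be passed around and reused.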

re.findall(pattern, str)
Finds every non-overlapping match of the regular expression in the string. The first parameter is the regular expression, which can be either a compiled pattern or a string; the second parameter is the string to process.
Returns a list.
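A minimal sketch of re.findall on an invented input. Note that when the pattern contains capturing groups, the list holds the group contents rather than the whole match.

```python
import re

text = "a1 b22 c333"  # invented sample text

# No groups: each list item is the full match.
numbers = re.findall(r"\d+", text)

# Two groups: each list item is a tuple of the captured groups.
pairs = re.findall(r"([a-z])(\d+)", text)

print(numbers)  # ['1', '22', '333']
print(pairs)    # [('a', '1'), ('b', '22'), ('c', '333')]
```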

re.sub(pattern, replace, string, count)
Replaces the content matched by the regular expression in the string with the specified content.
The second parameter is the replacement string.
The fourth parameter is the maximum number of replacements. The default value is 0, which means every match is replaced.
Returns a str.
re.sub also accepts a function as the second parameter, for more complex processing of each match. For example, re.sub(r'\s', lambda m: '[' + m.group(0) + ']', text, count=0) wraps every whitespace character in the string in square brackets, so a space ' ' becomes '[ ]'.
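The three ways of calling re.sub described above can be sketched as follows; the input string is invented for illustration.

```python
import re

text = "a b  c"  # one space, then two spaces

# Replace every whitespace character.
all_replaced = re.sub(r"\s", "_", text)

# Replace only the first match by limiting count.
first_only = re.sub(r"\s", "_", text, count=1)

# Use a function: it receives the match object and returns the replacement.
bracketed = re.sub(r"\s", lambda m: "[" + m.group(0) + "]", text)

print(all_replaced)  # a_b__c
print(first_only)    # a_b  c
print(bracketed)     # a[ ]b[ ][ ]c
```

Passing `count` by keyword, as here, keeps the call unambiguous, because re.sub also takes a flags parameter after it.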

re.split(pattern, str)
Splits a string, using the content matched by the regular expression as the separator.
Returns a list.
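A small sketch of re.split with a separator that a plain str.split could not express in one step; the input is invented.

```python
import re

text = "a, b;c,  d"  # mixed separators with uneven spacing

# Split on a comma or semicolon followed by any amount of whitespace.
parts = re.split(r"[,;]\s*", text)

# maxsplit limits how many splits are made.
head_and_rest = re.split(r"[,;]\s*", text, maxsplit=1)

print(parts)          # ['a', 'b', 'c', 'd']
print(head_and_rest)  # ['a', 'b;c,  d']
```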

re.match(pattern, string, flags)
Tries to match from the start of the string. If the start of the string does not match the regular expression, the match fails.
Returns a match object on success and None on failure.
(The matched content can be read with the group() method.)
The third parameter is the flags value, which controls the matching mode of the regular expression, such as case-insensitive or multi-line matching.
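A minimal sketch of re.match, including the flags parameter; the strings are invented for illustration.

```python
import re

text = "Hello world"

# Case-insensitive match at the start of the string.
m = re.match(r"hello", text, re.IGNORECASE)

# 'world' is in the string, but not at the start, so match() fails.
no_match = re.match(r"world", text)

print(m.group())  # Hello
print(no_match)   # None
```

Because a failed match returns None, code normally checks the result (for example `if m:`) before calling group().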

re.search(pattern, string, flags)
Scans the whole string and returns the first match found.
The parameters are the same as for re.match.
Returns a match object on success and None if nothing in the string matches.
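The difference from re.match, and the effect of the multi-line flag, can be sketched like this (sample strings are invented):

```python
import re

text = "Hello world"

# search() scans the whole string, so it finds 'world' where match() would not.
found = re.search(r"world", text)

# Without re.MULTILINE, ^ only matches at the very start of the string.
single_line = re.search(r"^two", "one\ntwo")

# With re.MULTILINE, ^ also matches right after each line break.
multi_line = re.search(r"^two", "one\ntwo", re.MULTILINE)

print(found.span())        # (6, 11)
print(single_line)         # None
print(multi_line.group())  # two
```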

 

In all of the functions above, the pattern argument can be either a compiled pattern object or a plain str, with the same effect.

 

0x02 examples


IPv4 address

\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
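A sketch of the IPv4 pattern in use. Note that the pattern only checks the dotted shape, so it also accepts values such as 999.1.1.1; the range check below is an extra step added here for illustration, not part of the original pattern. The log line is invented.

```python
import re

ipv4 = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

log = "client 192.168.0.1 connected, bad value 999.1.1.1"  # invented sample

# Everything that looks like four dotted numbers.
candidates = ipv4.findall(log)

# Extra check (an addition for this sketch): every part must be 0-255.
valid = [ip for ip in candidates
         if all(int(part) <= 255 for part in ip.split("."))]

print(candidates)  # ['192.168.0.1', '999.1.1.1']
print(valid)       # ['192.168.0.1']
```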

URL (also a minimal example of non-greedy matching)

http://.*?/
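To show why the non-greedy form matters here, the sketch below runs the pattern and its greedy variant on an invented string. The host names are placeholders.

```python
import re

text = "see http://example.com/a/b and http://test.org/x/"  # placeholder URLs

# Non-greedy: each match stops at the first '/' after the host.
hosts = re.findall(r"http://.*?/", text)

# Greedy variant for comparison: one match running to the last '/'.
greedy = re.findall(r"http://.*/", text)

print(hosts)   # ['http://example.com/', 'http://test.org/']
print(greedy)  # ['http://example.com/a/b and http://test.org/x/']
```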

Match a file suffix

.*\.php
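A sketch of the suffix pattern on a few invented file names. One thing to be aware of: re.match only anchors the start, so a name that merely contains ".php" also passes. Using re.fullmatch is one possible way to require that the name ends with the suffix; this is a suggestion added here, not something the pattern does by itself.

```python
import re

files = ["index.php", "style.css", "admin.php.bak"]  # invented file names

# match() anchors only at the start, so 'admin.php.bak' is accepted too.
loose = [f for f in files if re.match(r".*\.php", f)]

# fullmatch() requires the whole name to fit the pattern.
strict = [f for f in files if re.fullmatch(r".*\.php", f)]

print(loose)   # ['index.php', 'admin.php.bak']
print(strict)  # ['index.php']
```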

Match a whitespace character at the start of a line

^\s
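A brief sketch of this pattern on an invented multi-line string. By default ^ only matches at the start of the whole string; the re.MULTILINE flag is needed for it to apply to every line.

```python
import re

text = "  indented\nplain\n\tTabbed"  # invented sample with three lines

# Default: ^ matches only at the start of the string.
start_only = re.findall(r"^\s", text)

# MULTILINE: ^ matches at the start of each line.
each_line = re.findall(r"^\s", text, re.MULTILINE)

print(start_only)  # [' ']
print(each_line)   # [' ', '\t']
```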

Match a word regardless of the case of its first letter

[Tt]ools
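The character-set approach only covers the first letter. For comparison, the sketch below also shows the re.IGNORECASE flag, which ignores case for the whole word; the sample string is invented.

```python
import re

text = "Tools, tools and TOOLS"  # invented sample

# Character set: only the first letter may be upper or lower case.
first_letter = re.findall(r"[Tt]ools", text)

# IGNORECASE flag: every letter is matched without regard to case.
any_case = re.findall(r"tools", text, re.IGNORECASE)

print(first_letter)  # ['Tools', 'tools']
print(any_case)      # ['Tools', 'tools', 'TOOLS']
```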

 

 

If you find any errors, please leave a comment so they can be corrected. More content will be added in the future.
