Basic Knowledge: Regular Expressions in Golang

Last Update: 2018-01-06. Source: Internet
Author: User
------------------------------------------------------------
Regular Expressions in Golang
------------------------------------------------------------
Usage:
------------------------------
Single characters:
        .                              Matches any single character; with flag s = true it also matches a newline
        [character class]              Matches one character in the "character class" (see the description below)
        [^character class]             Matches one character outside the "character class" (see the description below)
        \lowercase Perl tag            Matches one character in the "Perl class" (see the description below)
        \uppercase Perl tag            Matches one character outside the "Perl class" (see the description below)
        [:ASCII class name:]           Matches one character in the "ASCII class"; only valid inside [], e.g. [[:alpha:]] (see the description below)
        [:^ASCII class name:]          Matches one character outside the "ASCII class"; only valid inside [] (see the description below)
        \pUnicode general class name   Matches one character in the "Unicode class" (general classes with a one-letter name only, e.g. \pL)
        \PUnicode general class name   Matches one character outside the "Unicode class" (general classes with a one-letter name only)
        \p{Unicode class name}         Matches one character in the "Unicode class" (see the description below)
        \P{Unicode class name}         Matches one character outside the "Unicode class" (see the description below)
------------------------------
Composites:
        xy             Matches xy (x followed by y)
        x|y            Matches x or y (x is preferred)
------------------------------
Repetition:
        x*             Matches zero or more x, prefers more (greedy)
        x+             Matches one or more x, prefers more (greedy)
        x?             Matches zero or one x, prefers one (greedy)
        x{n,m}         Matches n to m x, prefers more (greedy)
        x{n,}          Matches n or more x, prefers more (greedy)
        x{n}           Matches exactly n x
        x*?            Matches zero or more x, prefers fewer (non-greedy)
        x+?            Matches one or more x, prefers fewer (non-greedy)
        x??            Matches zero or one x, prefers zero (non-greedy)
        x{n,m}?        Matches n to m x, prefers fewer (non-greedy)
        x{n,}?         Matches n or more x, prefers fewer (non-greedy)
        x{n}?          Matches exactly n x
------------------------------
Grouping:
        (subexpression)             Capturing group; the group is numbered (submatch)
        (?P<name>subexpression)     Capturing group; the group is numbered and named (submatch)
        (?:subexpression)           Non-capturing group
        (?flags)                    Sets flags within the current group; non-capturing; the flags affect the rest of the expression after this point in the group
        (?flags:subexpression)      Sets flags for the subexpression; non-capturing; the flags affect only the subexpression inside this group

        Flag syntax:
        xyz   (set the flags x, y and z)
        -xyz  (clear the flags x, y and z)
        xy-z  (set the flags x and y, clear the flag z)

        Available flags:
        i              Case-insensitive (default false)
        m              Multi-line mode: ^ and $ also match at the beginning and end of each line, not only at the beginning and end of the whole text (default false)
        s              Let . match \n (default false)
        U              Non-greedy mode: swaps the meaning of x* and x*?, x+ and x+?, and so on (default false)
------------------------------
Position markers:
        ^              If flag m = true, matches at the beginning of a line; otherwise matches only at the beginning of the whole text (m defaults to false)
        $              If flag m = true, matches at the end of a line; otherwise matches only at the end of the whole text (m defaults to false)
        \A             Matches at the beginning of the whole text, ignoring the m flag
        \b             Matches at a word boundary
        \B             Matches where there is no word boundary
        \z             Matches at the end of the whole text, ignoring the m flag
------------------------------
Escape sequences:
        \a             Matches the bell character (same as \x07)
                       Note: \b cannot be used to match a backspace in a regular expression, because \b matches a word boundary.
                       Use \x08 to match a backspace.
        \f             Matches a form feed (same as \x0C)
        \t             Matches a horizontal tab (same as \x09)
        \n             Matches a newline (same as \x0A)
        \r             Matches a carriage return (same as \x0D)
        \v             Matches a vertical tab (same as \x0B)
        \123           Matches the character with the given octal code (up to three digits)
        \x7F           Matches the character with the given hexadecimal code (exactly two digits)
        \x{10FFFF}     Matches the character with the given hexadecimal code (maximum 10FFFF)
        \Q...\E        Matches the literal text between \Q and \E, ignoring any regular-expression syntax in it

        \\             Matches the character \
        \^             Matches the character ^
        \$             Matches the character $
        \.             Matches the character .
        \*             Matches the character *
        \+             Matches the character +
        \?             Matches the character ?
        \{             Matches the character {
        \}             Matches the character }
        \(             Matches the character (
        \)             Matches the character )
        \[             Matches the character [
        \]             Matches the character ]
        \|             Matches the character |
------------------------------
Named character classes can be used as elements of a "character class":
        [\d]           Matches a digit (same as \d)
        [^\d]          Matches a non-digit (same as \D)
        [\D]           Matches a non-digit (same as \D)
        [^\D]          Matches a digit (same as \d)
        [[:name:]]     The named "ASCII class" is included in the "character class" (same as [:name:])
        [^[:name:]]    The named "ASCII class" is excluded from the "character class" (same as [:^name:])
        [\p{Name}]     The named "Unicode class" is included in the "character class" (same as \p{Name})
        [^\p{Name}]    The named "Unicode class" is excluded from the "character class" (same as \P{Name})
------------------------------------------------------------
Description:
------------------------------
A "character class" can contain the following ("Perl classes", "ASCII classes" and "Unicode classes" are all allowed):
        x                                A single character
        A-Z                              A character range (both endpoints included)
        \lowercase letter                Perl class
        [:ASCII class name:]             ASCII class
        \p{Unicode script class name}    Unicode class (script class)
        \pUnicode general class name     Unicode class (general class)
------------------------------
"Perl class" values:
        \d             Digit (same as [0-9])
        \D             Non-digit (same as [^0-9])
        \s             Whitespace (same as [\t\n\f\r ])
        \S             Non-whitespace (same as [^\t\n\f\r ])
        \w             Word character (same as [0-9A-Za-z_])
        \W             Non-word character (same as [^0-9A-Za-z_])
------------------------------
"ASCII class" values:
        [:alnum:]      Alphanumeric (same as [0-9A-Za-z])
        [:alpha:]      Letter (same as [A-Za-z])
        [:ascii:]      ASCII character set (same as [\x00-\x7F])
        [:blank:]      Blank (same as [\t ])
        [:cntrl:]      Control character (same as [\x00-\x1F\x7F])
        [:digit:]      Digit (same as [0-9])
        [:graph:]      Graphic character (same as [!-~])
        [:lower:]      Lowercase letter (same as [a-z])
        [:print:]      Printable character (same as [ -~], i.e. [ [:graph:]])
        [:punct:]      Punctuation (same as [!-/:-@[-`{-~])
        [:space:]      Whitespace character (same as [\t\n\v\f\r ])
        [:upper:]      Uppercase letter (same as [A-Z])
        [:word:]       Word character (same as [0-9A-Za-z_])
        [:xdigit:]     Hexadecimal digit (same as [0-9A-Fa-f])
------------------------------
"Unicode class" values, general classes:
        C              -Other-                  (other)
        Cc             Control character        (control)
        Cf             Format                   (format)
        Co             Private use              (private use)
        Cs             Surrogate                (surrogate)
        L              -Letter-                 (letter)
        Ll             Lowercase letter         (lowercase letter)
        Lm             Modifier letter          (modifier letter)
        Lo             Other letter             (other letter)
        Lt             Titlecase letter         (titlecase letter)
        Lu             Uppercase letter         (uppercase letter)
        M              -Mark-                   (mark)
        Mc             Spacing mark             (spacing mark)
        Me             Enclosing mark           (enclosing mark)
        Mn             Non-spacing mark         (non-spacing mark)
        N              -Number-                 (number)
        Nd             Decimal number           (decimal number)
        Nl             Letter number            (letter number)
        No             Other number             (other number)
        P              -Punctuation-            (punctuation)
        Pc             Connector punctuation    (connector punctuation)
        Pd             Dash punctuation         (dash punctuation)
        Pe             Close punctuation        (close punctuation)
        Pf             Final punctuation        (final punctuation)
        Pi             Initial punctuation      (initial punctuation)
        Po             Other punctuation        (other punctuation)
        Ps             Open punctuation         (open punctuation)
        S              -Symbol-                 (symbol)
        Sc             Currency symbol          (currency symbol)
        Sk             Modifier symbol          (modifier symbol)
        Sm             Math symbol              (math symbol)
        So             Other symbol             (other symbol)
        Z              -Separator-              (separator)
        Zl             Line separator           (line separator)
        Zp             Paragraph separator      (paragraph separator)
        Zs             Space separator          (space separator)
------------------------------
"Unicode class" values, script classes:
        Arabic                  Arabic
        Armenian                Armenian
        Balinese                Balinese
        Bengali                 Bengali
        Bopomofo                Zhuyin phonetic symbols (Chinese)
        Braille                 Braille
        Buginese                Buginese
        Buhid                   Buhid
        Canadian_Aboriginal     Canadian Aboriginal syllabics
        Carian                  Carian
        Cham                    Cham
        Cherokee                Cherokee
        Common                  Characters that are not specific to one script
        Coptic                  Coptic
        Cuneiform               Cuneiform
        Cypriot                 Cypriot syllabary
        Cyrillic                Cyrillic
        Deseret                 Deseret alphabet
        Devanagari              Devanagari
        Ethiopic                Ethiopic
        Georgian                Georgian
        Glagolitic              Glagolitic
        Gothic                  Gothic
        Greek                   Greek
        Gujarati                Gujarati
        Gurmukhi                Gurmukhi
        Han                     Chinese characters
        Hangul                  Korean
        Hanunoo                 Hanunoo
        Hebrew                  Hebrew
        Hiragana                Hiragana (Japanese)
        Inherited               Characters that inherit the script of the preceding character
        Kannada                 Kannada
        Katakana                Katakana (Japanese)
        Kayah_Li                Kayah Li
        Kharoshthi              Kharoshthi
        Khmer                   Khmer
        Lao                     Lao
        Latin                   Latin
        Lepcha                  Lepcha
        Limbu                   Limbu
        Linear_B                Linear B (ancient Greece)
        Lycian                  Lycian
        Lydian                  Lydian
        Malayalam               Malayalam
        Mongolian               Mongolian
        Myanmar                 Burmese
        New_Tai_Lue             New Tai Lue
        Nko                     N'Ko
        Ogham                   Ogham
        Ol_Chiki                Ol Chiki (Santali)
        Old_Italic              Old Italic
        Old_Persian             Old Persian
        Oriya                   Oriya
        Osmanya                 Osmanya
        Phags_Pa                Phags-pa
        Phoenician              Phoenician
        Rejang                  Rejang
        Runic                   Runic (ancient Nordic writing)
        Saurashtra              Saurashtra
        Shavian                 Shavian
        Sinhala                 Sinhala
        Sundanese               Sundanese
        Syloti_Nagri            Syloti Nagri
        Syriac                  Syriac
        Tagalog                 Tagalog
        Tagbanwa                Tagbanwa
        Tai_Le                  Tai Le
        Tamil                   Tamil
        Telugu                  Telugu
        Thaana                  Thaana
        Thai                    Thai
        Tibetan                 Tibetan
        Tifinagh                Tifinagh
        Ugaritic                Ugaritic
        Vai                     Vai
        Yi                      Yi
------------------------------------------------------------
Notes:

  For a regular expression such as [a-z], to match a literal - inside [], place the - at the beginning or end of the [], for example [-a-z] or [a-z-].

  Escape sequences can be used inside []: \f, \t, \n, \r, \v, \377, \xFF, \x{10FFFF}, \\, \^, \$, \., \*, \+, \?, \{, \}, \(, \), \[, \], \| (see above for their meanings).

  If the regular expression uses groups, the replacement text of a regular-expression replacement can use the group references ${1}, $name and ${name} to insert the content of the corresponding group. $0 stands for the whole match, $1 for the 1st group, $2 for the 2nd group, and so on.

  In the $name form, name is taken to be as long as possible when the reference is parsed: $1x is equivalent to ${1x}, not ${1}x, and $10 is equivalent to ${10}, not ${1}0.

  Because $ introduces a group reference, write $$ to insert a literal $ in the replacement text.

  The syntax described above is the "Perl syntax". Go also provides a "POSIX syntax" (regexp.CompilePOSIX), which restricts patterns to POSIX ERE syntax, so Perl extensions such as the "Perl classes" cannot be used, and which prefers the leftmost-longest match.
------------------------------------------------------------
// Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// The sample text contains Han characters and a full-width exclamation mark (U+FF01).
	text := `Hello 世界!123 Go.`

	// Find runs of lowercase letters
	reg := regexp.MustCompile(`[a-z]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["ello" "o"]

	// Find runs of characters that are not lowercase letters
	reg = regexp.MustCompile(`[^a-z]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["H" " 世界!123 G" "."]

	// Find runs of word characters
	reg = regexp.MustCompile(`[\w]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello" "123" "Go"]

	// Find runs of characters that are neither word characters nor whitespace
	reg = regexp.MustCompile(`[^\w\s]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["世界!" "."]

	// Find runs of uppercase letters
	reg = regexp.MustCompile(`[[:upper:]]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["H" "G"]

	// Find runs of non-ASCII characters
	reg = regexp.MustCompile(`[[:^ascii:]]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["世界!"]

	// Find runs of punctuation
	reg = regexp.MustCompile(`[\pP]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["!" "."]

	// Find runs of non-punctuation characters
	reg = regexp.MustCompile(`[\PP]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello 世界" "123 Go"]

	// Find runs of Han characters
	reg = regexp.MustCompile(`[\p{Han}]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["世界"]

	// Find runs of non-Han characters
	reg = regexp.MustCompile(`[\P{Han}]+`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello " "!123 Go."]

	// Find Hello or Go
	reg = regexp.MustCompile(`Hello|Go`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello" "Go"]

	// Find a string that starts with H at the beginning of the text and ends with whitespace
	reg = regexp.MustCompile(`^H.*\s`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello 世界!123 "]

	// Same as above, in non-greedy mode
	reg = regexp.MustCompile(`(?U)^H.*\s`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello "]

	// Find a string that starts with hello (ignoring case) and ends with Go
	reg = regexp.MustCompile(`(?i:^hello).*Go`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello 世界!123 Go"]

	// Find the literal text Go.
	reg = regexp.MustCompile(`\QGo.\E`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Go."]

	// Find a string that starts at the beginning of the text and ends with a space (non-greedy mode)
	reg = regexp.MustCompile(`(?U)^.* `)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello "]

	// Find a string that starts with a space, runs to the end of the text and contains no other space
	reg = regexp.MustCompile(` [^ ]*$`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// [" Go."]

	// Find the strings between "word boundaries"
	reg = regexp.MustCompile(`(?U)\b.+\b`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello" " 世界!" "123" " " "Go"]

	// Find 1 to 4 non-space characters followed by an o
	reg = regexp.MustCompile(`[^ ]{1,4}o`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello" "Go"]

	// Find Hello or Go
	reg = regexp.MustCompile(`(?:Hell|G)o`)
	fmt.Printf("%q\n", reg.FindAllString(text, -1))
	// ["Hello" "Go"]

	// Find Hello or Go and replace them with Hellooo and Gooo
	reg = regexp.MustCompile(`(?P<n>Hell|G)o`)
	fmt.Printf("%q\n", reg.ReplaceAllString(text, "${n}ooo"))
	// "Hellooo 世界!123 Gooo."

	// Swap Hello and Go
	reg = regexp.MustCompile(`(Hello)(.*)(Go)`)
	fmt.Printf("%q\n", reg.ReplaceAllString(text, "$3$2$1"))
	// "Go 世界!123 Hello."

	// Find special characters
	reg = regexp.MustCompile(`[\f\t\n\r\v\123\x7F\x{10FFFF}\\\^\$\.\*\+\?\{\}\(\)\[\]\|]`)
	fmt.Printf("%q\n", reg.ReplaceAllString("\f\t\n\r\v\123\x7F\U0010FFFF\\^$.*+?{}()[]|", "-"))
	// "----------------------"
}
------------------------------------------------------------
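The grouping syntax described earlier can also be used to read submatches by name. The following is a minimal sketch, assuming a hypothetical date pattern and helper name (`dateRe`, `namedGroups`) that are not part of the original article; it relies only on `FindStringSubmatch` and `SubexpNames` from the standard `regexp` package.

```go
package main

import (
	"fmt"
	"regexp"
)

// dateRe is a hypothetical pattern chosen for illustration.
// It has two named capturing groups and one non-capturing group.
var dateRe = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})-(?:\d{2})`)

// namedGroups returns the named submatches of the first match in s,
// or nil when s contains no match.
func namedGroups(s string) map[string]string {
	m := dateRe.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range dateRe.SubexpNames() {
		if name != "" { // index 0 (the whole match) and unnamed groups have an empty name
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	fmt.Println(namedGroups("built on 2018-01-06")) // map[month:01 year:2018]
	fmt.Println(dateRe.NumSubexp())                 // 2: the (?:...) group is not counted
}
```

The non-capturing group takes part in matching but gets neither a number nor a name, which is why only two entries come back.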
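The m and s flags from the flag list are easy to confuse, so here is a small sketch of their effect. The sample text and the helper names (`lineStarts`, `spansNewline`) are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"regexp"
)

// lineStarts returns every match of ^\w+ in text.
// With multiline = true the m flag is set, so ^ also matches after each newline.
func lineStarts(text string, multiline bool) []string {
	pattern := `^\w+`
	if multiline {
		pattern = `(?m)` + pattern
	}
	return regexp.MustCompile(pattern).FindAllString(text, -1)
}

// spansNewline reports whether a.b matches "a\nb".
// With dotAll = true the s flag is set, so . is allowed to match \n.
func spansNewline(dotAll bool) bool {
	pattern := `a.b`
	if dotAll {
		pattern = `(?s)` + pattern
	}
	return regexp.MustCompile(pattern).MatchString("a\nb")
}

func main() {
	text := "alpha one\nbeta two\ngamma three"
	fmt.Printf("%q\n", lineStarts(text, false)) // ["alpha"]
	fmt.Printf("%q\n", lineStarts(text, true))  // ["alpha" "beta" "gamma"]
	fmt.Println(spansNewline(false))            // false
	fmt.Println(spansNewline(true))             // true
}
```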
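The rules for group references in replacement text can be checked with a short sketch. The pattern, the input "Go 1.22" and the helper `replaceWith` are hypothetical; the behaviour shown follows the template rules documented for `Regexp.Expand`, including `$$` for a literal dollar sign.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceWith applies a replacement template to the fixed input "Go 1.22".
// The pattern has one named group (major) and one numbered group (2).
func replaceWith(template string) string {
	re := regexp.MustCompile(`(?P<major>\d+)\.(\d+)`)
	return re.ReplaceAllString("Go 1.22", template)
}

func main() {
	fmt.Printf("%q\n", replaceWith("$2.$1"))     // "Go 22.1"   numbered groups swapped
	fmt.Printf("%q\n", replaceWith("[$0]"))      // "Go [1.22]" $0 is the whole match
	fmt.Printf("%q\n", replaceWith("${major}x")) // "Go 1x"     braces end the name explicitly
	fmt.Printf("%q\n", replaceWith("$1x"))       // "Go "       parsed as ${1x}, which does not exist
	fmt.Printf("%q\n", replaceWith("$$$1"))      // "Go $1"     $$ is a literal $, then group 1
}
```

When the replacement text should never be interpreted as a template, `ReplaceAllLiteralString` avoids the question entirely.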
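To see how the POSIX syntax differs from the Perl syntax in practice, the sketch below compares `regexp.MustCompile` with `regexp.MustCompilePOSIX` on the same alternation and checks whether a Perl class compiles under each. The helper names (`firstMatch`, `acceptsPerlClass`) and the sample pattern are assumptions for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the first match of pattern in text, compiled either
// with Perl syntax (leftmost-first) or POSIX syntax (leftmost-longest).
func firstMatch(pattern, text string, posix bool) string {
	if posix {
		return regexp.MustCompilePOSIX(pattern).FindString(text)
	}
	return regexp.MustCompile(pattern).FindString(text)
}

// acceptsPerlClass reports whether the Perl class \d compiles
// under the Perl syntax and under the POSIX syntax.
func acceptsPerlClass() (perl, posix bool) {
	_, errPerl := regexp.Compile(`\d+`)
	_, errPOSIX := regexp.CompilePOSIX(`\d+`)
	return errPerl == nil, errPOSIX == nil
}

func main() {
	fmt.Println(firstMatch(`Go|Golang`, "Golang", false)) // Go      (first alternative wins)
	fmt.Println(firstMatch(`Go|Golang`, "Golang", true))  // Golang  (longest alternative wins)

	perl, posix := acceptsPerlClass()
	fmt.Println(perl, posix) // true false
}
```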
This article is an English translation of an article originally published in Chinese on aliyun.com and is provided for information purposes only.