Examples
1. Determine whether a string consists entirely of lowercase letters.
2. Find the full spelling of an acronym in a sentence.
3. Remove the commas from a number (for example, 123,345,000).

Regular expression syntax

Character   Meaning                                           Example
+           the previous element appears one or more times    ab+: ab, abbbb, etc.
*           the previous element appears zero or more times   ab*: a, ab, abb, etc.
?           the previous element appears zero or one time     ab?: a, ab
^           marks the start of the string                     ^a: abc, aaaaaa, etc.
$           marks the end of the string                       c$: abc, cccc, etc.
\d          a digit                                           3, 4, 9
\D          a non-digit                                       A, a, -, etc.
[a-z]       any letter between a and z                        a, p, m

I. Determine whether a string is all lowercase

Code

    import re

    s1 = 'adkdk'
    s2 = 'abc123efg'

    # search() scans the whole string, so the pattern is anchored with ^ and $
    an = re.search('^[a-z]+$', s1)
    if an:
        print('s1:', an.group(), 'all lowercase')
    else:
        print(s1, "Not all lowercase!")

    # match() only matches at the start of the string, so ^ is not needed
    an = re.match('[a-z]+$', s2)
    if an:
        print('s2:', an.group(), 'all lowercase')
    else:
        print(s2, "Not all lowercase!")

Explanation
1. Regular expressions are not part of the core Python language. To use them you must import the re module.
2. The call format is re.search(pattern, string) or re.match(pattern, string). The difference between the two is that match() only matches at the beginning of the string, as if the pattern started with ^. Therefore re.search('^[a-z]+$', s1) is equivalent to re.match('[a-z]+$', s1).
3. If the match fails, an = re.search('^[a-z]+$', s1) returns None.

Using group() to retrieve the matched groups, for example:

    import re

    a = "123abc456"
    print(re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(0))  # 123abc456, the whole match
    print(re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(1))  # 123
    print(re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(2))  # abc
    print(re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(3))  # 456

1) group() and group(0) both return the whole text matched by the regular expression. group(1) returns the part matched by the first pair of parentheses.
group(2) returns the part matched by the second pair of parentheses, and group(3) the part matched by the third pair.
2) If nothing matches, re.search() returns None.
3) If the regular expression contains no parentheses, calling group(1) is an error (it raises IndexError).

II. Expanding acronyms

Specific examples:
FEMA  Federal Emergency Management Agency
IRA   Irish Republican Army
DUP   Democratic Unionist Party
FDA   Food and Drug Administration
OLC   Office of Legal Counsel

Analysis
The acronym FEMA decomposes into F*** E*** M*** A***.
Pattern: uppercase letter + lowercase letters (one or more) + space.

Reference code

    import re

    def expand_abbr(sen, abbr):
        lenabbr = len(abbr)
        ma = ''
        # Each acronym letter starts a word: letter + lowercase letters + space
        for i in range(0, lenabbr):
            ma += abbr[i] + "[a-z]+" + ' '
        print('ma:', ma)
        ma = ma.strip(' ')  # drop the trailing space after the last word
        p = re.search(ma, sen)
        if p:
            return p.group()
        else:
            return ''

    print(expand_abbr("Welcome to Agriculture Bank China", 'ABC'))

The code above handles the first three examples correctly but fails on the last two, because lowercase words (such as "and" and "of") are mixed in among the capitalized words.

Revised pattern: uppercase letter + lowercase letters (one or more) + space + [lowercase letters + space] (zero or one time).

Reference code

    import re

    def expand_abbr(sen, abbr):
        lenabbr = len(abbr)
        ma = ''
        # After every word except the last, allow one optional lowercase word
        for i in range(0, lenabbr - 1):
            ma += abbr[i] + "[a-z]+" + ' ' + '([a-z]+ )?'
        ma += abbr[lenabbr - 1] + "[a-z]+"
        print('ma:', ma)
        ma = ma.strip(' ')
        p = re.search(ma, sen)
        if p:
            return p.group()
        else:
            return ''

    print(expand_abbr("Welcome to Agriculture Bank of China", 'ABC'))

Tip
The optional part in the middle is a run of lowercase letters plus a space, wrapped together in parentheses. The two must appear together or not at all, so the ? is applied to the parenthesized group as a whole.

III. Removing the commas from a number

Problem: when processing natural language, a number such as 123,000,000 causes trouble if the text is split on punctuation marks.
A perfectly good number gets broken apart at its commas, so it is worth cleaning the numbers first (removing the commas).

Analysis
Numbers are usually written as groups of three digits, each followed by a comma, so the shape is ***,***,***.
Regular expression: \d+,\d+?

Reference code 3-1

    import re

    sen = "abc,123,456,789,mnp"
    p = re.compile(r"\d+,\d+?")
    # finditer() walks the original string; each match is then
    # replaced in sen by the same text without its comma
    for com in p.finditer(sen):
        mm = com.group()
        print("hi:", mm)
        print("sen_before:", sen)
        sen = sen.replace(mm, mm.replace(",", ""))

Tip
This uses the finditer function.
finditer(string[, pos[, endpos]]) | re.finditer(pattern, string[, flags]): scans the string and returns an iterator that yields each match (a Match object) in order.

Reference code 3-2

    import re

    sen = "abc,123,456,789,mnp"
    while 1:
        mm = re.search(r"\d,\d", sen)
        if mm:
            mm = mm.group()
            sen = sen.replace(mm, mm.replace(",", ""))
            print(sen)
        else:
            break

Extension
The programs above target one specific case: digits in groups of three. More generally, when numbers are mixed with letters, remove every comma that sits between two digits, that is, convert "abc,123,4,789,mnp" into "abc,1234789,mnp".

Approach
More specifically, search for the pattern "digit, comma, digit" and, once it is found, replace it with the same text minus the comma.

Reference code 3-3

    import re

    sen = "abc,123,4,789,mnp"
    while 1:
        mm = re.search(r"\d,\d", sen)
        if mm:
            mm = mm.group()
            sen = sen.replace(mm, mm.replace(",", ""))
            print(sen)
        else:
            break
    print(sen)
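
As a variation on code 3-3, the search-and-replace loop can probably be collapsed into a single call to re.sub. This is only a minimal sketch, not the method used above: it assumes Python 3, and the helper name strip_digit_commas is made up for illustration. The lookbehind (?<=\d) and lookahead (?=\d) require a digit on each side of the comma without consuming it, so adjacent cases such as "3,4,7" are handled in one pass.

```python
import re

def strip_digit_commas(text):
    """Remove every comma that sits directly between two digits."""
    # (?<=\d) and (?=\d) are zero-width checks: only the comma itself
    # is matched, and it is replaced with the empty string.
    return re.sub(r"(?<=\d),(?=\d)", "", text)

print(strip_digit_commas("abc,123,4,789,mnp"))
print(strip_digit_commas("abc,123,456,789,mnp"))
```

Commas next to letters, such as the one between "abc" and "123", are left alone because at least one neighbour is not a digit.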