Problem to solve: how do you match a string when the case of its letters is not known in advance?
Before running into this problem I used a plain string comparison. For example, to match 'abc':
if s == 'abc':  # s is the string to match
    print('match succeeded\n')
The problem now is that s may be abc, Abc, ABC and so on, so the match has to be case-insensitive. Listing every case variant of the pattern is tedious even for a three-letter string. After some checking, I found that the regular expression module re has a flags=re.I parameter that makes matching case-insensitive, as shown in the following example:
import re

s = 'ABC'
p = 'abc'
p = re.compile(p, re.I)  # re.I makes the pattern case-insensitive
print(re.search(p, s).group())
The match succeeds and the output is: ABC
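As a quick check, the same compiled pattern can be tried against several case variants. This is only a small sketch; the list of candidate strings is made up for illustration.

```python
import re

# Compile once with re.I (short for re.IGNORECASE) and reuse the pattern.
pattern = re.compile('abc', re.I)

# Hypothetical inputs: three case variants plus one string that should not match.
candidates = ['abc', 'Abc', 'ABC', 'abd']

matches = []
for s in candidates:
    m = pattern.search(s)
    # search() returns None when nothing matches, so test before calling group().
    matches.append(m.group() if m else None)

print(matches)  # ['abc', 'Abc', 'ABC', None]
```

Note that group() returns the text as it appears in the searched string, not as it is written in the pattern.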
Using the compile function in the re module:
Precompiling is not required for regular expression matching in Python, but it is a good idea when a pattern is used many times. The re module does cache recently compiled patterns, but a precompiled pattern also saves the time spent on the cache lookup. The flag values available in the re module are listed in the following table:
(See the re module documentation for details.)
| Flag | Meaning |
|---|---|
| DOTALL, S | Make . match any character, including newlines |
| IGNORECASE, I | Make the match case-insensitive |
| LOCALE, L | Do locale-aware matching |
| MULTILINE, M | Multi-line matching, affecting ^ and $ |
| VERBOSE, X | Enable verbose REs, which can be laid out to be easier to read and understand |
The flag values in this table can also be passed directly to the search function, for example re.search(pattern, string, flags).
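A minimal sketch of passing flags straight to re.search, including combining two flags from the table with the bitwise OR operator. The sample text is invented for illustration.

```python
import re

# Hypothetical two-line input.
text = 'first line\nABC on the second line'

# Pass the flag directly to re.search instead of compiling first.
m1 = re.search('abc', text, flags=re.I)

# Flags can be combined with |. re.M lets ^ match at the start of
# every line, while re.I ignores case.
m2 = re.search('^abc', text, flags=re.I | re.M)

# Without re.M, ^ only matches at the very start of the string.
m3 = re.search('^abc', text, flags=re.I)

print(m1.group(), m2.group(), m3)  # ABC ABC None
```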
Next question: how to match an attribute name in an XML element when the case of the name is not known.
Now that case-insensitive matching works, how do you get an attribute value from an XML element node?
The idea is this: to get an attribute value you need the attribute name, but you do not know which letters of the name are uppercase and which are lowercase.
The method is to find the elements by their tag, take all the attribute names of each element, and match them one by one, stopping at the first one that matches. Once an attribute name matches, re.search(p, s, f).group() is the name exactly as it is written in the file.
The XML file that needs to be parsed (Abc.xml) is as follows:
<Root>
    <element name='who'/>
    <element Name='am'/>
    <element NAME='I'/>
</Root>
The parsing code is as follows:
import re
import xml.etree.ElementTree as etree

file = 'Abc.xml'
p = 'name'
pattern = re.compile(p, re.I)
tree = etree.parse(file)
root = tree.getroot()
# Find every <element> node anywhere in the tree
result = tree.findall('.//element')
for i in result:
    for j in i.attrib.keys():
        try:
            # search() returns None when j does not match, so .group() raises AttributeError
            r = re.search(pattern, j).group()
            # Output the matched attribute name and the corresponding attribute value
            print('attrib is %s, and the value is %s\n' % (r, i.attrib[r]))
            break
        except AttributeError:
            # This attribute name did not match; try the next one
            pass
The output is as follows:
attrib is name, and the value is who
attrib is Name, and the value is am
attrib is NAME, and the value is I
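One caveat: re.search also succeeds when the pattern matches only part of an attribute name, so an attribute such as filename would match name. The sketch below is one possible way to guard against that, not the approach used above. It assumes Python 3.4 or later for fullmatch(). The helper find_attribute and the extra filename element are hypothetical, and the XML is inlined so the example runs without a file.

```python
import re
import xml.etree.ElementTree as ET

# Inline copy of the sample document, plus one hypothetical element
# whose attribute name only contains "name" as a substring.
XML_TEXT = """<Root>
    <element name='who'/>
    <element Name='am'/>
    <element NAME='I'/>
    <element filename='x'/>
</Root>"""


def find_attribute(element, name):
    """Hypothetical helper: return (actual_name, value) for the first
    attribute whose whole name equals `name` ignoring case, else None."""
    pattern = re.compile(re.escape(name), re.I)
    for key, value in element.attrib.items():
        # fullmatch() requires the entire attribute name to match,
        # so 'filename' is not mistaken for 'name'.
        if pattern.fullmatch(key):
            return key, value
    return None


root = ET.fromstring(XML_TEXT)
found = [find_attribute(el, 'name') for el in root.findall('.//element')]
print(found)
# [('name', 'who'), ('Name', 'am'), ('NAME', 'I'), None]
```

Returning the attribute name together with its value avoids a second lookup in attrib and keeps the original spelling available to the caller.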
One last note: calling str.strip() with no argument removes whitespace characters (spaces, tabs, newlines) from both ends of a string, which is very useful.
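A short illustration of str.strip(), using made-up input strings:

```python
# Hypothetical raw value with surrounding whitespace, e.g. read from a file.
raw = '  \t name \n'

# With no argument, strip() removes leading and trailing whitespace only.
cleaned = raw.strip()

# With an argument, strip() removes any of the given characters from both ends.
dashed = '--name--'.strip('-')

print(repr(cleaned), repr(dashed))  # 'name' 'name'
```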
Python: applying case-insensitive matching when reading XML