Today I came across this regular expression: regex = '<div class="div_result[\s\S]+?>([\s\S]+)</div>'
I expected it to match the content of the web page and return the complete string, something like <div class="div_result...</div>, but the result contained only the content between <div> and </div>. I was very puzzled, and after searching online I found out that parentheses also capture what they match:
(pattern): matches pattern and captures the match.
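The capturing behaviour described above can be sketched as follows. The page fragment `html` is made up for illustration and is not the page from the post. When a pattern contains one capturing group, `re.findall` returns only the group's text; without a group it returns the whole match, and `re.search` exposes both.

```python
import re

# Hypothetical page fragment, used only for illustration.
html = '<div class="div_result" id="r1">hello <b>world</b></div>'

# One capturing group: findall returns only what the group matched.
grouped = re.findall(r'<div class="div_result[\s\S]+?>([\s\S]+)</div>', html)

# No capturing group: findall returns the whole match.
whole = re.findall(r'<div class="div_result[\s\S]+?>[\s\S]+</div>', html)

# A match object keeps both: group(0) is the whole match,
# group(1) is the text captured by the first pair of parentheses.
m = re.search(r'<div class="div_result[\s\S]+?>([\s\S]+)</div>', html)

print(grouped)     # only the content between the tags
print(whole)       # the complete <div ...>...</div> string
print(m.group(1))  # same text as grouped[0]
```

So to get the complete string, one option is to drop the parentheses; another is to keep them and read `group(0)` from the match object.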
The ? character, besides meaning zero or one occurrence, can also suppress greedy matching when it follows another quantifier. Matching is greedy by default (the more it matches, the better); if several greedy patterns compete at the same time, the final result is a compromise among them.
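A small sketch of the two roles of ? and of greedy patterns competing. The sample strings are invented for illustration.

```python
import re

# ? as "zero or one": the u is optional.
optional = re.findall(r'colou?r', "color colour")

text = "<a><b><c>"

# Greedy (default): .+ grabs as much as possible, so there is one long match.
greedy = re.findall(r'<.+>', text)

# Non-greedy: +? stops at the first position where the rest can match.
lazy = re.findall(r'<.+?>', text)

# Two greedy groups competing for the same digits: the first takes all
# it can and leaves the second only the single digit it requires.
both_greedy = re.match(r'(\d+)(\d+)', "12345").groups()

# Making the first group non-greedy reverses the split.
first_lazy = re.match(r'(\d+?)(\d+)', "12345").groups()

print(optional)
print(greedy)
print(lazy)
print(both_greedy)
print(first_lazy)
```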
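import re

def getRegResults(reg, data):
    # Compile the pattern and return every match found in data.
    # If the pattern contains a group, findall returns the group text.
    pattern = re.compile(reg)
    resultLists = re.findall(pattern, data)
    return resultLists

if __name__ == '__main__':
    s = "abcd_123e FG hk456"
    reg = r'abc.+([\s\S]+?)\d+'   # greedy .+ competing with a non-greedy group
    reg2 = r'([\s\S]+?)'          # non-greedy: one character per match
    reg3 = r'([\s]?)'             # zero or one whitespace character
    reg4 = r'([\s\S]+?)'          # same pattern as reg2
    reg5 = r'([\s]+)'             # greedy: runs of whitespace
    print(getRegResults(reg, s))
    print(getRegResults(reg2, s))
    print(getRegResults(reg5, s))
    print(getRegResults(reg3, s))
    print(getRegResults(reg4, s))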
Reference
http://www.cnblogs.com/yirlin/archive/2006/04/12/373222.html
http://www.cnblogs.com/graphics/archive/2010/06/02/1749707.html
Parentheses and greedy matching in regular expressions