Notes on Cui Qingcai's "Python3 Web Crawler Development in Practice"
Greedy and non-greedy matching
    import re

    content = 'Hello 12345678 World_This is a Regex Demo'
    result = re.match(r'^He.*(\d+).*Demo$', content)
    print(result.group(1))
The intent was to extract 12345678, but the output is:
Output: 8
Greedy matching: .* matches as many characters as possible.
Here .* is followed by (\d+), which requires at least one digit but does not say how many. So .* consumes as much as it can, including 1234567, and leaves \d+ only the single digit 8 that it needs to succeed.
As a result, the captured group is just 8.
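To make the split visible, the following minimal sketch wraps the greedy .* in a capture group of its own. That extra group is added here only for illustration and is not part of the original pattern; the content string is assumed to be the same one used in the example.

```python
import re

content = 'Hello 12345678 World_This is a Regex Demo'

# Illustrative change: (.*) is captured so we can inspect what it consumed.
result = re.match(r'^He(.*)(\d+).*Demo$', content)

consumed_by_dot_star = result.group(1)  # text taken by the greedy .*
left_for_digits = result.group(2)       # what remained for \d+

print(repr(consumed_by_dot_star))  # 'llo 1234567'
print(repr(left_for_digits))       # '8'
```

The greedy .* first takes everything it can and only gives back characters until the rest of the pattern can match, which is why \d+ ends up with a single digit.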
Non-greedy matching: .*? matches as few characters as possible and leaves the rest for the parts of the pattern that follow.
To switch to it, add a ? after .*:
    import re

    content = 'Hello 12345678 World_This is a Regex Demo'
    result = re.match(r'^He.*?(\d+).*Demo$', content)
    print(result.group(1))
Output: 12345678
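For a side-by-side comparison, here is a small sketch that runs both patterns against the same string. The dictionary names and the loop are only an illustrative arrangement, not something taken from the book.

```python
import re

content = 'Hello 12345678 World_This is a Regex Demo'

# The two patterns differ only in the quantifier before the digit group.
patterns = {
    'greedy': r'^He.*(\d+).*Demo$',
    'non-greedy': r'^He.*?(\d+).*Demo$',
}

captured = {}
for name, pattern in patterns.items():
    match = re.match(pattern, content)
    captured[name] = match.group(1)
    print(name, '->', captured[name])
# greedy -> 8
# non-greedy -> 12345678
```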
So when the part you want sits in the middle of a string, prefer non-greedy matching, that is, .*? instead of .*, so that the result does not lose characters. Note, however, that if the part you want is at the end of the string, .*? may match nothing at all, because it matches as few characters as possible.
    import re

    content = 'http://weibo.com/comment/kEraCN'
    result1 = re.match(r'^h.*?comment/(.*?)', content)
    result2 = re.match(r'^h.*?comment/(.*)', content)
    print('result1', result1.group(1))
    print('result2', result2.group(1))  # the pattern has only one group
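The sketch below shows what each variant captures at the end of the string. The third pattern, which anchors the non-greedy group with $, is an addition made here for illustration and does not appear in the original notes; it is one possible way to keep .*? while still reaching the end of the string.

```python
import re

content = 'http://weibo.com/comment/kEraCN'

# Non-greedy group at the end: nothing forces it to expand, so it stays empty.
lazy = re.match(r'^h.*?comment/(.*?)', content)

# Greedy group at the end: takes everything that remains.
greedy = re.match(r'^h.*?comment/(.*)', content)

# Illustrative variant: the $ anchor forces .*? to extend to the end.
lazy_anchored = re.match(r'^h.*?comment/(.*?)$', content)

print(repr(lazy.group(1)))           # ''
print(repr(greedy.group(1)))         # 'kEraCN'
print(repr(lazy_anchored.group(1)))  # 'kEraCN'
```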
Python Regular Expression Supplement