How to use lookahead and negative lookahead assertions in Python regular expressions
Preface
In many cases, parts of the text being matched must either appear together or not appear at all. Paired delimiters such as quotation marks or angle brackets <> are an example: half of a pair is useless on its own. A regular expression therefore needs a consistency check: either both angle brackets appear, or neither does. How can we achieve this? We need a new piece of regular expression syntax, the lookahead assertion (?=pattern). It looks ahead from the current position and checks whether pattern matches there. If it does not, the overall match fails. The assertion does not consume any input characters; it only checks, and matching then continues from the same position.
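Before the full example, here is a minimal sketch of the "checks but does not consume" behaviour described above. The sample string and the variable names consumed and peeked are only illustrative assumptions, not part of the original article.

```python
import re

text = 'first.last@example.com'

# Without lookahead: the '@' becomes part of the match.
consumed = re.search(r'[\w.]+@', text)

# With lookahead: '@' must follow, but it is only checked, not consumed.
peeked = re.search(r'[\w.]+(?=@)', text)

print(consumed.group())  # first.last@
print(peeked.group())    # first.last
print(peeked.end())      # 10, the index of '@' itself

# If the lookahead cannot be satisfied, the whole match fails.
print(re.search(r'[\w.]+(?=@)', 'no at sign here'))  # None
```

Because the match ends just before the '@', any pattern written after the lookahead would start matching at the '@' again.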
Example:
    # Python 3.6
    # Cai junsheng
    # http://blog.csdn.net/caimouse/article/details/51749579
    #
    import re

    # A raw string keeps backslashes such as \w and \s intact.
    address = re.compile(
        r'''
        # A name is made up of letters, and may include "."
        # for title abbreviations and middle initials.
        ((?P<name>
           ([\w.,]+\s+)*[\w.,]+
         )
         \s+
        ) # name is no longer optional

        # LOOKAHEAD
        # Email addresses are wrapped in angle brackets, but only
        # if both are present or neither is.
        (?= (<.*>$)       # remainder wrapped in angle brackets
            |
            ([^<].*[^>]$) # remainder *not* wrapped in angle brackets
          )

        <? # optional opening angle bracket

        # The address itself: username@domain.tld
        (?P<email>
          [\w\d.+-]+       # username
          @
          ([\w\d.]+\.)+    # domain name prefix
          (com|org|edu)    # limit the allowed top-level domains
        )

        >? # optional closing angle bracket
        ''',
        re.VERBOSE)

    candidates = [
        'First Last <first.last@example.com>',
        'No Brackets first.last@example.com',
        'Open Bracket <first.last@example.com',
        'Close Bracket first.last@example.com>',
    ]

    for candidate in candidates:
        print('Candidate:', candidate)
        match = address.search(candidate)
        if match:
            print('  Name :', match.groupdict()['name'])
            print('  Email:', match.groupdict()['email'])
        else:
            print('  No match')
The output is as follows:
    Candidate: First Last <first.last@example.com>
      Name : First Last
      Email: first.last@example.com
    Candidate: No Brackets first.last@example.com
      Name : No Brackets
      Email: first.last@example.com
    Candidate: Open Bracket <first.last@example.com
      No match
    Candidate: Close Bracket first.last@example.com>
      No match
Using negative lookahead in Python regular expressions
The lookahead syntax (?=pattern) contains an equals sign, which means the text ahead must match the pattern. Lookahead also has a negated form. Suppose you need to recognize the e-mail address noreply@example.com. Addresses like this usually do not accept replies, so we want to detect them and discard them. What should we do? In this case, use negative lookahead. Its syntax is (?!pattern), where the exclamation mark means "not". Given a string such as noreply@example.com, the assertion checks whether the text at the current position starts with noreply@. If it does, the match is abandoned at that position and the address is not recognized.
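As a rough sketch of the idea above, the snippet below filters a list of addresses with a negative lookahead. The name not_noreply, the simplified address pattern, and the sample addresses are assumptions made for illustration; they are not a complete e-mail validator.

```python
import re

# Hypothetical, simplified pattern: reject text that starts with "noreply@".
not_noreply = re.compile(r'^(?!noreply@)[\w.+-]+@[\w.]+$')

addresses = [
    'first.last@example.com',
    'noreply@example.com',
    'noreply.archive@example.com',
]

# Keep only the addresses that the pattern accepts.
kept = [a for a in addresses if not_noreply.search(a)]
print(kept)
# ['first.last@example.com', 'noreply.archive@example.com']
```

Note that noreply.archive@example.com is kept: the assertion only rejects text that begins with exactly noreply@, so a username that merely starts with "noreply" still matches.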
Example:
    # Python 3.6
    # Cai junsheng
    # http://blog.csdn.net/caimouse/article/details/51749579
    #
    import re

    # A raw string keeps backslashes such as \w intact.
    address = re.compile(
        r'''
        ^

        # An address: username@domain.tld

        # Ignore noreply addresses
        (?!noreply@.*$)

        [\w\d.+-]+       # username
        @
        ([\w\d.]+\.)+    # domain name prefix
        (com|org|edu)    # limit the allowed top-level domains

        $
        ''',
        re.VERBOSE)

    candidates = [
        'first.last@example.com',
        'noreply@example.com',
    ]

    for candidate in candidates:
        print('Candidate:', candidate)
        match = address.search(candidate)
        if match:
            print('  Match:', candidate[match.start():match.end()])
        else:
            print('  No match')
The output is as follows:
    Candidate: first.last@example.com
      Match: first.last@example.com
    Candidate: noreply@example.com
      No match
Summary
That is all for this article on lookahead (?=pattern) and negative lookahead (?!pattern) in Python regular expressions. I hope it is a useful reference for your study or work. If you have any questions, please leave a comment. Thank you for your support.