Raw strings in Python (Faq3-python)


This article originates from the Py2.7.9-docs faq.pdf, section "3.23 Why can't raw strings (r-strings) end with a backslash?"

More precisely, a raw string (a string literal with the r prefix) cannot end with an odd number of backslashes.

Raw strings were designed as input for processors that do their own backslash handling (primarily the regular expression engine). Such a processor would treat an unmatched trailing backslash as an error anyway, so a raw string is not allowed to end with an odd number of backslashes. In return, raw strings let you pass backslash sequences through to the processor unchanged, including \" for a quote character, \t for a tab, and so on. These rules work well when raw strings are used for such processors.

If a raw string is not handed to a processor such as the regular expression engine but simply represents a string, then \ in the literal is just \ in the string and no longer has any escaping meaning, which is why it is called "raw".
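The rule can be checked without typing a broken literal at the prompt. The following is a minimal sketch, written for Python 3 (where print is a function); the helper name is_valid_literal and the sample literals are made up for illustration. It passes source text to the built-in compile() and reports whether the parser accepts it.

```python
BS = chr(92)  # a single backslash, built this way to keep the example readable

def is_valid_literal(source):
    """Return True if `source` parses as a Python expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

odd = 'r"path' + BS + '"'        # the source text  r"path\"
even = 'r"path' + BS * 2 + '"'   # the source text  r"path\\"
quote = 'r"say ' + BS + '"hi"'   # the source text  r"say \"hi"

print(is_valid_literal(odd))     # False: the backslash keeps the quote from closing the literal
print(is_valid_literal(even))    # True
print(is_valid_literal(quote))   # True: the escaped quote stays inside the string
print(len(eval(even)))           # 6: both backslashes are kept in the value
```

Note that in the third literal the backslash is not removed: the resulting string contains both the backslash and the quote character.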

Next I'll explain the difference between an ordinary string and a raw string step by step.

1. As a standalone string representation:

In an ordinary string, \ starts an escape sequence; in a raw string, \n is simply the two characters \ and n.

>>> s = "I have\na dream"
>>> r = r'I have\na dream'
>>> print s
I have
a dream
>>> print r
I have\na dream
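The same comparison can be made by counting characters. This is a small sketch for Python 3, reusing the two strings from the session above.

```python
s = "I have\na dream"    # ordinary literal: \n becomes one newline character
r = r"I have\na dream"   # raw literal: \n stays as a backslash followed by n

print(len(s))                   # 14
print(len(r))                   # 15
print(s.splitlines())           # ['I have', 'a dream']
print(r == "I have\\na dream")  # True: same value as an ordinary literal with \\
```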

2. Raw strings used in regular expressions

Let's use a Windows-style path as an example of escaping with raw strings.

>>> path = r"\this\is\a\path\"     # a raw string may not end with an odd number of \, whether it is used as a regex or as an ordinary string
  File "<stdin>", line 1
    path = r"\this\is\a\path\"
                             ^
SyntaxError: EOL while scanning string literal
>>> path = r"\this\is\a\path\ "[:-1]
>>> path                           # the string to match
'\\this\\is\\a\\path\\'
>>> reg1 = r'\\this\\is\\a\\path\\'   # the regular expression, written as a raw string
>>> import re
>>> g = re.match(reg1, path)       # match using the raw-string pattern
>>> print g.group()                # a match is found: a real \ is matched by \\ in a raw string, because the regex engine reads the escaped \\ as \
\this\is\a\path\
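The slicing trick is one way to get a trailing backslash. The sketch below (Python 3) shows an alternative, which is to place an ordinary "\\" literal right after the raw literal, and checks that both give the same string. The names path_a and path_b are made up for this example. It also shows that re.escape() from the standard library produces the same pattern as reg1 for this path.

```python
import re

# Two ways to build a string that ends in one backslash.
path_a = r"\this\is\a\path\ "[:-1]   # the slicing trick from the session above
path_b = r"\this\is\a\path" "\\"     # raw literal followed by an ordinary "\\" literal

reg1 = r"\\this\\is\\a\\path\\"      # each \\ in the pattern matches one literal backslash

print(path_a == path_b)                          # True
print(re.match(reg1, path_a).group() == path_a)  # True
print(re.escape(path_a) == reg1)                 # True
```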

3. Ordinary strings used in regular expressions

Let's reuse the path variable above and try to match it with a pattern written as an ordinary string.

>>> reg2 = '\\this\\is\\a\\path\\'
>>> g = re.match(reg2, path)       # surprisingly, this raises an exception; according to the message, the pattern ends in a bogus escape
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Python27\lib\re.py", line 137, in match
    return _compile(pattern, flags).match(string)
  File "D:\Python27\lib\re.py", line 244, in _compile
    raise error, v # invalid expression
sre_constants.error: bogus escape (end of line)
>>> reg2 = '\\this\\is\\a\\path'   # to look into the cause, first drop the trailing \\ and match again
>>> g = re.match(reg2, path)       # going by the raw-string example, this should match, but it does not
>>> print g.group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
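Both failures can be reproduced in a few lines. The sketch below is written for Python 3 and uses a shorter, made-up path (\temp) instead of the one in the session, so that the only regex escape involved is \t; the variable names are also assumptions for illustration.

```python
import re

target = r"\temp"        # the text to match: one real backslash, then "temp"

plain = "\\temp"         # the regex engine receives \temp and reads \t as a tab
doubled = "\\\\temp"     # the regex engine receives \\temp: a literal backslash
raw = r"\\temp"          # same pattern as `doubled`, written as a raw string

print(re.match(plain, target))                      # None
print(re.match(plain, "\temp") is not None)         # True: it matches a tab followed by "emp"
print(re.match(doubled, target).group() == target)  # True
print(raw == doubled)                               # True

try:
    re.compile("temp\\")  # this pattern ends in a lone backslash
except re.error as exc:
    print("rejected:", exc)
```

The first result corresponds to the silent non-match in the session, and the last one to the exception raised for a pattern that ends in a single backslash.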

Why is there a difference, and why is it recommended everywhere to write regular expressions as r'...' strings?

Let's analyze the difference between a raw string and an ordinary string: in an ordinary string, producing a single \ requires writing the escaped form \\; in a raw string, you simply write \.

This gets a bit confusing, and I think the main culprit is the difference between str and repr:

>>> print path    # calls str: the display people are used to
\this\is\a\path\
>>> path          # calls repr: the real display (one more layer of escaping than str)
'\\this\\is\\a\\path\\'
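The extra layer of escaping can be counted directly. This is a minimal Python 3 sketch; the names shown_by_print and shown_by_prompt are made up, and path is rebuilt here so the block runs on its own.

```python
path = r"\this\is\a\path" "\\"   # same value as path in the session above

shown_by_print = str(path)       # what print displays
shown_by_prompt = repr(path)     # what the interactive prompt displays

print(shown_by_print)                  # prints \this\is\a\path\ (16 characters)
print(shown_by_prompt)                 # prints '\\this\\is\\a\\path\\' including the quotes
print(path.count("\\"))                # 5: backslashes actually stored in the string
print(shown_by_prompt.count("\\"))     # 10: repr doubles each one for display
print(eval(shown_by_prompt) == path)   # True: the repr text is a literal for the same value
```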

Let's take the real display (the repr) as the reference for everything, i.e.

The real display of path is: '\\this\\is\\a\\path\\'

The real display of reg2, the regular expression written as an ordinary string, is: '\\this\\is\\a\\path'

The real display of reg1, the regular expression written as a raw string, is: '\\\\this\\\\is\\\\a\\\\path\\\\'

Matching is easier to understand from the real display: set aside the distinction between raw and ordinary strings and just look at each value as the string handed to the regular expression engine. In this display every \\ stands for one real backslash. reg2 holds only one real backslash per separator, which the engine treats as the start of an escape such as \t rather than as a literal backslash, so it cannot match the backslashes in path. reg1 holds two real backslashes per separator (displayed as \\\\), which the engine reads as one literal backslash, so it matches.

A rule of thumb is easier to remember: to match one \ character in path, an ordinary string literal needs four backslashes (\\\\), while a raw string needs only two (\\). This is also why raw strings are recommended for regular expressions.
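The rule of thumb can be verified in a few lines. The sketch below assumes Python 3.4 or later, since it uses re.fullmatch; the variable names are made up for illustration.

```python
import re

ordinary = "\\\\"   # four backslashes typed, two stored
raw = r"\\"         # two backslashes typed, two stored

print(ordinary == raw)   # True: both literals produce the same two-character pattern
print(len(raw))          # 2

one_backslash = "\\"     # the text to match: a single backslash character
print(re.fullmatch(raw, one_backslash) is not None)   # True

path = r"\this\is\a\path" "\\"
print(len(re.findall(raw, path)))   # 5: one hit per backslash in path
```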




This article is from the "Nameless" blog; please keep this source attribution: http://xdzw608.blog.51cto.com/4812210/1607504
