Summary of Python String Splitting Methods

Source: Internet
Author: User

The usual way to split a string in Python is to call the string's own str.split method, but it accepts only a single delimiter. To split on several different delimiters, use re.split, the split function of the regular expression module.

str.split

The prototype of the string split method is shown below, where sep is the delimiter and maxsplit is the maximum number of splits to perform:

    str.split(sep=None, maxsplit=-1)

When no delimiter is specified, the string is split on runs of whitespace (spaces, newlines, tabs, and so on):

    >>> s = 'a b\tc\nd'
    >>> s.split()
    ['a', 'b', 'c', 'd']

The result list contains no empty strings, even when the string ends with extra whitespace:

    >>> s = 'a b\tc\nd\n\n'
    >>> s.split()
    ['a', 'b', 'c', 'd']

Specify a delimiter:

    >>> s = 'www.google.com'
    >>> s.split('.')
    ['www', 'google', 'com']
    >>> s = 'aa||bb||cc||dd'
    >>> s.split('||')
    ['aa', 'bb', 'cc', 'dd']
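One difference from the default whitespace behavior is worth a quick check: with an explicit delimiter, adjacent or trailing delimiters do produce empty strings. The following is a minimal sketch; the sample strings and variable names are made up for illustration.

```python
csv_line = 'a,,b,'

# With an explicit delimiter, adjacent or trailing delimiters
# produce empty strings in the result.
explicit = csv_line.split(',')

# With no delimiter, runs of whitespace collapse and no empty
# strings appear.
spaced = 'a  b '.split()

print(explicit)  # ['a', '', 'b', '']
print(spaced)    # ['a', 'b']
```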

Specify the maximum number of splits:

    >>> s = 'www.google.com'
    >>> s.split('.', 1)
    ['www', 'google.com']
    >>> s = 'aa||bb||cc||dd'
    >>> s.split('||', 2)
    ['aa', 'bb', 'cc||dd']

Thus, when maxsplit is specified, the result list contains at most maxsplit + 1 items.
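The edge cases of maxsplit can be checked with a short sketch like the one below. It reuses the sample string from the example above; the variable names are only illustrative.

```python
s = 'aa||bb||cc||dd'

# maxsplit larger than the number of delimiters: the string is
# simply split everywhere, giving fewer than maxsplit + 1 items.
all_parts = s.split('||', 10)

# maxsplit=1: one split, two items; the rest stays intact.
head, rest = s.split('||', 1)

# maxsplit=0: no split at all, the whole string in a one-item list.
unsplit = s.split('||', 0)

print(len(all_parts))  # 4
print(head, rest)      # aa bb||cc||dd
print(unsplit)         # ['aa||bb||cc||dd']
```

Unpacking into head and rest, as in the second call, is a common way to peel off the first field while leaving the remainder untouched.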
However, the string split method accepts only one delimiter. Consider the following string:

    s = 'aaaa,bbbb:cccc;dddd'

To treat commas, colons, and semicolons all as delimiters, the string split method is not enough; use the split function of the re module instead.
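To see why, note that str.split treats its argument as one literal delimiter, not as a set of characters. A minimal sketch, with illustrative variable names:

```python
s = 'aaaa,bbbb:cccc;dddd'

# The argument is treated as ONE literal delimiter, not a set of
# characters; ',:;' never occurs in s, so nothing is split.
as_one = s.split(',:;')

# Splitting on a single character only handles that character.
by_comma = s.split(',')

print(as_one)    # ['aaaa,bbbb:cccc;dddd']
print(by_comma)  # ['aaaa', 'bbbb:cccc;dddd']
```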

re.split

The prototype of re.split is shown below, where pattern is the regular expression that matches the delimiters, string is the string to split, maxsplit is the maximum number of splits, and flags holds the usual regular expression flags:

    re.split(pattern, string, maxsplit=0, flags=0)

Example:

    >>> import re
    >>> s = 'aaaa,bbbb:cccc;dddd'
    >>> re.split(r'[,:;]', s)
    ['aaaa', 'bbbb', 'cccc', 'dddd']

If the pattern contains a capturing group, that is, parentheses, the result list also includes the captured text:

    >>> import re
    >>> s = 'aaaa,bbbb:cccc;dddd'
    >>> re.split(r'([,:;])', s)
    ['aaaa', ',', 'bbbb', ':', 'cccc', ';', 'dddd']
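Keeping the delimiters is useful when the original string has to be rebuilt later. The sketch below assumes a pattern with exactly one capturing group, so fields and delimiters alternate in the result; the variable names are illustrative.

```python
import re

s = 'aaaa,bbbb:cccc;dddd'
parts = re.split(r'([,:;])', s)

# Fields sit at even indexes, the captured delimiters at odd ones.
fields = parts[0::2]
delimiters = parts[1::2]

# Because nothing was discarded, joining restores the original.
rebuilt = ''.join(parts)

print(fields)        # ['aaaa', 'bbbb', 'cccc', 'dddd']
print(delimiters)    # [',', ':', ';']
print(rebuilt == s)  # True
```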

If you do not want the delimiters in the result but still need parentheses to group part of the pattern, use a non-capturing group, (?:...), as in the following example:

    >>> import re
    >>> s = 'aaaa,bbbb:cccc;dddd'
    >>> re.split(r'(?:[,:;])', s)
    ['aaaa', 'bbbb', 'cccc', 'dddd']
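In the example above the group is not strictly needed, since a character class alone would do. Grouping matters when the delimiter is an alternation of words, as in this sketch; the sample sentence and patterns are invented for illustration.

```python
import re

s = 'red and green or blue'

# Grouping is needed to scope the alternation; a non-capturing
# group keeps 'and'/'or' out of the result.
without_words = re.split(r'\s+(?:and|or)\s+', s)

# The same pattern with a capturing group returns them as well.
with_words = re.split(r'\s+(and|or)\s+', s)

print(without_words)  # ['red', 'green', 'blue']
print(with_words)     # ['red', 'and', 'green', 'or', 'blue']
```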

Specify the maximum number of splits:

    >>> import re
    >>> s = 'aaaa,bbbb:cccc;dddd'
    >>> re.split(r'[,:;]', s, maxsplit=1)
    ['aaaa', 'bbbb:cccc;dddd']
    >>> re.split(r'[,:;]', s, maxsplit=2)
    ['aaaa', 'bbbb', 'cccc;dddd']

Thus, when maxsplit is specified and the pattern has no capturing groups, the result list contains at most maxsplit + 1 items.
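The two functions interpret a maxsplit of 0 differently, which is easy to trip over. The following sketch compares them and also shows how a capturing group changes the item count; the variable names are illustrative.

```python
import re

s = 'aaaa,bbbb:cccc;dddd'

# str.split: maxsplit=0 means "do not split".
str_zero = s.split(',', 0)

# re.split: maxsplit=0 (the default) means "no limit".
re_zero = re.split(r'[,:;]', s, maxsplit=0)

# With a capturing group, the captured delimiter is an extra item,
# so one split yields three items rather than two.
re_group = re.split(r'([,:;])', s, maxsplit=1)

print(str_zero)  # ['aaaa,bbbb:cccc;dddd']
print(re_zero)   # ['aaaa', 'bbbb', 'cccc', 'dddd']
print(re_group)  # ['aaaa', ',', 'bbbb:cccc;dddd']
```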
Specify regular expression flags with the flags argument:

    >>> import re
    >>> re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)
    ['0', '3', '9']
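Putting the pieces together, one possible way to split on several literal delimiters without writing the pattern by hand is sketched below. The helper split_on_any is hypothetical, not part of the standard library, and it assumes a non-empty list of non-empty delimiter strings.

```python
import re

def split_on_any(text, delimiters, maxsplit=0):
    """Split text on any of the given literal delimiter strings.

    Hypothetical helper for illustration; assumes delimiters is a
    non-empty list of non-empty strings.
    """
    # Try longer delimiters first so '||' wins over '|'.
    ordered = sorted(delimiters, key=len, reverse=True)
    # re.escape makes characters such as '.' or '|' literal.
    pattern = '|'.join(re.escape(d) for d in ordered)
    return re.split(pattern, text, maxsplit=maxsplit)

print(split_on_any('aaaa,bbbb:cccc;dddd', [',', ':', ';']))
# ['aaaa', 'bbbb', 'cccc', 'dddd']
print(split_on_any('a.b||c|d', ['.', '|', '||']))
# ['a', 'b', 'c', 'd']
```

Escaping each delimiter is the design choice that matters here: characters such as . and | have special meaning in a regular expression and would otherwise change what the pattern matches.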

Original link: http://www.revotu.com/python-split-string-methods.html
