How do I split a string that contains multiple delimiters?
Actual case
We want to split a string into fields, where the string contains several different delimiters, for example:
s = 'asd;aad|dasd|dasd,sdasd|asd,,adas|sdasd;asdasd,d|asd'
Here <,>, <;>, <|>, and <\t> are all delimiters. How should we handle this?
Solution
Call the split() method repeatedly, handling one delimiter per pass
# Python 2 (map() is eager here; in Python 3 it is lazy and t would stay empty)
def mysplit(s, ds):
    res = [s]
    for d in ds:
        t = []
        # split every piece collected so far on the current delimiter
        map(lambda x: t.extend(x.split(d)), res)
        res = t
    # drop the empty strings left by adjacent delimiters
    return [x for x in res if x]

s = 'asd;aad|dasd|dasd,sdasd|asd,,adas|sdasd;asdasd,d|asd'
result = mysplit(s, ';,|\t')
print(result)
C:\users\administrator>c:\python\python27\python.exe E:\python-intensive-training\s2.py
['asd', 'aad', 'dasd', 'dasd', 'sdasd', 'asd', 'adas', 'sdasd', 'asdasd', 'd', 'asd']
Use the regular expression method re.split() to split the string in a single pass
>>> import re
>>> re.split('[,;\t|]+', 'asd;aad|dasd|dasd,sdasd|asd,,adas|sdasd;asdasd,d|asd')
['asd', 'aad', 'dasd', 'dasd', 'sdasd', 'asd', 'adas', 'sdasd', 'asdasd', 'd', 'asd']
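The regular expression approach can be wrapped in a small helper. The following is a minimal Python 3 sketch; the name split_fields and its default delimiter set are hypothetical, chosen only for illustration. Filtering out empty strings covers input that starts or ends with a delimiter, which re.split() alone would report as empty fields.

```python
import re

def split_fields(text, delimiters=",;|\t"):
    """Split text on any run of the given delimiter characters."""
    # re.escape keeps characters such as '|' literal inside the character class.
    pattern = "[" + re.escape(delimiters) + "]+"
    # Drop the empty strings produced by leading or trailing delimiters.
    return [field for field in re.split(pattern, text) if field]

s = "asd;aad|dasd|dasd,sdasd|asd,,adas|sdasd;asdasd,d|asd"
fields = split_fields(s)
print(fields)
```

Building the pattern from a plain string of delimiters keeps the call site readable and avoids hand-writing a character class each time.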
How do you tell whether string a starts or ends with string b?
Actual case
Suppose a directory contains the following files:
quicksort.c graph.py Heap.java install.sh stack.cpp ...
We need to grant executable permission to every file in the directory whose name ends with .sh or .py.
Solution
Use the string methods startswith() and endswith(). Both accept a tuple of candidates and return True if any of them matches.
>>> import os, stat
>>> os.listdir('./')
['Heap.java', 'quicksort.c', 'stack.cpp', 'install.sh', 'graph.py']
>>> [name for name in os.listdir('./') if name.endswith(('.sh', '.py'))]
['install.sh', 'graph.py']
>>> os.chmod('install.sh', os.stat('install.sh').st_mode | stat.S_IXUSR)
[root@iz28i253je0z t]# ls -l install.sh
-rwxr--r-- 1 root 0 Sep 18:13
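The session above changes only install.sh. As a rough Python 3 sketch of applying the same step to every matching file, the helper below (make_scripts_executable is a hypothetical name) loops over a directory with pathlib. It runs against a temporary directory, so no real files are touched.

```python
import stat
import tempfile
from pathlib import Path

def make_scripts_executable(directory, suffixes=(".sh", ".py")):
    """Add the owner-execute bit to files whose names end with one of suffixes."""
    changed = []
    for path in sorted(Path(directory).iterdir()):
        # str.endswith accepts a tuple and is True if any suffix matches.
        if path.is_file() and path.name.endswith(tuple(suffixes)):
            path.chmod(path.stat().st_mode | stat.S_IXUSR)
            changed.append(path.name)
    return changed

# Demonstrate on a throwaway directory with the file names from the example.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["quicksort.c", "graph.py", "Heap.java", "install.sh", "stack.cpp"]:
        (Path(tmp) / name).touch()
    changed = make_scripts_executable(tmp)
    print(changed)
```

Note that the tuple form is required: passing two separate strings to endswith() would treat the second one as a start index and raise a TypeError.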
How do I adjust the formatting of text in a string?
Actual case
A program's log file records dates in yyyy-mm-dd format:
2016-09-15 18:27:26 statu unpacked python3-pip:all
2016-09-15 19:27:26 statu half-configured python3-pip:all
2016-09-15 20:27:26 statu installd python3-pip:all
2016-09-15 21:27:26 configure asdasdasdas:all python3-pip:all
We need to convert the dates to the US format mm/dd/yyyy, for example 2016-09-15 --> 09/15/2016. How should we handle this?
Solution
Use the regular expression method re.sub() to do the string substitution.
Capture groups in the pattern capture each part of the date, and the replacement string rearranges those groups into the desired order.
>>> log = '2016-09-15 18:27:26 statu unpacked python3-pip:all'
>>> import re
# refer to the groups by position
>>> re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\2/\3/\1', log)
'09/15/2016 18:27:26 statu unpacked python3-pip:all'
# refer to the groups by name
>>> re.sub(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', r'\g<month>/\g<day>/\g<year>', log)
'09/15/2016 18:27:26 statu unpacked python3-pip:all'
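When a whole log has to be converted, the pattern can be compiled once and reused. This is a minimal sketch under that assumption; the names DATE and to_us_dates are hypothetical and not part of the original article.

```python
import re

# Named groups make the replacement string self-documenting.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

def to_us_dates(text):
    """Rewrite every yyyy-mm-dd date in text as mm/dd/yyyy."""
    return DATE.sub(r"\g<month>/\g<day>/\g<year>", text)

log = (
    "2016-09-15 18:27:26 statu unpacked python3-pip:all\n"
    "2016-09-15 19:27:26 statu half-configured python3-pip:all"
)
converted = to_us_dates(log)
print(converted)
```

Because sub() replaces every non-overlapping match, one call handles all lines of the log at once.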
How do I join multiple small strings into one large string?
Actual case
When designing a network program, we define a custom UDP-based protocol that passes a series of parameters to the server in a fixed order:
Hwdetect: "<0112>"
gxdepthbits: "<32>"
gxresolution: "<1024x768>"
Gxrefresh: "<60>"
Fullalpha: "<1>"
loddist: "<100.0>"
Distcull: "<500.0>"
In the program we collect the parameters in a list, in order:
["<0112>", "<32>", "<1024x768>", "<60>", "<1>", "<100.0>", "<500.0>"]
Finally, we need to join the parameters into a single packet to send:
"<0112><32><1024x768><60><1><100.0><500.0>"
Solution
Iterate over the list, using the '+' operator to append each string in turn
>>> result = ''
>>> for n in ["<0112>", "<32>", "<1024x768>", "<60>", "<1>", "<100.0>", "<500.0>"]:
...     result += n
...
>>> result
'<0112><32><1024x768><60><1><100.0><500.0>'
Use the str.join() method to join all the strings in a list more efficiently
>>> result = ''.join(["<0112>", "<32>", "<1024x768>", "<60>", "<1>", "<100.0>", "<500.0>"])
>>> result
'<0112><32><1024x768><60><1><100.0><500.0>'
If the list contains numbers, you can use a generator expression to convert them:
>>> hello = [222, 'sd', 232, '2e', 0.2]
>>> ''.join(str(x) for x in hello)
'222sd2322e0.2'
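Putting the two ideas together, a packet builder might look like the sketch below. The function name build_packet and the choice of ASCII encoding are assumptions made for illustration; the original does not specify how the packet is encoded before sending.

```python
def build_packet(params):
    """Join parameters into one string, converting non-string values with str()."""
    # join() builds the result in one pass instead of creating a new
    # intermediate string for every '+' operation.
    return "".join(str(p) for p in params)

params = ["<0112>", "<32>", "<1024x768>", "<60>", "<1>", "<100.0>", "<500.0>"]
packet = build_packet(params)

# Assumed encoding step: a UDP socket sends bytes, not str.
payload = packet.encode("ascii")
print(packet)
```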
How do I left-align, right-align, and center a string?
Actual case
A series of attribute values are stored in a dictionary:
{'ip': '127.0.0.1', 'blog': 'www.anshengme.com', 'title': 'Hello world', 'port': '80'}
In the program, we want to print its contents in the following aligned format. How do we handle it?
ip    : 127.0.0.1
blog  : www.anshengme.com
title : Hello world
port  : 80
Solution
Use the string methods str.ljust(), str.rjust(), and str.center() to left-align, right-align, and center
>>> info = {'ip': '127.0.0.1', 'blog': 'www.anshengme.com', 'title': 'Hello world', 'port': '80'}
# get the maximum length of the keys in the dictionary
>>> max(map(len, info.keys()))
5
>>> w = max(map(len, info.keys()))
>>> for k in info:
...     print(k.ljust(w), ':', info[k])
...
# results
port  : 80
blog  : www.anshengme.com
ip    : 127.0.0.1
title : Hello world
Use the built-in format() function, passing a format spec such as '<20', '>20', or '^20', to accomplish the same task
# Python 2 print statement; '^' centers the key within w characters
>>> for k in info:
...     print format(k, '^' + str(w)), ':', info[k]
...
port  : 80
blog  : www.anshengme.com
 ip   : 127.0.0.1
title : Hello world
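In Python 3 the same format specs can be used inside an f-string. The following is a minimal sketch; the helper name format_rows is hypothetical. The alignment character and width are supplied through nested replacement fields.

```python
info = {"ip": "127.0.0.1", "blog": "www.anshengme.com",
        "title": "Hello world", "port": "80"}

# Width of the longest key, so every key column has the same size.
width = max(len(key) for key in info)

def format_rows(mapping, width, align="<"):
    """Return one 'key : value' line per item, with keys padded to width."""
    # {align}{width} expands to a spec such as '<5' (left) or '>5' (right).
    return [f"{key:{align}{width}} : {value}" for key, value in mapping.items()]

rows = format_rows(info, width)
print("\n".join(rows))
```

Passing align=">" right-aligns the keys instead, and align="^" centers them.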
How do you remove unwanted characters from a string?
Actual case
Strip extra whitespace from user input such as an email address: '  anshengm.com@gmail.com  '
Remove the '\r' from text edited on Windows: 'hello world\r\n'
Remove the Unicode combining marks (tone marks) from text: 'ní hǎo, chī fàn'
Solution
Use the string methods strip(), lstrip(), and rstrip() to remove characters from the ends of a string
>>> email = '  anshengm.com@gmail.com  '
>>> email.strip()
'anshengm.com@gmail.com'
>>> email.lstrip()
'anshengm.com@gmail.com  '
>>> email.rstrip()
'  anshengm.com@gmail.com'
To delete a character at a fixed position, use slicing and concatenation
# assumed sample value; the original definition of s is missing
>>> s = 'abc:123'
>>> s[:3] + s[4:]
'abc123'
Use the string's replace() method or the regular expression method re.sub() to delete a character wherever it appears
>>> s = '\tabc\t123\txyz'
>>> s.replace('\t', '')
'abc123xyz'
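Use re.sub() to delete several different characters at once
>>> import re
# assumed sample value; the original definition of string is missing
>>> string = '\tabc\t123\txyz\ropq\r'
>>> re.sub('[\t\r]', '', string)
'abc123xyzopq'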
The string translate() method can map or delete many different characters at once
# Python 2: map characters using a translation table
>>> import string
>>> s = 'abc123xyz'
>>> s.translate(string.maketrans('abcxyz', 'xyzabc'))
'xyz123abc'
# Python 2: with None as the table, the second argument lists characters to delete
>>> s = '\rasd\t23\bads'
>>> s.translate(None, '\r\t\b')
'asd23ads'
# Python 2.7: for unicode strings, map the combining marks to None to delete them
>>> i = u'ní hǎo, chī fàn'
>>> i
u'ni\u0301 ha\u030co, chi\u0304 fa\u0300n'
>>> i.translate(dict.fromkeys([0x0301, 0x030c, 0x0304, 0x0300]))
u'ni hao, chi fan'
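The translate() calls above are Python 2 specific. As a rough Python 3 equivalent, str.maketrans() takes the characters to delete as its third argument, and the standard unicodedata module can find combining marks without listing their code points by hand. The helper name strip_combining_marks is hypothetical.

```python
import unicodedata

# The third argument of str.maketrans lists the characters to delete.
delete_controls = str.maketrans("", "", "\r\t\b")
cleaned = "\rasd\t23\bads".translate(delete_controls)

def strip_combining_marks(text):
    """Remove accents by decomposing characters and dropping combining marks."""
    # NFD splits an accented letter into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for ordinary (non-combining) characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

plain = strip_combining_marks("ní hǎo, chī fàn")
print(cleaned, plain)
```

Normalizing first means the function works whether the input uses precomposed letters or already-decomposed sequences.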
Summary
This article covered a set of Python string-handling techniques, each illustrated with a practical case, a solution, and an example. We hope it serves as a useful reference for anyone learning or using Python.