This article shares string processing techniques in Python: splitting strings that contain multiple separators, determining whether string a starts or ends with string b, adjusting the text format in a string, concatenating multiple small strings into one large string, aligning strings, and removing unwanted characters. Interested readers can learn from the sections below.
1. How to split strings containing multiple separators?
Actual case
We want to split a string into separate fields based on its separators. The string contains several different separators, for example:
s = 'asd;aad|dasd|dasd,sdasd|asd,,Adas|sdasd;Asdasd,d|asd'
Here ',', ';', '|', and '\t' are all delimiters. How can this problem be solved?
Solution
Call the str.split() method repeatedly, handling one separator each time.
# Written for Python 2; the explicit loop also runs on Python 3
def mySplit(s, ds):
    res = [s]
    for d in ds:
        # split every current piece on separator d
        t = []
        for x in res:
            t.extend(x.split(d))
        res = t
    # drop the empty strings left by adjacent separators
    return [x for x in res if x]

s = 'asd;aad|dasd|dasd,sdasd|asd,,Adas|sdasd;Asdasd,d|asd'
result = mySplit(s, ';,|\t')
print(result)
C:\Users\Administrator>C:\Python\Python27\python.exe E:\python-intensive-training\s2.py
['asd', 'aad', 'dasd', 'dasd', 'sdasd', 'asd', 'Adas', 'sdasd', 'Asdasd', 'd', 'asd']
Use the re.split() method to split the string on all separators at once.
>>> import re
>>> re.split('[,;\t|]+','asd;aad|dasd|dasd,sdasd|asd,,Adas|sdasd;Asdasd,d|asd')
['asd', 'aad', 'dasd', 'dasd', 'sdasd', 'asd', 'Adas', 'sdasd', 'Asdasd', 'd', 'asd']
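The re.split() approach can be wrapped in a small helper. The following is a minimal sketch, not part of the original article: the function name split_on_any is hypothetical, and it assumes every separator is a single character. It uses re.escape() so that separators such as '|' or ']' are not read as regular expression syntax.

```python
import re

def split_on_any(text, separators):
    """Split text on any single character in separators, dropping empty pieces.

    split_on_any is a hypothetical helper name used only for illustration.
    """
    # re.escape keeps characters such as '|' or ']' from being read as regex syntax.
    pattern = '[' + re.escape(separators) + ']+'
    # A leading or trailing separator yields an empty string, so filter those out.
    return [piece for piece in re.split(pattern, text) if piece]

s = 'asd;aad|dasd|dasd,sdasd|asd,,Adas|sdasd;Asdasd,d|asd'
print(split_on_any(s, ';,|\t'))
```

The trailing '+' in the pattern treats a run of adjacent separators, such as ',,', as one split point.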
2. How can I determine whether string a starts or ends with string b?
Actual case
If a directory contains the following files:
quicksort.c graph.py heap.java install.sh stack.cpp ......
Now you need to grant executable permission to the files in the folder whose names end in .sh and .py.
Solution
Use the string methods startswith() and endswith(). Both accept a tuple of candidates and return True if any of them matches.
>>> import os, stat
>>> os.listdir('./')
['heap.java', 'quicksort.c', 'stack.cpp', 'install.sh', 'graph.py']
>>> [name for name in os.listdir('./') if name.endswith(('.sh','.py'))]
['install.sh', 'graph.py']
>>> os.chmod('install.sh', os.stat('install.sh').st_mode | stat.S_IXUSR)
[root@iZ28i253je0Z t]# ls -l install.sh
-rwxr--r-- 1 root root 0 Sep 15 18:13 install.sh
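The session above changes only install.sh. One way to apply the same step to every matching file is sketched below. The function name make_scripts_executable is hypothetical, and the demonstration runs in a temporary directory (an assumption made here so that no real files are touched).

```python
import os
import stat
import tempfile

def make_scripts_executable(directory, suffixes=('.sh', '.py')):
    """Add the owner-execute bit to files in directory whose names end with a suffix.

    make_scripts_executable is a hypothetical helper name used for illustration.
    """
    changed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # endswith() needs a tuple (not a list) when testing several suffixes.
        if os.path.isfile(path) and name.endswith(tuple(suffixes)):
            os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
            changed.append(name)
    return changed

# Demonstrate on a throwaway directory containing the file names from the example.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('quicksort.c', 'graph.py', 'heap.java', 'install.sh', 'stack.cpp'):
        open(os.path.join(tmp, name), 'w').close()
    print(make_scripts_executable(tmp))
```

Note that the effect of os.chmod() on the execute bit is platform dependent; on Windows only the read-only flag can be changed this way.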
3. How to adjust the text format in a string?
Actual case
A software log file in which the date format is yyyy-mm-dd:
2016-09-15 18:27:26 statu unpacked python3-pip:all
2016-09-15 19:27:26 statu half-configured python3-pip:all
2016-09-15 20:27:26 statu installd python3-pip:all
2016-09-15 21:27:26 configure asdasdasdas:all python3-pip:all
You need to change the dates to the American format mm/dd/yyyy, for example 2016-09-15 --> 09/15/2016. How can this be done?
Solution
Use the regular expression function re.sub() for string replacement.
Use capture groups in the regular expression to capture each part of the date, then rearrange the captured groups in the replacement string.
>>> log = '2016-09-15 18:27:26 statu unpacked python3-pip:all'
>>> import re
# refer to the groups by position
>>> re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\2/\3/\1', log)
'09/15/2016 18:27:26 statu unpacked python3-pip:all'
# refer to the groups by name (the names year, month, and day are illustrative)
>>> re.sub(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', r'\g<month>/\g<day>/\g<year>', log)
'09/15/2016 18:27:26 statu unpacked python3-pip:all'
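Because re.sub() replaces every match, the same pattern converts a whole multi-line log in one call. The sketch below precompiles the pattern; the names DATE_RE and to_us_dates and the group names are assumptions made for illustration.

```python
import re

# Named groups make the replacement string self-describing.
# DATE_RE, to_us_dates, and the group names are illustrative choices.
DATE_RE = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')

def to_us_dates(text):
    """Rewrite every yyyy-mm-dd date in text as mm/dd/yyyy."""
    return DATE_RE.sub(r'\g<month>/\g<day>/\g<year>', text)

log = (
    '2016-09-15 18:27:26 statu unpacked python3-pip:all\n'
    '2016-09-15 19:27:26 statu half-configured python3-pip:all\n'
)
print(to_us_dates(log))
```

The pattern only checks the shape of the date (four digits, two digits, two digits); it does not validate that the month or day is in range.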
4. How to join multiple small strings into one large string?
Actual case
When designing a network program, we customized a UDP-based network protocol and passed a series of parameters to the server in a fixed order:
hwDetect: "<0112>"
gxDepthBits: "<32>"
gxResolution: "<1024x768>"
gxRefresh: "<60>"
fullAlpha: "<1>"
lodDist: "<100.0>"
DistCull: "<500.0>"
In the program, we collect the parameters, in order, into a list:
["<0112>","<32>","<1024x768>","<60>","<1>","<100.0>","<500.0>"]
Finally, we need to splice the parameters into a data packet for sending:
"<0112><32><1024x768><60><1><100.0><500.0>"
Solution
Iterate over the list and concatenate each string in turn with the '+' operator.
>>> result = ''
>>> for n in ["<0112>","<32>","<1024x768>","<60>","<1>","<100.0>","<500.0>"]:
...     result += n
...
>>> result
'<0112><32><1024x768><60><1><100.0><500.0>'
Use the str.join() method to join all the strings in the list in a single call.
>>> result = ''.join(["<0112>","<32>","<1024x768>","<60>","<1>","<100.0>","<500.0>"])
>>> result
'<0112><32><1024x768><60><1><100.0><500.0>'
If the list contains numbers, you can use a generator expression to convert them to strings:
>>> hello = [222,'sd',232,'2e',0.2]
>>> ''.join(str(x) for x in hello)
'222sd2322e0.2'
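The two ideas above, join() and conversion with str(), can be combined into one helper. This is a minimal sketch; build_packet is a hypothetical name, and the code only builds the string, it does not send anything over UDP.

```python
def build_packet(params):
    """Join the parameters, in order, into a single string.

    build_packet is a hypothetical helper name used for illustration.
    """
    # str() lets numbers sit next to strings; join() assembles the result in
    # one call instead of creating a new intermediate string per '+' step.
    return ''.join(str(p) for p in params)

params = ["<0112>", "<32>", "<1024x768>", "<60>", "<1>", "<100.0>", "<500.0>"]
print(build_packet(params))
print(build_packet([222, 'sd', 232, '2e', 0.2]))
```

For a handful of short strings the difference between '+' and join() is negligible; join() is generally preferred when the list is long, since repeated '+' may copy the growing result on each step.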
5. How to left-align, right-align, and center a string?
Actual case
A dictionary stores a series of attribute values:
{ 'ip':'127.0.0.1', 'blog': 'www.anshengme.com', 'title': 'Hello world', 'port': '80' }
In the program, how do we output the content in the following format?
ip    : 127.0.0.1
blog  : www.anshengme.com
title : Hello world
port  : 80
Solution
Use the string methods str.ljust(), str.rjust(), and str.center() for left, right, and center alignment.
>>> info = {'ip': '127.0.0.1', 'blog': 'www.anshengme.com', 'title': 'Hello world', 'port': '80'}
# obtain the maximum length of the keys in the dictionary
>>> max(map(len, info.keys()))
5
>>> w = max(map(len, info.keys()))
>>> for k in info:
...     print(k.ljust(w), ':', info[k])
...
# result (the key order depends on the Python version)
port  : 80
blog  : www.anshengme.com
ip    : 127.0.0.1
title : Hello world
Use the built-in format() function, passing a format specification such as '<20', '>20', or '^20' (left, right, or center alignment in a width of 20), to accomplish the same kind of task.
>>> for k in info:
...     print(format(k, '^' + str(w)), ':', info[k])
...
port  : 80
blog  : www.anshengme.com
 ip   : 127.0.0.1
title : Hello world
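The same alignment flags also work inside str.format(), where the flag and the width can be supplied as nested fields. The sketch below assumes Python 3.7 or later, where dictionaries keep insertion order; format_table is a hypothetical helper name.

```python
info = {'ip': '127.0.0.1', 'blog': 'www.anshengme.com',
        'title': 'Hello world', 'port': '80'}

def format_table(mapping, align='<'):
    """Return 'key : value' lines with keys padded to the longest key.

    format_table is a hypothetical helper; align is '<', '>', or '^'.
    """
    width = max(len(key) for key in mapping)
    lines = []
    for key, value in mapping.items():
        # The nested {align} and {width} fields build the format spec, e.g. '<5'.
        lines.append('{:{align}{width}} : {}'.format(key, value,
                                                     align=align, width=width))
    return '\n'.join(lines)

print(format_table(info))        # left-aligned keys
print(format_table(info, '>'))   # right-aligned keys
```

Computing the width from the keys, rather than hard-coding 20, keeps the colon column tight regardless of which keys the dictionary holds.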
6. How do I remove unnecessary characters from a string?
Actual case
Filter out the extra whitespace around user input: ' anshengm.com@gmail.com '
Filter out the '\r' in text edited in a Windows environment: 'hello word\r\n'
Remove the Unicode combining marks (tones) from text: 'ní hǎo, chī fàn'
Solution
The string methods strip(), lstrip(), and rstrip() remove characters from both ends, the left end, and the right end of a string, respectively.
>>> email = ' anshengm.com@gmail.com '
>>> email.strip()
'anshengm.com@gmail.com'
>>> email.lstrip()
'anshengm.com@gmail.com '
>>> email.rstrip()
' anshengm.com@gmail.com'
To delete a character at a fixed position, you can slice around it and concatenate the pieces.
# example value for s (assumed; the original did not show it)
>>> s = 'abc:123'
>>> s[:3] + s[4:]
'abc123'
Use the string replace() method or the regular expression function re.sub() to delete characters anywhere in the string.
>>> s = '\tabc\t123\txyz'
>>> s.replace('\t', '')
'abc123xyz'
Use re.sub() to delete several different characters at once:
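>>> import re
# example value for string (assumed; the original did not show it)
>>> string = '\tabc\t123\txyz\ropq\r'
>>> re.sub('[\t\r]','', string)
'abc123xyzopq'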
The string translate() method can map or delete several characters at the same time.
# Python 2: map characters with string.maketrans()
>>> import string
>>> s = 'abc123xyz'
>>> s.translate(string.maketrans('abcxyz','xyzabc'))
'xyz123abc'
# Python 2: the second argument lists the characters to delete
>>> s = '\rasd\t23\bAds'
>>> s.translate(None, '\r\t\b')
'asd23Ads'
# python2.7: delete the combining tone marks from a unicode string
>>> i = u'ní hǎo, chī fàn'
>>> i
u'ni\u0301 ha\u030co, chi\u0304 fa\u0300n'
>>> i.translate(dict.fromkeys([0x0301, 0x030c, 0x0304, 0x0300]))
u'ni hao, chi fan'
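The translate() calls above use the Python 2 API. In Python 3, string.maketrans() and the two-argument form of translate() no longer exist for str, so a rough equivalent might look like the sketch below. The helper names remove_chars and strip_tone_marks are hypothetical, and stripping tones through unicodedata is a swapped-in technique rather than the article's hard-coded code point list.

```python
import unicodedata

def remove_chars(text, unwanted):
    """Delete every character of unwanted from text (hypothetical helper)."""
    # The third argument of str.maketrans lists characters to map to None.
    return text.translate(str.maketrans('', '', unwanted))

def strip_tone_marks(text):
    """Drop combining marks such as tone accents (hypothetical helper)."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize('NFD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

# Character mapping still works, via str.maketrans instead of string.maketrans.
print('abc123xyz'.translate(str.maketrans('abcxyz', 'xyzabc')))
print(remove_chars('\rasd\t23\bAds', '\r\t\b'))
print(strip_tone_marks('ní hǎo, chī fàn'))
```

Normalizing to NFD first means the function handles both precomposed characters (a single code point for 'í') and already decomposed input.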
Summary
The above covers string processing techniques in Python. Each topic is presented as a case, a solution, and an example, and should be a useful reference for anyone learning or using Python.