This article shares techniques for processing strings in Python: splitting a string on multiple delimiters, checking whether string a starts or ends with string b, changing the format of text in a string, joining many small strings into one large string, aligning strings, and removing unwanted characters. Interested readers can follow along below.
I. How do I split a string that contains multiple delimiters?
Practical cases
We want to split a string into its fields, where the fields are separated by several different delimiters, for example:
s = 'asd;aad|dasd|dasd,sdasd|asd,,adas|sdasd;asdasd,d|asd'
Here the characters , ; | and \t are all delimiters. How do we handle them?
Solution
Call the split() method repeatedly, handling one delimiter per pass:
# Runs on Python 2 and Python 3
def mysplit(s, ds):
    res = [s]
    for d in ds:
        # split every current piece on delimiter d
        t = []
        for x in res:
            t.extend(x.split(d))
        res = t
    # drop the empty strings produced by adjacent delimiters
    return [x for x in res if x]

s = 'asd;aad|dasd|dasd,sdasd|asd,,adas|sdasd;asdasd,d|asd'
result = mysplit(s, ';,|\t')
print(result)

C:\users\administrator>c:\python\python27\python.exe E:\python-intensive-training\s2.py
['asd', 'aad', 'dasd', 'dasd', 'sdasd', 'asd', 'adas', 'sdasd', 'asdasd', 'd', 'asd']
Use the regular-expression function re.split() to split the string in a single pass:
>>> import re
>>> re.split('[,;\t|]+', 'asd;aad|dasd|dasd,sdasd|asd,,adas|sdasd;asdasd,d|asd')
['asd', 'aad', 'dasd', 'dasd', 'sdasd', 'asd', 'adas', 'sdasd', 'asdasd', 'd', 'asd']
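As a small extension of the re.split() approach, the sketch below wraps it in a helper that builds the character class from a string of delimiters and drops the empty fields that appear when the input starts or ends with a delimiter. The helper name split_fields and its default delimiter set are assumptions made for illustration; they are not from the original example.

```python
import re


def split_fields(text, delimiters=",;|\t"):
    """Split text on any run of the given delimiter characters."""
    # re.escape keeps characters such as ']' or '-' literal inside the class.
    pattern = "[" + re.escape(delimiters) + "]+"
    # A leading or trailing delimiter yields an empty string; filter those out.
    return [field for field in re.split(pattern, text) if field]


print(split_fields(";asd;aad|dasd,,adas\t"))
```

Escaping the delimiters means the caller can pass any characters without worrying about regular-expression syntax.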
II. How do I check whether string a starts or ends with string b?
Practical cases
If a directory has the following files:
quicksort.c graph.py Heap.java install.sh stack.cpp ...
Now we need to grant executable permission to the files in this directory whose names end with .sh or .py.
Solution
Use the string methods startswith() and endswith(). Both accept either a single string or a tuple of strings to test against.
>>> import os, stat
>>> os.listdir('./')
['Heap.java', 'quicksort.c', 'stack.cpp', 'install.sh', 'graph.py']
# keep only the names that end with one of the suffixes in the tuple
>>> [name for name in os.listdir('./') if name.endswith(('.sh', '.py'))]
['install.sh', 'graph.py']
# add the owner-execute bit to the file's current mode
>>> os.chmod('install.sh', os.stat('install.sh').st_mode | stat.S_IXUSR)
[root@iz28i253je0z t]# ls -l install.sh
-rwxr--r-- 1 root root 0 Sep 18:13 install.sh
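The session above changes only install.sh. A minimal sketch of applying the same idea to every matching file in a directory might look like the following; the function name make_executable and its parameters are hypothetical, and the sketch assumes a POSIX-style file system where the owner-execute bit is meaningful.

```python
import os
import stat


def make_executable(directory, suffixes=(".sh", ".py")):
    """Add the owner-execute bit to files whose names end with a suffix."""
    changed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # endswith() accepts a tuple, so one call covers every suffix.
        if os.path.isfile(path) and name.endswith(suffixes):
            mode = os.stat(path).st_mode
            os.chmod(path, mode | stat.S_IXUSR)
            changed.append(name)
    return changed
```

Calling make_executable('./') would return the names it changed, which makes the effect easy to check before relying on it.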
III. How do I change the format of text in a string?
Practical cases
A program's log file records dates in the format yyyy-mm-dd:
2016-09-15 18:27:26 statu unpacked python3-pip:all
2016-09-15 19:27:26 statu half-configured python3-pip:all
2016-09-15 20:27:26 statu installd python3-pip:all
2016-09-15 21:27:26 configure asdasdasdas:all python3-pip:all
How do we change the dates to the US format mm/dd/yyyy, for example 2016-09-15 --> 09/15/2016?
Solution
Use the regular-expression function re.sub() to do the string substitution.
The capturing groups in the pattern capture each part of the date, and the replacement string rearranges those groups into the new order.
>>> log = '2016-09-15 18:27:26 statu unpacked python3-pip:all'
>>> import re
# refer to the groups by number
>>> re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\2/\3/\1', log)
'09/15/2016 18:27:26 statu unpacked python3-pip:all'
# refer to the groups by name
>>> re.sub(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', r'\g<month>/\g<day>/\g<year>', log)
'09/15/2016 18:27:26 statu unpacked python3-pip:all'
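Because re.sub() replaces every match, the same substitution can be applied to a whole log at once. The sketch below assumes the log text has already been read into a string; the names DATE_RE and to_us_dates are illustrative only.

```python
import re

# Named groups make the replacement string self-describing.
DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def to_us_dates(text):
    """Rewrite every yyyy-mm-dd date in text as mm/dd/yyyy."""
    return DATE_RE.sub(r"\g<month>/\g<day>/\g<year>", text)


log = (
    "2016-09-15 18:27:26 statu unpacked python3-pip:all\n"
    "2016-09-15 19:27:26 statu half-configured python3-pip:all"
)
print(to_us_dates(log))
```

Compiling the pattern once is a convenience when the function is called many times; for a single call, re.sub() with the pattern string works just as well.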
IV. How do I join multiple small strings into one large string?
Practical cases
When designing a network program, we defined a custom UDP-based protocol that passes a series of parameters to the server in a fixed order:
Hwdetect: "<0112>"
gxdepthbits: "<32>"
gxresolution: "<1024x768>"
Gxrefresh: "<60>"
Fullalpha: "<1>"
loddist: "<100.0>"
Distcull: "<500.0>"
In the program, we collect the parameters in that order in a list:
["<0112>", "<32>", "<1024x768>", "<60>", "<1>", "<100.0>", "<500.0>"]
Finally, we need to join the parameters into a single packet to send:
"<0112><32><1024x768><60><1><100.0><500.0>"
Solution
Iterate over the list and concatenate each string in turn with the + operator:
>>> result = ''
>>> for n in ["<0112>", "<32>", "<1024x768>", "<60>", "<1>", "<100.0>", "<500.0>"]:
...     result += n
...
>>> result
'<0112><32><1024x768><60><1><100.0><500.0>'
Use the str.join() method, which joins all the strings in the list in one call and is generally faster for long lists because it avoids building an intermediate string at every step:
>>> result = ''.join(["<0112>", "<32>", "<1024x768>", "<60>", "<1>", "<100.0>", "<500.0>"])
>>> result
'<0112><32><1024x768><60><1><100.0><500.0>'
If the list contains numbers, you can convert them with a generator expression:
>>> hello = [222, 'sd', 232, '2e', 0.2]
>>> ''.join(str(x) for x in hello)
'222sd2322e0.2'
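Combining str.join() with a generator expression, the packet from the practical case could be assembled from raw values in one step. This is only a sketch: the function name build_packet is hypothetical, and wrapping each value in angle brackets is an assumption based on the packet format shown earlier.

```python
def build_packet(values):
    """Wrap each value in <...> and join everything into one string."""
    # str.format() converts numbers to text, so mixed types are fine.
    return "".join("<{}>".format(value) for value in values)


params = ["0112", 32, "1024x768", 60, 1, 100.0, 500.0]
print(build_packet(params))
```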
V. How do I left-align, right-align, and center a string?
Practical cases
A dictionary stores a series of property values:
{'ip': '127.0.0.1', 'blog': 'www.anshengme.com', 'title': 'Hello world', 'port': '80'}
In the program, we want to print its contents in the following aligned format. How do we do that?
ip    : 127.0.0.1
blog  : www.anshengme.com
title : Hello world
port  : 80
Solution
Use the string methods str.ljust(), str.rjust(), and str.center() to left-align, right-align, and center the text:
>>> info = {'ip': '127.0.0.1', 'blog': 'www.anshengme.com', 'title': 'Hello world', 'port': '80'}
# get the maximum length of the keys in the dictionary
>>> max(map(len, info.keys()))
5
>>> w = max(map(len, info.keys()))
>>> for k in info:
...     print(k.ljust(w), ':', info[k])
...
# results
port  : 80
blog  : www.anshengme.com
ip    : 127.0.0.1
title : Hello world
Use the built-in format() function, passing a format specification such as '<20', '>20', or '^20' to do the same job:
>>> for k in info:
...     print(format(k, '^' + str(w)), ':', info[k])
...
port  : 80
blog  : www.anshengme.com
 ip   : 127.0.0.1
title : Hello world
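The same alignment codes also work inside str.format() templates, where the width can be supplied as a nested field. The sketch below is one possible way to build the aligned lines as a list instead of printing them directly; the function name format_lines is an assumption for illustration.

```python
def format_lines(info):
    """Return 'key : value' lines with the keys left-aligned to one width."""
    width = max(len(key) for key in info)
    # '<' left-aligns; the nested {w} field supplies the width at run time.
    return ["{:<{w}} : {}".format(key, value, w=width)
            for key, value in info.items()]


info = {"ip": "127.0.0.1", "title": "Hello world"}
for line in format_lines(info):
    print(line)
```

Replacing '<' with '>' or '^' in the template right-aligns or centers the keys instead.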
VI. How do I remove unwanted characters from a string?
Practical cases
Strip the extra whitespace around an email address entered by a user: '  anshengm.com@gmail.com  '
Remove the '\r' from text edited on Windows: 'hello world\r\n'
Remove Unicode combining marks (tone marks) from text: 'ní hǎo, chī fàn'
Solution
The string methods strip(), lstrip(), and rstrip() remove characters from both ends, the left end, and the right end of a string, respectively:
>>> email = '  anshengm.com@gmail.com  '
>>> email.strip()
'anshengm.com@gmail.com'
>>> email.lstrip()
'anshengm.com@gmail.com  '
>>> email.rstrip()
'  anshengm.com@gmail.com'
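By default these three methods remove whitespace, but each also accepts an optional string of characters to remove instead. The sketch below shows both uses side by side; the function name trim_variants and the sample inputs are assumptions for illustration.

```python
def trim_variants(text, chars=None):
    """Show what strip(), lstrip() and rstrip() each return for text."""
    # chars=None means "remove whitespace"; otherwise remove any of chars.
    return {
        "both": text.strip(chars),
        "left": text.lstrip(chars),
        "right": text.rstrip(chars),
    }


print(trim_variants("  anshengm.com@gmail.com  "))
print(trim_variants("--abc++", "-+"))
```

Note that the chars argument is treated as a set of characters, not as a prefix or suffix to match as a whole.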
To delete a character at a fixed position, use slicing plus concatenation:
>>> s = 'abc:123'  # example value (assumed)
>>> s[:3] + s[4:]
'abc123'
The string's replace() method or the regular-expression function re.sub() removes characters at any position:
>>> s = '\tabc\t123\txyz'
>>> s.replace('\t', '')
'abc123xyz'
Use re.sub() to delete several different characters at once:
>>> import re
>>> s = '\tabc\t123\txyz\ropq\r'  # example value (assumed)
>>> re.sub('[\t\r]', '', s)
'abc123xyzopq'
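For the Windows text in the practical case, deleting '\r' outright is one option; another is to convert every line ending to '\n' so that a lone '\r' still ends a line. The following is a minimal sketch of that variant, with normalize_newlines as a hypothetical helper name.

```python
import re


def normalize_newlines(text):
    """Convert Windows (\\r\\n) and bare \\r line endings to \\n."""
    # '\r\n?' matches a carriage return with or without a following newline.
    return re.sub(r"\r\n?", "\n", text)


print(repr(normalize_newlines("hello world\r\n")))
```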
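The string's translate() method can replace or delete many different characters at once. The examples below use Python 2.7; in Python 3, string.maketrans() is replaced by str.maketrans(), and translate() takes only a mapping table.
# Python 2.7: map a->x, b->y, c->z and x->a, y->b, z->c
>>> import string
>>> s = 'abc123xyz'
>>> s.translate(string.maketrans('abcxyz', 'xyzabc'))
'xyz123abc'
# Python 2.7: pass None as the table and list the characters to delete
>>> s = '\rasd\t23\bads'
>>> s.translate(None, '\r\t\b')
'asd23ads'
# Python 2.7: for unicode strings, map each combining mark's code point to None
>>> i = u'ní hǎo, chī fàn'
>>> i
u'ni\u0301 ha\u030co, chi\u0304 fa\u0300n'
>>> i.translate(dict.fromkeys([0x0301, 0x030c, 0x0304, 0x0300]))
u'ni hao, chi fan'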
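For readers on Python 3, a rough equivalent of the translate() examples can be sketched as follows. It uses str.maketrans(), whose third argument lists characters to delete, and the standard unicodedata module to decompose accented letters before dropping the combining marks. The function names delete_chars and strip_tone_marks are illustrative assumptions.

```python
import unicodedata


def delete_chars(text, unwanted):
    """Delete every character of `unwanted` from text (Python 3)."""
    # The third argument of str.maketrans() lists characters to remove.
    return text.translate(str.maketrans("", "", unwanted))


def strip_tone_marks(text):
    """Remove combining marks such as tone marks from text."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(delete_chars("\rasd\t23\bads", "\r\t\b"))
print(strip_tone_marks("n\u00ed h\u01ceo, ch\u012b f\u00e0n"))
```

Normalizing first means the function works whether the input stores accented letters as single code points or as a base letter followed by a combining mark.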
Summary
This article has collected several Python string-processing techniques, each presented through a practical case, a solution, and example code. We hope it serves as a useful reference for anyone learning or using Python.