"Python advanced" 02, Text Processing and IO in-depth understanding

Last Update:2016-11-09 Source: Internet

Author: User

1. A file contains words separated by spaces, semicolons, commas, or periods. Extract all the words.

Solution:

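Use \w to match and extract words, although this misjudges some cases (the apostrophe in "don't", for example, is not matched by \w, so the word is split in two).

Use str.split to separate the string, although each call handles only one separator, so several separators need several passes.

Use re.split to separate the string on all the separators at once.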
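The limits of the first two approaches can be seen in a minimal Python 3 sketch. The variable names are illustrative and not from the original article, and the str.split workaround shown (replacing the other separators with spaces first) is only one possible way to do it.

```python
import re

text = "I'm xj, i love python,,linux; i don't like windows."

# \w+ does not match the apostrophe, so contractions are split apart.
words_w = re.findall(r"\w+", text)
print(words_w[:3])   # ['I', 'm', 'xj']

# str.split handles one separator per call, so the other separators are
# normalized to spaces first, then the string is split on whitespace.
normalized = text
for sep in ",;.":
    normalized = normalized.replace(sep, " ")
words_split = normalized.split()
print(words_split)
```

The second approach gives the right words but needs one pass per separator, which is what re.split avoids.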

In [4]: help(re.split)
Help on function split in module re:

split(pattern, string, maxsplit=0, flags=0)
    Split the source string by the occurrences of the pattern,
    returning a list containing the resulting substrings.

In [23]: text = "I'm xj, i love python,,linux; i don't like windows."

In [24]: fs = re.split(r"(,|\.|;|\s)+\s*", text)

In [25]: fs
Out[25]: ["I'm", ' ', 'xj', ' ', 'i', ' ', 'love', ' ', 'python', ',', 'linux', ' ', 'i', ' ', "don't", ' ', 'like', ' ', 'windows', '.', '']

In [26]: fs[::2]     # extract the words
Out[26]: ["I'm", 'xj', 'i', 'love', 'python', 'linux', 'i', "don't", 'like', 'windows', '']

In [27]: fs[1::2]    # extract the separators
Out[27]: [' ', ' ', ' ', ' ', ',', ' ', ' ', ' ', ' ', '.']

Because the pattern contains a capturing group, re.split also returns the separators: each odd position holds the last separator matched by the group, and the period at the end of the text leaves an empty string as the last element.

re.findall gives the words and the separators directly:

In [53]: fs = re.findall(r"[^,\.;\s]+", text)

In [54]: fs
Out[54]: ["I'm", 'xj', 'i', 'love', 'python', 'linux', 'i', "don't", 'like', 'windows']

In [55]: fh = re.findall(r'[,\.;\s]', text)

In [56]: fh
Out[56]: [' ', ',', ' ', ' ', ' ', ',', ',', ';', ' ', ' ', ' ', ' ', '.']
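If only the words are needed, a character class without a capturing group keeps the separators out of the result. The following is a minimal Python 3 sketch; the helper name split_words is hypothetical and not part of the original article.

```python
import re

def split_words(text):
    """Split text on commas, periods, semicolons and whitespace."""
    # No capturing group, so re.split does not return the separators.
    parts = re.split(r"[,.;\s]+", text)
    # A separator at the very start or end leaves an empty string; drop those.
    return [p for p in parts if p]

text = "I'm xj, i love python,,linux; i don't like windows."
words = split_words(text)
print(words)
```

Filtering the empty strings removes the trailing '' that the period at the end of the sentence would otherwise produce.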

2. A directory holds a number of files. Find all the C source files among them (.c and .h).

Solution:

Use os.listdir to list the directory.

Use str.endswith to test each file name.

In [13]: s = "xj.c"

In [14]: s.endswith(".c")
Out[14]: True

In [15]: s.endswith(".h")
Out[15]: False

In [16]: import os

In [17]: os.listdir("/usr/include/")
Out[17]: ['libmng.h', 'netipx', 'ft2build.h', 'FlexLexer.h', 'selinux', 'QtSql', 'resolv.h', 'gio-unix-2.0', 'wctype.h', 'python2.6', 'scsi', ... 'QtOpenGL', 'mysql', 'byteswap.h', 'xj.c', 'mntent.h', 'semaphore.h', 'stdio_ext.h', 'libxml2']

In [21]: for filename in os.listdir("/usr/include"):
    if filename.endswith(".c"):
        print(filename)
   ....:
xj.c

In [22]: for filename in os.listdir("/usr/include"):
    if filename.endswith((".c", ".h")):    # a tuple of suffixes means "any of these"
        print(filename)
   ....:
libmng.h
ft2build.h
FlexLexer.h
nss.h
png.h
utime.h
ieee754.h
features.h
xj.c
...
verto-module.h
semaphore.h
stdio_ext.h
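The same idea can be wrapped in a small function. This is a minimal Python 3 sketch: the function name list_c_sources and the file names in the demonstration are made up for illustration, and the extra os.path.isfile check (to skip directories) is an addition that the original session does not perform.

```python
import os
import tempfile

def list_c_sources(directory):
    """Return the sorted names of .c and .h files directly inside directory."""
    return sorted(
        name for name in os.listdir(directory)
        # endswith accepts a tuple; the name matches if any suffix matches.
        if name.endswith((".c", ".h"))
        and os.path.isfile(os.path.join(directory, name))
    )

# Demonstration on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("xj.c", "stdio_ext.h", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "scsi.h"))  # a directory is skipped despite its name
    found = list_c_sources(tmp)

print(found)  # ['stdio_ext.h', 'xj.c']
```

Note that os.listdir does not descend into subdirectories, so only the top level of the directory is searched.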

3. fnmatch module

The fnmatch module supports shell-style wildcard characters.

In [24]: help(fnmatch.fnmatch)    # case sensitivity follows the operating system
Help on function fnmatch in module fnmatch:

fnmatch(name, pat)
    Test whether FILENAME matches PATTERN.

    Patterns are Unix shell style:

    *       matches everything
    ?       matches any single character
    [seq]   matches any character in seq
    [!seq]  matches any char not in seq

    An initial period in FILENAME is not special.
    Both FILENAME and PATTERN are first case-normalized
    if the operating system requires it.
    If you don't want this, use fnmatchcase(FILENAME, PATTERN).

In [47]: fnmatch.fnmatch("sba.txt", "*txt")
Out[47]: True

In [48]: fnmatch.fnmatch("sba.txt", "*t")
Out[48]: True

In [49]: fnmatch.fnmatch("sba.txt", "*b")
Out[49]: False

In [50]: fnmatch.fnmatch("sba.txt", "*b*")
Out[50]: True
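Two other functions of the same module are worth a short look: fnmatchcase, which the help text mentions, and fnmatch.filter, which applies one pattern to a list of names. The following Python 3 sketch uses made-up file names.

```python
import fnmatch

names = ["sba.txt", "README.TXT", "xj.c", "stdio_ext.h"]

# fnmatchcase never case-normalizes, so the result is the same on every OS.
upper_matches = fnmatch.fnmatchcase("README.TXT", "*.txt")
lower_matches = fnmatch.fnmatchcase("sba.txt", "*.txt")
print(upper_matches, lower_matches)  # False True

# [seq] matches one character from the set: here, names ending in .c or .h.
c_sources = [n for n in names if fnmatch.fnmatchcase(n, "*.[ch]")]
print(c_sources)  # ['xj.c', 'stdio_ext.h']

# fnmatch.filter applies one pattern to a whole list, using the same
# OS-dependent case rules as fnmatch.fnmatch.
starts_with_s = fnmatch.filter(names, "s*")
print(starts_with_s)
```

Using fnmatchcase is the safer choice when a script must behave identically on Linux and Windows.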

Case: you have a program that processes files. The file names are entered by the user, and the program needs to support the same wildcard characters as the shell.

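[src]# cat test1.py
#!/usr/local/bin/python2.7
#coding:utf-8
import os
import sys
from fnmatch import fnmatch

# sys.argv[1] is the directory, sys.argv[2] is the shell-style pattern
ret = [name for name in os.listdir(sys.argv[1]) if fnmatch(name, sys.argv[2])]
print(ret)

[src]# python2.7 test1.py /usr/include/ "*.c"
['xj.c']

The pattern is quoted so that the shell passes it to the script unexpanded instead of replacing it with matching file names from the current directory.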
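A Python 3 variant of the same script might look like the sketch below. The function names match_names and main, the usage message, and the demonstration file names are assumptions made for illustration; only os.listdir and fnmatch come from the original script.

```python
import os
import sys
import tempfile
from fnmatch import fnmatch

def match_names(directory, pattern):
    """Return the names in directory that match a shell-style pattern."""
    return sorted(name for name in os.listdir(directory) if fnmatch(name, pattern))

def main(argv):
    # Expected call: script.py DIRECTORY PATTERN (quote the pattern in the shell).
    if len(argv) != 3:
        print("usage: {} DIRECTORY PATTERN".format(argv[0]), file=sys.stderr)
        return 2
    print(match_names(argv[1], argv[2]))
    return 0

# Self-check on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("xj.c", "sba.txt"):
        open(os.path.join(tmp, name), "w").close()
    demo = match_names(tmp, "*.c")

print(demo)  # ['xj.c']
```

In a real script, calling sys.exit(main(sys.argv)) under the usual __main__ guard would connect main to the command line.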

4. re.sub() text substitution

In [53]: help(re.sub)
Help on function sub in module re:

sub(pattern, repl, string, count=0, flags=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.  repl can be either a string or a callable;
    if a string, backslash escapes in it are processed.  If it is
    a callable, it's passed the match object and must return
    a replacement string to be used.

Case: a text contains dates in %m/%d/%Y format, and all of them need to be converted to %Y-%m-%d format.

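In [55]: text = "Today is 11/08/2016, next class time 11/15/2016"

In [56]: new_text = re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', text)

In [57]: new_text
Out[57]: 'Today is 2016-11-08, next class time 2016-11-15'

Group 1 is the month, group 2 the day, and group 3 the year, so the replacement must be \3-\1-\2. Writing \3-\2-\1 would swap month and day and produce an invalid date such as 2016-15-11.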
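As the help text notes, repl can also be a callable. The Python 3 sketch below uses that to validate each date with datetime.strptime before rewriting it; the names DATE_RE and to_iso are hypothetical, and the stricter pattern (one or two digits for month and day, four for the year) is an assumption about what the dates look like.

```python
import re
from datetime import datetime

DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

def to_iso(match):
    """Rewrite one MM/DD/YYYY match as YYYY-MM-DD; leave invalid dates alone."""
    try:
        parsed = datetime.strptime(match.group(0), "%m/%d/%Y")
    except ValueError:
        return match.group(0)  # e.g. 13/40/2016 is not a real date
    return parsed.strftime("%Y-%m-%d")

text = "Today is 11/08/2016, next class time 11/15/2016"
new_text = DATE_RE.sub(to_iso, text)
print(new_text)  # Today is 2016-11-08, next class time 2016-11-15
```

A callable costs a few more lines than a replacement string, but it can reject strings that merely look like dates.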

5. str.format string formatting

Case: you need a small template engine that requires no logic control, only variables to fill in the template.

In [58]: help(str.format)
Help on method_descriptor:

format(...)
    S.format(*args, **kwargs) -> string

    Return a formatted version of S, using substitutions from args and kwargs.
    The substitutions are identified by braces ('{' and '}').
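One possible shape for such a template filler is sketched below in Python 3. The class KeepMissing, the function render, and the template text are all hypothetical; the sketch relies on str.format_map, which passes the mapping through directly so that a dict subclass can decide what to do with names that were not supplied.

```python
class KeepMissing(dict):
    """Mapping that leaves unknown placeholders in the output untouched."""
    def __missing__(self, key):
        return "{" + key + "}"

def render(template, **variables):
    # format_map uses the mapping as-is instead of unpacking it into keyword
    # arguments, so __missing__ is honored for names that were not supplied.
    return template.format_map(KeepMissing(variables))

template = "Hello {name}, welcome to {course}!"

full = template.format(name="xj", course="Python advanced")
partial = render(template, name="xj")
print(full)     # Hello xj, welcome to Python advanced!
print(partial)  # Hello xj, welcome to {course}!
```

Plain str.format raises KeyError when a variable is missing; whether to fail or to leave the placeholder in place is a design choice for the template engine.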

This article is from the "Xiexiaojun" blog. Please keep this source attribution: http://xiexiaojun.blog.51cto.com/2305291/1870832

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
