"Python advanced" 02, Text Processing and IO in-depth understanding


1. Given a file in which words are separated by spaces, semicolons, commas, or periods, extract all the words.

Solution:

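Use \w to match and extract the words, although this produces false matches (an apostrophe, for example, cuts a word in two)

Use str.split to split the string, although it accepts only one separator at a time, so every separator needs its own pass

Split the string with re.split, which handles all the separators in one pattern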
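The drawbacks of the first two approaches can be seen in a short sketch. It is written for Python 3 (the sessions in this article are Python 2), and the variable names are only illustrative:

```python
import re

text = "i'm xj, i love python,,linux; i don't like windows."

# Approach 1: \w+ does not match the apostrophe, so contractions are cut in two.
words_w = re.findall(r"\w+", text)

# Approach 2: str.split takes a single separator, so first turn every
# punctuation separator into a space, then split on runs of whitespace.
normalised = text
for sep in ",;.":
    normalised = normalised.replace(sep, " ")
words_split = normalised.split()

print(words_w)       # starts with 'i', 'm' instead of "i'm"
print(words_split)   # keeps "i'm" and "don't" intact
```

The second approach works, but it needs one pass over the string per separator, which is what a single regular expression avoids.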

In [4]: help(re.split)
Help on function split in module re:

split(pattern, string, maxsplit=0, flags=0)
    Split the source string by the occurrences of the pattern,
    returning a list containing the resulting substrings.

In [23]: text = "i'm xj, i love python,,linux; i don't like windows."

In [24]: fs = re.split(r"(,|\.|;|\s)+\s*", text)

In [25]: fs
Out[25]: ["i'm", ' ', 'xj', ' ', 'i', ' ', 'love', ' ', 'python', ',', 'linux', ' ', 'i', ' ', "don't", ' ', 'like', ' ', 'windows', '.', '']

In [26]: fs[::2]             # extract the words
Out[26]: ["i'm", 'xj', 'i', 'love', 'python', 'linux', 'i', "don't", 'like', 'windows', '']

In [27]: fs[1::2]            # extract the separators
Out[27]: [' ', ' ', ' ', ' ', ',', ' ', ' ', ' ', ' ', '.']

Because the pattern contains a capturing group, re.split also returns the last separator matched at each split point, which is why words and separators alternate in the result. re.findall can extract either the words or the separators directly:

In [53]: fs = re.findall(r"[^,\.;\s]+", text)

In [54]: fs
Out[54]: ["i'm", 'xj', 'i', 'love', 'python', 'linux', 'i', "don't", 'like', 'windows']

In [55]: fh = re.findall(r'[,\.;\s]', text)

In [56]: fh
Out[56]: [' ', ',', ' ', ' ', ' ', ',', ',', ';', ' ', ' ', ' ', ' ', '.']
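Since the task starts from a file, here is a minimal Python 3 sketch that applies the same idea line by line. The function name extract_words and the constant SEPARATORS are hypothetical, and io.StringIO stands in for a real file object so the example is self-contained:

```python
import io
import re

# One or more of: comma, period, semicolon, or any whitespace character.
SEPARATORS = re.compile(r"[,.;\s]+")

def extract_words(stream):
    """Return every word in a text stream, splitting on , . ; and whitespace."""
    words = []
    for line in stream:
        # A character class is not a capturing group, so re.split returns only
        # the pieces between separators; empty pieces at the edges are dropped.
        words.extend(piece for piece in SEPARATORS.split(line) if piece)
    return words

sample = io.StringIO("i'm xj, i love python,,linux;\ni don't like windows.\n")
print(extract_words(sample))
```

With a real file, the same function can be called as extract_words(open(path)), ideally inside a with block so the file is closed afterwards.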

2. Given a directory that contains a number of files, find all the C source files (.c and .h).

Solution:

Use os.listdir to list the entries in the directory

Use str.endswith to test the suffix of each name

In [13]: s = "xj.c"

In [14]: s.endswith(".c")
Out[14]: True

In [15]: s.endswith(".h")
Out[15]: False

In [16]: import os

In [17]: os.listdir("/usr/include/")
Out[17]:
['libmng.h', 'netipx', 'ft2build.h', 'FlexLexer.h', 'selinux',
 'QtSql', 'resolv.h', 'gio-unix-2.0', 'wctype.h', 'python2.6', 'scsi',
 ...
 'QtOpenGL', 'mysql', 'byteswap.h', 'xj.c', 'mntent.h',
 'semaphore.h', 'stdio_ext.h', 'libxml2']

In [21]: for filename in os.listdir("/usr/include"):
   ....:     if filename.endswith(".c"):
   ....:         print filename
   ....:
xj.c

In [22]: for filename in os.listdir("/usr/include"):
   ....:     if filename.endswith((".c", ".h")):    # a tuple of suffixes is an OR test
   ....:         print filename
   ....:
libmng.h
ft2build.h
FlexLexer.h
nss.h
png.h
utime.h
ieee754.h
features.h
xj.c
...
verto-module.h
semaphore.h
stdio_ext.h
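The loop above can be wrapped in a small reusable function. This is only a sketch for Python 3: the name c_sources and the extra os.path.isfile check (which skips directories that happen to end in .h) are assumptions, not part of the original session, and the demo uses a temporary directory rather than /usr/include:

```python
import os
import tempfile

def c_sources(directory, suffixes=(".c", ".h")):
    """Return the sorted names of regular files in `directory` ending in one of `suffixes`."""
    return sorted(
        name
        for name in os.listdir(directory)
        # str.endswith accepts a tuple of suffixes, not a list.
        if name.endswith(suffixes)
        and os.path.isfile(os.path.join(directory, name))
    )

# Demo on a throwaway directory so the result does not depend on the machine.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("xj.c", "stdio_ext.h", "notes.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    os.mkdir(os.path.join(demo_dir, "python.h"))  # a directory, so it is skipped
    found = c_sources(demo_dir)

print(found)  # ['stdio_ext.h', 'xj.c']
```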


3. The fnmatch module

The fnmatch module supports shell-style wildcards.

In [24]: help(fnmatch.fnmatch)            # case sensitivity follows the operating system
Help on function fnmatch in module fnmatch:

fnmatch(name, pat)
    Test whether FILENAME matches PATTERN.

    Patterns are Unix shell style:

    *       matches everything
    ?       matches any single character
    [seq]   matches any character in seq
    [!seq]  matches any char not in seq

    An initial period in FILENAME is not special.
    Both FILENAME and PATTERN are first case-normalized
    if the operating system requires it.
    If you don't want this, use fnmatchcase(FILENAME, PATTERN).

In [47]: fnmatch.fnmatch("sba.txt", "*txt")
Out[47]: True

In [48]: fnmatch.fnmatch("sba.txt", "*t")
Out[48]: True

In [49]: fnmatch.fnmatch("sba.txt", "*b")
Out[49]: False

In [50]: fnmatch.fnmatch("sba.txt", "*b*")
Out[50]: True
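The session only exercises the * wildcard. The short Python 3 sketch below tries the other pattern forms from the help text. It uses fnmatch.fnmatchcase so that the results do not depend on the operating system's case rules; the pattern list is just an illustration:

```python
import fnmatch

name = "sba.txt"
patterns = ["*.txt", "*.TXT", "sb?.txt", "[rs]ba.txt", "[!s]ba.txt"]

# fnmatchcase never normalises case, so "*.TXT" fails on every platform.
results = {pattern: fnmatch.fnmatchcase(name, pattern) for pattern in patterns}

for pattern, matched in results.items():
    print(pattern, matched)
```

Only "*.TXT" (wrong case) and "[!s]ba.txt" (first character must not be s) fail to match.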

Case: You have a program that processes files. The file names are entered by the user, and you need to support the same wildcards as the shell.

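# cat test1.py
#!/usr/local/bin/python2.7
#coding:utf-8
import os
import sys
from fnmatch import fnmatch

# sys.argv[1] is the directory, sys.argv[2] is the wildcard pattern
ret = [name for name in os.listdir(sys.argv[1]) if fnmatch(name, sys.argv[2])]
print ret

# python2.7 test1.py /usr/include/ *.c
['xj.c']

If the current directory contains files that match the pattern, quote it ('*.c'); otherwise the shell expands the wildcard before the script sees it.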
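For readers on Python 3, the same script might look like the sketch below. The function names match_names and main, the usage message, and the exit codes are assumptions added for illustration; fnmatch.filter is the standard-library shortcut for applying one pattern to a list of names:

```python
import fnmatch
import os
import sys

def match_names(directory, pattern):
    """Return the entries of `directory` whose names match the shell-style `pattern`."""
    return sorted(fnmatch.filter(os.listdir(directory), pattern))

def main(argv):
    """Expect argv to be [program, directory, pattern]; return an exit code."""
    if len(argv) != 3:
        print("usage: {} DIRECTORY PATTERN".format(argv[0]), file=sys.stderr)
        return 2
    print(match_names(argv[1], argv[2]))
    return 0

# Example call with an explicit argument list; a real script would pass sys.argv.
exit_code = main(["test1.py", os.curdir, "*"])
```

In a standalone script, the last line would typically become sys.exit(main(sys.argv)) under an if __name__ == "__main__": guard.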


4. Text substitution with re.sub()

In [53]: help(re.sub)
Help on function sub in module re:

sub(pattern, repl, string, count=0, flags=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.  repl can be either a string or a callable;
    if a string, backslash escapes in it are processed.  If it is
    a callable, it's passed the match object and must return
    a replacement string to be used.

Case: A text contains dates in %m/%d/%Y format, and you need to convert them all to %Y-%m-%d format.

The three groups capture the month, day, and year in that order, so the replacement has to emit them as \3-\1-\2:

In [ ]: text = "Today is 11/08/2016, next class time 11/15/2016"

In [ ]: new_text = re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', text)

In [57]: new_text
Out[57]: 'Today is 2016-11-08, next class time 2016-11-15'
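As the help text notes, repl can also be a callable. A possible use, sketched here for Python 3, is to validate each match with datetime before rewriting it, so that strings that merely look like dates are left alone. The names DATE and to_iso and the extra "13/40/2016" sample are hypothetical:

```python
import re
from datetime import datetime

# Month and day of one or two digits, followed by a four-digit year.
DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

def to_iso(match):
    """Convert one %m/%d/%Y match to %Y-%m-%d, leaving invalid dates untouched."""
    try:
        parsed = datetime.strptime(match.group(0), "%m/%d/%Y")
    except ValueError:
        return match.group(0)  # not a real calendar date, keep the original text
    return parsed.strftime("%Y-%m-%d")

text = "Today is 11/08/2016, next class time 11/15/2016, not a date: 13/40/2016"
print(DATE.sub(to_iso, text))
# Today is 2016-11-08, next class time 2016-11-15, not a date: 13/40/2016
```

The plain backreference form is shorter; the callable form is worth the extra lines only when the input may contain invalid dates.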


5. String formatting with str.format

Case: You need to build a small template engine that needs no control logic, only variables to fill in the template.

In [ ]: help(str.format)
Help on method_descriptor:

format(...)
    S.format(*args, **kwargs) -> string

    Return a formatted version of S, using substitutions from args and kwargs.
    The substitutions are identified by braces ('{' and '}').
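A minimal Python 3 sketch of such a template engine follows. The template text and the helper names render and placeholders are made up for illustration; string.Formatter().parse is the standard-library way to inspect which fields a format string expects:

```python
import string

TEMPLATE = "Hello {name}, you have {count} new messages."

def render(template, **variables):
    """Fill `template` with keyword variables using str.format."""
    return template.format(**variables)

def placeholders(template):
    """List the field names a template expects, in order of appearance."""
    # parse() yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for trailing literal text.
    return [field for _, field, _, _ in string.Formatter().parse(template) if field]

print(render(TEMPLATE, name="xj", count=3))  # Hello xj, you have 3 new messages.
print(placeholders(TEMPLATE))                # ['name', 'count']
```

A missing variable raises KeyError, which is usually the desired behaviour for a template that must be filled in completely.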




This article is from the "Xiexiaojun" blog, make sure to keep this source http://xiexiaojun.blog.51cto.com/2305291/1870832