Python: Common Functions for Text Processing

Source: Internet
Author: User

In life and at work, Python has always been a good helper. Of its many capabilities, I find text processing the one I use most. What follows is a summary of common usage. The environment is Python 3.3.

0. Basics
In Python, strings are stored in str objects. Creating a str object is simple: enclose the text in single quotes, double quotes, or triple quotes. For example:


s = 'nice'       # output: nice
s = "nice"       # output: nice
s = "Let's go"   # output: Let's go
s = '"nice"'     # output: "nice"
s = str(1)       # output: 1
s = '''nice
day'''           # output: nice
                 # output: day

In Python, \n represents a line break and \t represents a tab character.
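
As a small illustration of these escape sequences (the sample text and variable names are made up for this sketch):

```python
# '\n' starts a new line and '\t' inserts a tab character.
text = 'name\tday\nnice\tMonday'
print(text)

# splitlines() breaks the text at each line break.
lines = text.splitlines()
print(lines)

# In a raw string (r'...') the backslash stays an ordinary character,
# so r'\n' is two characters while '\n' is one.
raw = r'\n'
print(len(raw), len('\n'))
```

Raw strings are the usual way to write regular-expression patterns, where backslashes should reach the re module unchanged.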

In Python, it is easy to reference part of a str by index or by slice. For example:


s = '123456789'
s[0]     # first character. Output: 1
s[-1]    # last character. Output: 9
s[:2]    # first 2 characters. Output: 12
s[-2:]   # last 2 characters. Output: 89
s[2:-2]  # everything except the first two and the last two characters. Output: 34567

In Python, the in operator tests whether one string occurs inside another:


'nice' in 'nice day'  # output: True

Task 1. Produce strings in a certain format
In Python, the str object has a method for this: str.format(*args, **kwargs). Example:


'1 + 2 = {0}'.format(1 + 2)      # {0} is a placeholder; 0 refers to the first argument. Output: 1 + 2 = 3
'{0}:{1}'.format('nice', 'day')  # {0} is replaced with the first argument, nice, and {1} with the second, day. Output: nice:day

Actual use:

After I take photos with my mobile phone, the files are named as follows:


IMG_20130812_145732.jpg
IMG_20130812_144559.jpg

The photos are placed in different folders according to the date they were taken. A folder name looks like this:


2013-08-10
Therefore, we need to convert each photo's name into the name of the folder it belongs in. The code is as follows:


def getName(name):
    # In 'IMG_YYYYMMDD_...', the year is at [4:8], the month at [8:10], the day at [10:12].
    return '{0}-{1}-{2}'.format(name[4:8], name[8:10], name[10:12])

getName('IMG_20130812_145732.jpg')  # output: 2013-08-12

Task 2. Replace a part of the string
There are two ways to replace: the replace() method of the str object, or sub() in the re module. For example:


# replace
s = 'nice day'
s.replace('nice', 'good')  # s itself does not change; a new string is returned. Output: good day

# sub
import re
s = 'cat1 cat2 cat3 in the xxx'
re.sub('cat[0-9]', 'CAT', s)  # s itself does not change; a new string is returned. Output: CAT CAT CAT in the xxx

To use sub in the re module, you need to understand regular expressions.
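
Since the result of sub depends entirely on the pattern, a short sketch of a few patterns may help. The sample string and the variable names are only illustrative:

```python
import re

s = 'cat1 cat2 cat3 in the xxx'

# [0-9] matches exactly one digit, so every 'cat' + digit becomes 'CAT'.
upper = re.sub('cat[0-9]', 'CAT', s)

# Parentheses capture the digit; \1 in the replacement puts it back.
tagged = re.sub(r'cat([0-9])', r'CAT-\1', s)

# count limits how many matches are replaced.
first_only = re.sub('cat[0-9]', 'CAT', s, count=1)

print(upper)       # CAT CAT CAT in the xxx
print(tagged)      # CAT-1 CAT-2 CAT-3 in the xxx
print(first_only)  # CAT cat2 cat3 in the xxx
```

When the text to replace is fixed and no pattern is needed, the plain replace() method is the simpler choice.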

Task 3. Split string
Excel can export files whose fields are separated by commas (,). Such a string can be split into its fields with the built-in split method of the str object. For example:


s = 'one,two,three'
s.split(',')  # output: ['one', 'two', 'three']

Task 4. Merge strings
Besides splitting, we can combine the split fields back into a single string. For this we mainly use the join method of the str object. For example:


l = ['one', 'two', 'three']
','.join(l)  # output: one,two,three
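
Splitting and joining are often used together. A minimal sketch, assuming a made-up comma-separated line with stray spaces around the fields:

```python
line = 'one, two , three'

# split(',') cuts only at the commas, so the spaces stay in the fields.
raw_fields = line.split(',')

# strip() removes the leading and trailing whitespace of each field.
fields = [field.strip() for field in raw_fields]

# join() glues the cleaned fields back together with a new separator.
merged = ';'.join(fields)

print(raw_fields)  # ['one', ' two ', ' three']
print(fields)      # ['one', 'two', 'three']
print(merged)      # one;two;three
```

Note that join() is called on the separator string, not on the list, and that every item in the list must already be a str.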

Task 5. Integration
There are many string operations. Applied to only one or two lines of text, they do not show their power. At work, though, you may have to process whole documents, and some are too large to handle by hand. This is where Python is useful.

For example, suppose the data of a table table_1 is exported from a database in the following format:


insert into table_1 (field1, field2, field3)
values (value1, value2, value3);
...
insert into table_1 (field1, field2, field3)
values (value1, value2, value3);

The generated file is approximately 700 MB. We want to import this data into table_2 in another database. table_1 and table_2 have the same structure and differ only in name. In this case, we can write a Python script that replaces table_1 with table_2. For example:

path_in = 'table1.data'
path_out = 'table2.data'
f_in = open(path_in)
f_out = open(path_out, 'w')
# Iterate over the file object so the large file is read one line at a time.
for line in f_in:
    if 'insert into table_1 (field1, field2, field3)' in line:
        f_out.write(line.replace('table_1', 'table_2'))
    else:
        f_out.write(line)
f_in.close()
f_out.close()

Concluding remarks
Python gives you one more tool, and one more option, for daily work. Repetitive tasks can be handed over to the machine, which saves time and improves efficiency.

 

 
 
