Python string details

Source: Internet
Author: User
Tags processing text string to number

This article introduces Python strings in detail: what strings are, their main features, raw strings, Unicode strings, common string operations, and the list of built-in string methods.


Strings are sequences used to represent and store text. A Python string cannot be changed once it has been created.

A string literal is enclosed in single quotation marks ('), double quotation marks ("), or triple quotation marks (''' or """).

A triple-quoted string can span multiple lines, which makes it a convenient syntax for multi-line text. It is commonly used for documentation strings (docstrings) placed at specific locations in a file, and it is often used as a convenient multi-line comment.
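The two uses of triple quotes described above can be sketched as follows. The function name greet and the variable names are made up for illustration.

```python
# A triple-quoted literal keeps the line breaks typed inside it.
banner = """line one
line two
line three"""


def greet(name):
    """Return a short greeting for name.

    A string placed first in a function body becomes its docstring.
    """
    return "Hello, " + name


line_count = len(banner.splitlines())   # 3 lines
doc = greet.__doc__                     # the docstring text
print(line_count)
print(doc.splitlines()[0])
```

The docstring is what help(greet) displays, which is why it is preferred over an ordinary comment for describing a function.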

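Python 2 actually has three kinds of string literal:

1. Ordinary strings (str)

2. Raw strings, prefixed with r or R, for example r''. Backslashes in them are not treated as escape characters.

3. Unicode strings, prefixed with u, for example u''. In Python 2 their type, unicode, is a subclass of basestring. In Python 3, str itself is a Unicode string.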

In Python, strings are immutable sequences:

1. They are immutable

2. They support the basic sequence operations: access by position (indexing) and slicing


1. Get help:

The Code is as follows:

>>> help(str)

>>> dir(str)

>>> help(str.replace)

2. Immutable

Once a string has been created, its characters cannot be changed in place (the same as in Java). Assigning a value to a specific position is not allowed. The characters in a string are stored in left-to-right order, and the string behaves like an immutable list of characters: once it is created, every character is fixed.

This means you must create a new string if you want to change it!

The Code is as follows:

>>> s = 'spam'

>>> s[0] = 'K' # TypeError

# As in Java, build a new string and assign it to the name again

>>> s = 'K' + s[1:]
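As a runnable sketch of the same point, the following catches the TypeError and then builds a new string. The variable names t and error_message are introduced only for this example.

```python
s = 'spam'

try:
    s[0] = 'K'          # item assignment is not supported on str
except TypeError as exc:
    error_message = str(exc)

# Build a new string instead of changing the old one.
t = 'K' + s[1:]
print(t)                # Kpam
print(s)                # spam (the original object is unchanged)
```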

Raw strings

A raw string literal, such as r"abcd", is written with the prefix r or R. The prefix turns off the backslash escape mechanism: inside a raw string, \ does not introduce an escape sequence.


1. Regular expressions

Raw strings are commonly used to write regular expressions, because they reduce the number of backslashes needed.

The Code is as follows:

import re

p4search = re.compile(r'\s*')

2. System paths

Raw strings make it easy to write system paths.

The Code is as follows:

path = r'e:\book'
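The difference a raw string makes can be checked directly. This is a small sketch; the pattern and the sample text are invented for the example.

```python
import re

normal = 'e:\\book'     # backslash must be doubled in a normal literal
raw = r'e:\book'        # raw literal: the backslash is kept as typed
same = (normal == raw)  # True

# Without the r prefix, '\b' is read as a single backspace character.
pitfall = 'e:\book'
pitfall_len = len(pitfall)   # 6 characters instead of 7

# A raw pattern: commas surrounded by optional whitespace.
pattern = re.compile(r'\s*,\s*')
fields = pattern.split('a , b,c ,  d')    # ['a', 'b', 'c', 'd']
print(same, pitfall_len, fields)
```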

Unicode strings

Unicode is a standard for representing international text.

Python lets you work with Unicode text. In Python 2 you only need to add the prefix u or U before the string literal, for example u"This is a Unicode string."

Best practice: use Unicode strings when processing text files, especially when you know that the file contains text written in languages other than English.
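A minimal sketch of Unicode handling, assuming Python 3, where every str is already a Unicode string and the u prefix is accepted but optional. The sample word is chosen only for illustration.

```python
text = u'caf\u00e9'            # same value as 'café'
data = text.encode('utf-8')    # str -> bytes
back = data.decode('utf-8')    # bytes -> str

print(len(text))   # 4 characters
print(len(data))   # 5 bytes, because é takes two bytes in UTF-8
print(back == text)
```

When reading a text file, passing an explicit encoding to open() applies the same decoding step automatically.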

Common Operations

1. Basic operations

The Code is as follows:

+ : string1 + string2 # concatenation: appends the second string to the end of the first

Python does not allow other types in a + expression with a string. You must convert them to strings yourself (this is different from Java): 'abc' + str(9)

* : string * n # creates a new string that repeats the original string n times

[] : string[n] # gets the character at position n

[:] : string[n:m] # slices the string from n up to, but not including, m. string[:m] runs from the start to m, and string[n:] runs from n to the end

in : char in string # tests whether a character is in the string; returns True if it is

not in : char not in string # tests whether a character is absent from the string; returns True if it is absent

r/R : r'string' or R'string' # raw string: escape sequences are not interpreted, and every character keeps its literal meaning

len() : len(s) # returns the length of the string
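The operations listed above can be tried together in one short sketch. All variable names here are illustrative.

```python
first = 'abc'
second = 'def'

joined = first + second          # 'abcdef'
labelled = first + str(9)        # 'abc9', the number is converted first
repeated = first * 3             # 'abcabcabc'
char = joined[1]                 # 'b'
middle = joined[1:4]             # 'bcd', index 4 is excluded
head = joined[:2]                # 'ab'
tail = joined[4:]                # 'ef'
found = 'cd' in joined           # True
missing = 'x' not in joined      # True
size = len(joined)               # 6

try:
    first + 9                    # mixing str and int raises TypeError
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```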

2. Type conversion

String and number conversion

String to number: int / float / long

Number to string: str

The Code is as follows:

>>> int(42)

42

>>> int('42')

42

>>> str(42)

'42'

>>> float('42.0')

42.0

>>> str(42.0)

'42.0'

Or use the functions of the string module (Python 2 only).

s: the string to convert. base: optional; the numeric base (radix) used to interpret s.

The Code is as follows:

import string

string.atoi(s[, base]) # converts s to an integer. base defaults to 10. If base is 0, the base is inferred from the prefix, so s can look like 012 (octal) or 0x23 (hexadecimal). If base is 16, a leading 0x or 0X is accepted but not required

string.atol(s[, base]) # converts s to a long

string.atof(s) # converts s to a float

These functions belong to the Python 2 string module and were removed in Python 3; the built-in int() and float() do the same job.
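For readers on Python 3, where the string module no longer has these functions, the built-in int() accepts the same base argument. The following is a sketch of the equivalent calls, not a description of the string module itself.

```python
decimal = int('42')          # 42, base 10 by default
from_hex = int('0x23', 16)   # 35, the 0x prefix is optional with base 16
plain_hex = int('23', 16)    # 35
auto_hex = int('0x23', 0)    # 35, base 0 infers the base from the prefix
auto_oct = int('0o12', 0)    # 10, Python 3 writes octal as 0o12, not 012
as_float = float('42.0')     # 42.0
```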

String and list conversion

String to list:

The Code is as follows:

s = 'spam'

l = list(s)

l2 = "hello world".split()

List to string:

The Code is as follows:

k = ''.join(l)

Note: join raises a TypeError if the list contains items that are not strings.
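The note about non-string items can be illustrated with a short sketch. The list contents are made up; converting each item with str() first is one common way around the restriction.

```python
items = ['a', 1, 2.5]

try:
    ''.join(items)                 # fails: 1 and 2.5 are not strings
    join_failed = False
except TypeError:
    join_failed = True

# Convert every item to a string before joining.
joined = ', '.join(str(item) for item in items)   # 'a, 1, 2.5'

chars = list('spam')                # ['s', 'p', 'a', 'm']
words = 'hello world'.split()       # ['hello', 'world']
round_trip = ''.join(chars)         # 'spam'
```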

3. Modify the string

The Code is as follows:

s = s + 'A'

s = s[3:] + 'B'

s = s.replace('pl', 'pa')

a = '' # assign an empty string

del a # delete the variable itself

4. Indexing and slicing

Index: s[i]

The Code is as follows:

s[0] # the first character

s[-1] # the last character, same as s[len(s)-1]

Slice: s[i:j]

The Code is as follows:

s[1:3] # the upper bound is excluded, so this takes positions 1 to 2

s[1:] # from position 1 to the end

s[:3] # from the start to position 2

s[:-1] # from the start to the second-to-last character

s[:] # from start to end, equivalent to a copy

s[1:10:2] # positions 1 to 9 with a step of 2

s[a:b:-2] # a negative step reverses the meaning of the two bounds: the slice runs from a down to b + 1, in steps of 2

s = 'abcdefg'

s[5:1:-1] # gives 'fedc'

s[i:j:k] is the same as s[slice(i, j, k)], where slice() is a built-in function
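The slicing rules above can be verified on the sample string 'abcdefg'. The single-letter result names are only labels for this sketch.

```python
s = 'abcdefg'

a = s[1:3]        # 'bc', index 3 is excluded
b = s[1:]         # 'bcdefg'
c = s[:3]         # 'abc'
d = s[:-1]        # 'abcdef'
e = s[:]          # 'abcdefg'
f = s[1:6:2]      # 'bdf', every second character from index 1 to 5
g = s[5:1:-1]     # 'fedc', walks backwards from index 5 down to 2
h = s[::-1]       # 'gfedcba', a common idiom for reversing
i = s[slice(1, 6, 2)]   # same result as s[1:6:2]
```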

String formatting

Only basic string formatting is described here; an extended section will cover the rest. The common conversion codes are:

%c single character

%d decimal integer

%o octal integer

%s string

%x hexadecimal integer, lowercase letters

%X hexadecimal integer, uppercase letters

The Code is as follows:

>>> s = "so %s a day!"

>>> s % 'beautiful'

'so beautiful a day!'

>>> '{0} is {1}'.format('A', 'B')

'A is B'

>>> template = "{0}, {1} and {2}"

>>> template.format('A', 'B', 'C')

'A, B and C'
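Each conversion code listed in this section can be exercised with a one-line example. The values are arbitrary and chosen so the effect of each code is visible.

```python
a = '%c' % 65            # 'A', an integer is treated as a character code
b = '%d' % 42            # '42'
c = '%o' % 8             # '10'
d = '%s' % 'text'        # 'text'
e = '%x' % 255           # 'ff'
f = '%X' % 255           # 'FF'
g = '%s is %d' % ('x', 3)                        # several values go in a tuple
h = '{0} and {1}, {0} again'.format('A', 'B')    # a position may be reused
```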

Built-in function list

[String methods are the No. 1 tool for text processing in Python]

string.capitalize()

Returns a copy with the first character capitalized and the remaining characters lowercased.

string.center(width[, fillchar])

Returns the original string centered in a string of length width, padded with fillchar (a space by default).

string.count(str, beg=0, end=len(string))

Returns the number of non-overlapping occurrences of the substring str. The search range can be limited with beg and end.

string.decode(encoding='utf-8', errors='strict')

Decodes the string using encoding. By default a decoding error raises a ValueError (more precisely its subclass UnicodeError), unless errors is 'ignore' or 'replace'. In Python 3 this method belongs to bytes rather than str.

string.encode(encoding='utf-8', errors='strict')

Encodes the string using encoding. errors is handled as in decode.

string.endswith(suffix, beg=0, end=len(string))

Returns True if the string (or the range from beg to end) ends with suffix.

string.expandtabs(tabsize=8)

Replaces each tab in the string with spaces. The default tab size is 8.

string.find(str, beg=0, end=len(string))

Checks whether str occurs in the string. Returns the index of the first match, or -1 if str is not found.

string.index(str, beg=0, end=len(string))

Same as find, except that a ValueError is raised if str is not found.
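The difference between find and index is easiest to see on a miss. A short sketch with an invented sample string:

```python
s = 'hello world'

pos = s.find('o')            # 4, the first match
later = s.find('o', 5)       # 7, the search starts at index 5
absent = s.find('xyz')       # -1, no exception on a miss

try:
    s.index('xyz')           # same search, but a miss raises ValueError
    index_raised = False
except ValueError:
    index_raised = True

last = s.rfind('o')          # 7, searches from the right
```

Because -1 is also a valid index (the last character), the result of find should be compared with -1 before it is used for indexing.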

string.isalnum()

Returns True if the string contains at least one character and all characters are letters or digits. Use it to check whether a string contains only 0-9, A-Z and a-z.

string.isalpha()

Returns True if the string contains at least one character and all characters are letters.

string.isdecimal()

Returns True if the string contains only decimal digits.

string.isdigit()

Returns True if the string contains only digits.

string.islower()

Returns True if the string contains at least one cased character and all cased characters are lowercase.

string.isnumeric()

Returns True if the string contains only numeric characters.

string.isspace()

Returns True if the string contains only whitespace.

string.istitle()

Returns True if the string is titlecased, that is, every word starts with an uppercase letter followed by lowercase letters.

string.isupper()

Returns True if the string contains at least one cased character and all cased characters are uppercase.
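A few quick checks show how the predicate methods above behave, including the edge cases of empty strings and uncased characters. The sample values are arbitrary.

```python
alnum = 'abc123'.isalnum()            # True
alnum_space = 'abc 123'.isalnum()     # False, the space is not alphanumeric
alpha = 'abc'.isalpha()               # True
digit = '123'.isdigit()               # True
empty = ''.isdigit()                  # False, at least one character is required
lower = 'abc1'.islower()              # True, the digit is uncased and ignored
upper = 'ABC!'.isupper()              # True
space = ' \t\n'.isspace()             # True
title = 'Hello World'.istitle()       # True
not_title = 'Hello world'.istitle()   # False
```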

string.join(seq)

Using string as the separator, merges all elements of seq into a new string. The separator is inserted between every two adjacent elements of seq.

string.ljust(width)

Returns the original string left-aligned and padded with spaces to the length width.

string.lower()

Converts all uppercase letters in the string to lowercase.

string.lstrip()

Removes the whitespace on the left.

string.partition(str)

A combination of find and split. Splits the string at the first occurrence of str and returns the 3-tuple (pre_str, str, after_str). If str is not found, pre_str is the whole string and the other two items are empty.

string.replace(str1, str2, num=string.count(str1))

Replaces str1 with str2. If num is given, at most num occurrences are replaced. This can be used as a simple template mechanism.
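A short sketch of partition and replace. The sample strings and the $name placeholder are invented for illustration.

```python
before, sep, after = 'key=value=more'.partition('=')
# before == 'key', sep == '=', after == 'value=more'

miss = 'no separator'.partition('=')
# ('no separator', '', '')

all_replaced = 'a-b-c-d'.replace('-', '+')       # 'a+b+c+d'
first_two = 'a-b-c-d'.replace('-', '+', 2)       # 'a+b+c-d'

# Simple templating: substitute a placeholder with a value.
message = 'Hello, $name!'.replace('$name', 'Ann')   # 'Hello, Ann!'
```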

string.rfind(str, beg=0, end=len(string))

Same as find, but searches from the right.

string.rindex(str, beg=0, end=len(string))

Same as index, but searches from the right.

string.rjust(width)

Returns the original string right-aligned and padded with spaces to the length width.

string.rpartition(str)

Same as partition, but searches from the right.

string.rstrip([chars])

Removes the whitespace on the right, including line breaks, and returns the processed string.

string.split(str="", maxsplit=string.count(str))

Splits the string using str as the delimiter and returns a list of the pieces. maxsplit limits the number of splits. If str is omitted, the string is split on whitespace.

string.splitlines([keepends])

Splits the string at line boundaries and returns a list of lines. The line breaks are kept in the result only if keepends is true.
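The behaviour of split and splitlines can be checked with a few sample inputs, which are made up for this sketch.

```python
words = 'a b  c'.split()            # ['a', 'b', 'c'], runs of whitespace collapse
fields = 'a,b,,c'.split(',')        # ['a', 'b', '', 'c'], empty fields are kept
limited = 'a,b,c,d'.split(',', 2)   # ['a', 'b', 'c,d'], at most 2 splits

text = 'one\ntwo\r\nthree\n'
lines = text.splitlines()           # ['one', 'two', 'three']
kept = text.splitlines(True)        # ['one\n', 'two\r\n', 'three\n']
```

Note that splitlines does not produce a trailing empty item for the final line break, whereas text.split('\n') would.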

string.startswith(obj, beg=0, end=len(string))

Returns True if the string (or the range from beg to end) starts with the substring obj.

string.strip([obj])

Performs both lstrip and rstrip on the string.

string.swapcase()

Returns the string with the case of every letter swapped: uppercase letters become lowercase and lowercase letters become uppercase.

string.title()

Returns a titlecased copy: the first letter of every word is uppercase and the other letters are lowercase.

string.translate(table[, deletechars])

Returns a copy of the string in which the characters listed in deletechars have been removed and the remaining characters have been mapped through the translation table. In Python 3, translate takes only the table, which can be built with str.maketrans.
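A minimal sketch of translate, assuming Python 3, where the table is built with str.maketrans and the characters to delete are passed as its third argument rather than to translate itself.

```python
# Map a->x, b->y, c->z and delete every '-'.
table = str.maketrans('abc', 'xyz', '-')
result = 'a-b-c-d'.translate(table)        # 'xyzd'
print(result)
```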

string.upper()

Converts all lowercase letters in the string to uppercase.

string.zfill(width)

Returns a string of length width in which the original string is right-aligned and padded with 0 on the left.

len(string)

Returns the length of the string.
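The padding and case methods from the list above can be compared side by side. The inputs are arbitrary.

```python
left = 'ab'.ljust(5)              # 'ab   '
right = 'ab'.rjust(5)             # '   ab'
mid = 'ab'.center(6, '*')         # '**ab**'
zeros = '42'.zfill(5)             # '00042'
signed = '-42'.zfill(5)           # '-0042', the sign stays in front
tabs = 'a\tb'.expandtabs(4)       # 'a   b'
cap = 'hello WORLD'.capitalize()  # 'Hello world'
ttl = 'hello WORLD'.title()       # 'Hello World'
swp = 'Hello'.swapcase()          # 'hELLO'
```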

Best practices

1. Compute the length once when it is used in a loop condition

The Code is as follows:

while i < len(stri):
    i += 1

# Better: compute the length once, outside the loop

size = len(stri)

while i < size:
    i += 1

2. String appending

The Code is as follows:

l = ['A', 'B']

result = ''

for i in l:
    result += i

# Better: join the whole list in one call

result = ''.join(l)


1. Escape characters

Frequently used:

\n line feed, \\ backslash

\t tab, \' single quote

\r carriage return, \" double quote
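Each escape sequence stands for a single character, which a few length checks make visible. This is an illustrative sketch with made-up sample strings.

```python
newline = len('\n')          # 1, the escape is a single character
backslash = len('\\')        # 1
quote = 'It\'s'              # the same string as "It's"
double = "say \"hi\""        # the same string as 'say "hi"'
rows = 'a\tb\nc\td'.split('\n')   # two rows, each with a tab inside
raw_len = len(r'\n')         # 2, a raw string keeps the backslash
```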

To be expanded later

Detailed description of string Encoding

String formatting

Regular Expression

Common string-related modules (serialization, text wrapping, etc.)
