A string is an ordered collection of characters used to store and represent text-based information.
Common string constants and expressions
t1 = ''                     empty string
t2 = "diege's"              double quotes
t3 = """..."""              triple-quoted block
t4 = r'\temp\diege'         raw string: suppresses escapes, so \temp\diege prints in full, with no tab
t5 = u'diege'               Unicode string
t1 + t2                     concatenation
t1 * 3                      repetition
t2[i]                       index
t2[i:j]                     slice
len(t2)                     length
"a %s parrot" % type        string formatting
t2.find('ie')               string method call: search
t2.rstrip()                 string method call: remove trailing whitespace
t2.replace('ie', 'efk')     string method call: replace
t2.split(',')               string method call: split
t2.isdigit()                string method call: content test
t2.lower()                  string method call: convert uppercase to lowercase
for x in t2:                iteration
'ie' in t2                  membership
I. String constants
1. Single-quoted and double-quoted strings are the same
Python automatically concatenates adjacent string literals in any expression. You can also put a + operator between them, which makes the concatenation explicit.
>>> t2 = "Test " 'for ' "diege"
>>> t2
'Test for diege'
>>> t2 = "Test " + 'for ' + "diege"
>>> t2
'Test for diege'
You cannot join strings by putting commas between them; that creates a tuple instead of a string. Python prints strings in single quotes unless the string itself contains a single quote.
You can also escape an embedded quote with a backslash.
>>> t2 = "Test " + 'for ' + "diege's"
>>> t2
"Test for diege's"
>>> 'diege\'s'
"diege's"
2. Escape sequences represent special bytes
\newline     ignored (line continuation)
\\           backslash (keeps one \)
\'           single quote (keeps ')
\"           double quote (keeps ")
\n           newline (line feed)
\f           form feed
\t           horizontal tab
\v           vertical tab
\b           backspace
\a           bell
\r           carriage return
\N{id}       Unicode database ID
\uhhhh       Unicode 16-bit hexadecimal value
\Uhhhhhhhh   Unicode 32-bit hexadecimal value
\xhh         hexadecimal value
\ooo         octal value
\0           null (does not end the string)
\other       not an escape (the backslash is kept)
3. Raw strings suppress escapes
myfile = open('C:\new\text.data', 'w')
This call tries to open a file named C:(newline)ew(tab)ext.data instead of the expected one.
The fix is a raw string. If the letter r (uppercase or lowercase) appears just before the opening quote of the string, the escape mechanism is turned off.
myfile = open(r'C:\new\text.data', 'w')
Another way is to escape each backslash:
myfile = open('C:\\new\\text.data', 'w')
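The three spellings can be compared directly without opening any file. This is a minimal sketch in Python 3 syntax (the notes themselves use Python 2); the path is the same illustrative one used above and the variable names are made up for the example.

```python
# Three spellings of the same Windows-style path (illustrative only).
plain = 'C:\new\text.data'      # \n and \t are interpreted as escapes
raw = r'C:\new\text.data'       # raw string: backslashes are kept literally
doubled = 'C:\\new\\text.data'  # each backslash escaped by hand

print(repr(plain))              # shows the embedded newline and tab
print(raw == doubled)           # the raw and doubled forms are the same string
print(len(plain), len(raw))     # the plain form is two characters shorter
```

The raw form is usually the easier one to read when a literal contains many backslashes.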
4. Triple quotes write multi-line string blocks
A block string is a convenient syntax for writing multi-line text data.
This form starts with three quotes (either single or double), is followed by any number of lines of text, and ends with the same three quotes it started with. Single and double quotes embedded in the text may be escaped, but do not have to be. Triple-quoted strings are also commonly used during development as a quick hack to disable code. If you want a few lines not to run for now, and to run again later, simply put triple quotes before and after them.
x = 10
"""
import os
print os.getcwd()
"""
y = 19
5. Unicode strings encode larger character sets
Unicode strings are sometimes called "wide" strings, because each character may occupy more than one byte in memory.
Unicode strings are typically used in applications that support internationalization (i18n).
Write a Unicode string by adding the letter u (uppercase or lowercase) before the opening quote.
>>> t9 = u'diege'    # this syntax creates a Unicode string object
>>> t9
u'diege'
>>> type(t9)
<type 'unicode'>
Python allows expressions to mix Unicode strings and normal strings freely, and converts the result of a mixed expression to Unicode.
Unicode strings can also be concatenated, indexed, sliced, and matched with the re module, and they cannot be changed in place, just like normal strings.
Python treats normal strings and Unicode strings alike.
To convert between normal strings and Unicode strings, use the built-in str and unicode functions.
>>> str(u'diege')
'diege'
>>> unicode('diege')
u'diege'
Unicode is designed to handle multibyte characters, so you can use the special "\u" and "\U" escapes to encode values larger than 8 bits.
u'ab\x20cd'
The sys module includes calls for getting and setting the default Unicode encoding scheme (which is usually ASCII).
Raw and Unicode strings can be combined.
II. Strings in practice
1. Basic operations
String length: the built-in function len()
>>> len('test')
4
String concatenation: +
>>> 'test' + 'diege'
'testdiege'
Python does not allow a + expression to mix strings and numbers.
String repetition: *
>>> 'test' * 3
'testtesttest'
Repetition is handy for printing a separator line:
>>> print '-' * 80
Iteration: use a for statement to step through a string
>>> myname = 'diege'
>>> for s in myname: print s
...
d
i
e
g
e
The for loop assigns each element of a sequence to a variable in turn and runs one or more statements for each element.
Membership: use the in operator to test membership.
>>> 'g' in myname
True
>>> 'k' in myname
False
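The basic operations can be collected into one short script. This is a sketch in Python 3 syntax, so print is a function rather than the Python 2 statement used in these notes; the value of myname is the sample string from the session above.

```python
myname = 'diege'  # sample value, matching the examples above

print(len(myname))                 # length
print(myname + '!')                # concatenation builds a new string
print('-' * 20)                    # repetition, handy for separator lines
letters = [ch for ch in myname]    # iteration visits one character at a time
print(letters)
print('g' in myname, 'k' in myname)  # membership tests

# Mixing str and int with + raises TypeError; convert explicitly instead.
try:
    myname + 1
except TypeError as exc:
    print('TypeError:', exc)
labeled = myname + str(1)
print(labeled)
```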
2. Indexing and slicing
A string is an ordered collection of characters, so a character is fetched by its position with an index expression: give the numeric offset of the desired element in square brackets after the string.
Python offsets start at 0. Negative offsets are supported.
Indexing
>>> t = 'diege'
>>> t[0], t[-2]
('d', 'g')
Slicing
t[start:end] includes the start position but not the end position
>>> t[1:3], t[1:], t[:3], t[-1], t[0:-1]
('ie', 'iege', 'die', 'e', 'dieg')
>>> t[:]
'diege'
Summary:
* Indexing (S[i]) fetches the element at a specific offset.
--The first element is at offset 0
--(S[0]) fetches the first element
--A negative offset counts backward from the end, that is, from the right
--(S[-2]) fetches the second-to-last element (just like S[len(S)-2])
* Slicing (S[i:j]) extracts the corresponding section of a sequence
--The right bound is not included
--Bounds that are omitted default to 0 and the length of the sequence, as in S[:]
--(S[1:3]) fetches the elements from offset 1 up to, but not including, offset 3
--(S[1:]) fetches the elements from offset 1 to the end
--(S[:3]) fetches the elements from offset 0 up to, but not including, offset 3
--(S[:-1]) fetches the elements from offset 0 up to, but not including, the last element
--(S[:]) fetches the elements from offset 0 to the end, which effectively makes a top-level copy of S
The copy has the same value but is a distinct object in memory. That is not very useful for immutable objects such as strings, but it is useful for objects that can be changed in place,
such as lists.
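The summary above can be checked with a few lines of code. This is a sketch in Python 3 syntax; the names s, items and copy are chosen only for the example.

```python
s = 'diege'

print(s[0], s[-2])                 # first and second-to-last characters
print(s[-2] == s[len(s) - 2])      # a negative offset counts from the end
print(s[1:3], s[1:], s[:3], s[:-1])  # the right bound is never included

# A full slice copies a list; the copy is a separate object,
# so changing it leaves the original untouched.
items = ['a', 'b', 'c']
copy = items[:]
copy[0] = 'z'
print(items, copy)
```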
3. Extended slicing: a third limit, the step
Full form: X[i:j:k] means: extract the items of X from offset i up to offset j-1, taking every k-th item. The third limit, k, defaults to 1.
Example
>>> s = 'abcdefghijk'
>>> s[1:10]
'bcdefghij'
>>> s[1:10:2]
'bdfhj'
You can also use a negative step, as in this slice expression:
>>> "hello"[::-1]
'olleh'
With a negative step, the meaning of the two bounds is effectively reversed.
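A short sketch of stepped slices, in Python 3 syntax. The variable names are illustrative; the built-in slice object is used only to show that the bracket notation and the explicit object select the same items.

```python
s = 'abcdefghijk'

every_other = s[1:10:2]     # offsets 1, 3, 5, 7, 9
reversed_s = s[::-1]        # a negative step walks right to left
backwards = s[5:1:-1]       # offsets 5, 4, 3, 2: the bounds are reversed

print(every_other, reversed_s, backwards)

# The same selection written with an explicit slice object.
same = s[slice(1, 10, 2)] == every_other
print(same)
```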
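Slicing is also handy for dropping the script name from the command-line arguments.
echo.py contents:
import sys
print sys.argv
# python echo.py -a -b -c
['echo.py', '-a', '-b', '-c']
echo.py contents:
import sys
print sys.argv[1:]
# python echo.py -a -b -c
['-a', '-b', '-c']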
4. String conversion tools
One of Python's design mottos is to refuse the temptation to guess.
Python does not allow a number and a string to be added, even if the string looks like a number.
>>> '55' + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects
Because + can mean either addition or concatenation, the expression is ambiguous, and Python treats it as an error rather than guessing.
Numbers read from script files and user interfaces usually arrive as strings.
The solution is to use conversion tools to turn the string into a number, or the number into a string, before the operation.
Converting numbers to strings
>>> str(55)
'55'
>>> `55`    # backquotes are equivalent to repr() in Python 2
'55'
>>> t = repr(55)
>>> type(t)
<type 'str'>
Converting strings to numbers
>>> int('66')
66
>>> d = int('66')
>>> type(d)
<type 'int'>
These operations create new objects.
>>> s = '55'
>>> x = 1
>>> int(s) + x
56
Similar built-in functions convert floating-point numbers to strings and strings to floating-point numbers.
>>> str(3.1415), float("1.5")
('3.1415', 1.5)
>>> text = '1.234E-10'
>>> float(text)
1.234e-10
The built-in eval function runs a string containing Python expression code, so it can convert a string into any kind of object.
The functions int and float convert only to numbers.
**String code conversion**
A single character can also be converted to its ASCII code by passing it to the built-in ord function. This function actually returns the integer value of the byte that represents the character in memory.
The built-in chr function converts an integer code back into a character.
>>> ord('t')
116
>>> chr(116)
't'
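The conversions in this section fit into one small script. This is a sketch in Python 3 syntax; the names code and shifted are made up for the example.

```python
total = int('66') + 1           # string -> int, then arithmetic
joined = str(55) + '1'          # int -> string, then concatenation
small = float('1.234E-10')      # string -> float
print(total, joined, small)
print(repr('55'), str('55'))    # repr keeps the quotes, str does not

# ord and chr convert between a character and its integer code.
code = ord('t')
shifted = chr(code + 1)         # the character that follows 't'
print(code, chr(code), shifted)
```

Because ord returns an ordinary integer, arithmetic on character codes, as in the last lines, needs no extra tools.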
5. Changing strings
Strings are immutable sequences: you cannot change a string in place (for example, by assigning to an index).
To change a string, use tools such as concatenation and slicing to build a new string and, if needed, assign the result back to the string's original variable name.
>>> s = 'diege'
>>> s = 'My name is ' + s
>>> s
'My name is diege'
This change does not alter the original object. It creates a new string object and binds the original variable name to it.
>>> t = 'diege'
>>> s = t[:3] + 'bad' + t[:-1]
>>> s
'diebaddieg'
Each time a string is "changed", a new string object is produced.
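The following sketch, in Python 3 syntax, shows both halves of that statement: in-place assignment fails, and the rebuilt string is a different object from the original. The name original is introduced only to keep a reference for the comparison.

```python
original = 'diege'
s = original

try:
    s[0] = 'D'                 # item assignment is not supported by str
except TypeError as exc:
    print('TypeError:', exc)

s = 'D' + s[1:]                # build a new string and rebind the name
print(s, original, s is original)
```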
III. String formatting
To format a string:
1) Put the format string on the left of the % operator. It contains one or more embedded conversion targets, each starting with % (for example, %d).
2) Put an object (or several, in parentheses) on the right of the % operator. Python inserts these objects into the format string on the left at the conversion target (or targets).
>>> name = 'diege'
>>> "My name is:%s" % name
'My name is:diege'
>>> name = 'diege'
>>> age = 18
>>> "My name is:%s my age is %d" % (name, age)
'My name is:diege my age is 18'
>>> "%d %s %d you" % (1, 'diege', 4)
'1 diege 4 you'
>>> "%s--%s--%s" % (42, 3.1415, [1, 2, 4])
'42--3.1415--[1, 2, 4]'
This example inserts three values: an integer, a floating-point number, and a list object. Notice that all the targets on the left are %s, which means each value is converted to a string. Since any object can be converted to a string (the conversion used when printing), every object type works with the %s conversion code. Because of this, unless you need special formatting, %s is the only code you have to remember.
Formatting always returns a new string as its result instead of changing the string on the left. Because strings are immutable, it has to work this way. If you want to keep the result, assign it to a variable name.
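A compact sketch of the % operator in Python 3 syntax, where it behaves the same way for these cases. The variable names are illustrative.

```python
name, age = 'diege', 18

one = 'My name is: %s' % name                        # a single value needs no tuple
two = 'My name is: %s, my age is %d' % (name, age)   # several values go in a tuple
mixed = '%s--%s--%s' % (42, 3.1415, [1, 2, 4])       # %s accepts any object

print(one)
print(two)
print(mixed)

# A tuple that should be formatted as one value must be wrapped in another tuple.
wrapped = '%s' % ((1, 2),)
print(wrapped)
```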
1. More advanced string formatting
Python string formatting supports all the usual printf format codes from C (but it returns the result instead of displaying it, as printf does). Some of the codes in the table offer alternative ways to format the same type.
Code    Meaning
%s      string (or any object)
%r      same as s, but uses repr instead of str
%c      character
%d      decimal (integer)
%i      integer
%u      unsigned (integer)
%o      octal integer
%x      hexadecimal integer
%X      same as x, but prints uppercase
%e      floating-point exponent
%E      same as e, but prints uppercase
%f      floating-point decimal
%g      floating-point e or f
%G      floating-point E or f
%%      literal %
The conversion targets on the left side of the expression support several conversion operations with a fairly strict syntax. The general structure of a conversion target looks like this:
%[(name)][flags][width][.precision]code
Between the % and the code you can give a dictionary key in parentheses, flags, a width, and a precision.
The - flag left-justifies the value.
The + flag always shows the numeric sign; right alignment is the default.
>>> x = 1234
>>> res = "test:...%d...%-6d...%06d" % (x, x, x)
>>> res
'test:...1234...1234  ...001234'
>>> res = "test:...%d...%6d...%-06d" % (x, x, x)
%6d     right-aligned, width 6, padded with spaces
%-06d   left-aligned, width 6; the - flag overrides 0, so the padding is spaces, not zeros
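The effect of each flag is easiest to see with brackets around the field. This is a sketch in Python 3 syntax; the brackets are only there to make the padding visible.

```python
x = 1234

plain = '[%d]' % x          # no width
right = '[%6d]' % x         # right-aligned, padded with spaces
left = '[%-6d]' % x         # left-aligned, padded with spaces
zeros = '[%06d]' % x        # zero-padded
both = '[%-06d]' % x        # - overrides 0: left-aligned, space-padded
signed = '[%+d]' % x        # explicit sign
fixed = '[%.2f]' % 3.14159  # two digits after the decimal point
bases = '[%x] [%X] [%o]' % (255, 255, 8)

for text in (plain, right, left, zeros, both, signed, fixed, bases):
    print(text)
```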
2. Dictionary-based string formatting
String formatting also allows the conversion targets on the left to refer to keys in a dictionary on the right and fetch the corresponding values.
>>> "%(n)d %(x)s" % {"n": 1, "x": 'diege'}
'1 diege'
(n) and (x) refer to keys in the dictionary on the right and fetch their values. Programs that generate text such as HTML or XML often use this technique.
>>> reply = """
... Greetings.
... Hello %(name)s!
... Your age is %(age)s
... """
>>> values = {'name': 'diege', 'age': 18}
>>> print reply % values
Greetings.
Hello diege!
Your age is 18
This trick is often combined with the built-in function vars, which returns a dictionary containing all the variables that exist at the place it is called.
>>> name = 'diege'
>>> age = '18'
>>> vars()
{'S': 'diebaddieg', 'res': 'test:...1234...1234...1234', 'D': 66, '__builtins__': <module '__builtin__' (built-in)>, 'text': '1.234E-10', 'age': '18',
'myname': 'diege', '__package__': None, 's': 'e', 'values': {'age': 18, 'name': 'diege'}, 'T': 'diege', 'x': 1234,
'reply': '\nGreetings.\nHello %(name)s!\nYour age is %(age)s\n', '__name__': '__main__', '__doc__': None, 'name': 'diege'}
>>> "My name is %(name)s %(age)s" % vars()
'My name is diege 18'
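The same idea works inside a function, where vars() returns the local variables. This is a sketch in Python 3 syntax; the template text and the function name describe are hypothetical, chosen only for the example.

```python
template = 'Hello %(name)s! Your age is %(age)s.'

values = {'name': 'diege', 'age': 18}
from_dict = template % values
print(from_dict)

def describe(name, age):
    # Inside a function, vars() returns the local variables as a dictionary,
    # so the keys 'name' and 'age' are found automatically.
    return template % vars()

from_locals = describe('kelly', 20)
print(from_locals)
```

Passing an explicit dictionary is usually easier to follow than vars(), since the reader can see exactly which keys are available.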
IV. String methods
In addition to expression operators, strings provide a set of methods that implement more complex text-processing tasks. A method is a function that is associated with a particular object. Technically, methods are attributes attached to objects, and these attributes happen to be callable functions. In Python, different object types have different methods, and string methods work only on string objects. A function is a package of code, and a method call combines two operations at once: an attribute fetch and a function call.
Attribute fetch
An expression of the form object.attribute means "fetch the value of attribute in object."
Call expression
An expression of the form function(arguments) means "call the code of function, passing zero or more comma-separated argument objects, and return the function's result value."
Putting the two together lets us call an object's method. The method call expression object.method(arguments) is evaluated from left to right: Python first fetches the method from the object and then calls it, passing in the arguments. If the method computes a result, it comes back as the result of the whole method call expression.
Most objects have callable methods, and all of them are accessed with the same method-call syntax. To call an object's method, you need an existing object.
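The two steps of a method call can be separated to make them visible. This is a sketch in Python 3 syntax; the name bound is illustrative.

```python
s = 'diege'

bound = s.upper               # step 1: the attribute fetch yields a bound method
print(callable(bound))
shouted = bound()             # step 2: the call runs it on s
print(shouted)

# getattr performs the same attribute fetch with the name given as a string.
swapped = getattr(s, 'replace')('ie', 'IE')
print(swapped)

# Calling through the type makes the object argument explicit.
print(str.upper(s) == s.upper())
```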
1. String method examples: changing strings
String methods
>>> dir(s)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__',
'__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_formatter_field_name_split', '_formatter_parser',
'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace',
'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split',
'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
To see how a method is used, pass the method itself (without calling it) to the help() function.
>>> help(s.isupper)
1) Replace
Replace the characters at offsets 3 and 4
>>> s = 'namediege'
>>> s = s[:3] + 'XX' + s[5:]
>>> s
'namXXiege'
To replace a substring wherever it occurs, use the replace method.
>>> s = 'AABBCCDD'
>>> s = s.replace('BB', 'GG')
>>> s
'AAGGCCDD'
The first argument to replace is the original substring (of any length), and the second is the string (of any length) that replaces it.
2) Find
The find method returns the offset at which the substring occurs (by default searching from the beginning), or -1 if it is not found.
3) Splitting into a list: list()
The list function
>>> s = 'diege'
>>> list(s)
['d', 'i', 'e', 'g', 'e']
It breaks the string up into a list of characters.
4) Joining: the join() method
>>> s = 'diege'
>>> list(s)
['d', 'i', 'e', 'g', 'e']
>>> t = list(s)
>>> t
['d', 'i', 'e', 'g', 'e']
>>> t[0] = 'P'
>>> t[3] = 'G'
>>> t
['P', 'i', 'e', 'G', 'e']
>>> s = ''.join(t)    # join the character list into a string, using the empty string as the separator
>>> s
'PieGe'
>>> y = '|'.join(t)   # join the list into a string, using | as the separator
>>> y
'P|i|e|G|e'
>>> 'X'.join(['eggs', 'toast', 'moa'])
'eggsXtoastXmoa'
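The methods in this subsection can be exercised together. This is a sketch in Python 3 syntax with illustrative names; the optional third argument to replace, which limits the number of replacements, is part of the standard str API.

```python
s = 'aabbccbb'

all_replaced = s.replace('bb', 'GG')       # every occurrence is replaced
first_only = s.replace('bb', 'GG', 1)      # an optional count limits the replacements
found, missing = s.find('cc'), s.find('zz')  # offset of the match, or -1
print(all_replaced, first_only, found, missing)

# To change individual characters, go through a list and join it back.
chars = list('diege')
chars[0] = 'P'
chars[3] = 'G'
joined = ''.join(chars)
piped = '|'.join(chars)
print(joined, piped)
```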
2. String method examples: parsing text
1) Parsing text with slices
>>> line = "AAA BBB CCC"
>>> cols1 = line[0:3]
>>> cols2 = line[8:]
>>> cols1
'AAA'
>>> cols2
'CCC'
The fields appear at fixed offsets, so they can be sliced out of the original string. This technique counts as parsing as long as the data you need sits at fixed offsets.
2) Extracting components with the split method
Use the split method to extract components when the data you need is not at fixed offsets. It works no matter where the data appears in the string.
>>> line = 'AAA BBB CCC'
>>> cols = line.split()
>>> cols
['AAA', 'BBB', 'CCC']
The string split method chops a string into a list of substrings around a delimiter. The default delimiter is whitespace: the string is split at runs of one or more spaces, tabs, or newlines, and the result is the list of substrings.
>>> names = 'diege,kelly,lily'
>>> names.split()
['diege,kelly,lily']
>>> names.split(',')
['diege', 'kelly', 'lily']
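A few more split cases, as a sketch in Python 3 syntax. The sample strings are made up for the example; maxsplit is the standard optional second argument of str.split.

```python
line = 'aaa   bbb\tccc\n'
fields = line.split()              # any run of whitespace counts as one separator
print(fields)

names = 'diege,kelly,lily'
by_comma = names.split(',')        # explicit separator
first_cut = names.split(',', 1)    # optional maxsplit limits the number of splits
print(by_comma, first_cut)

# With an explicit separator, adjacent separators yield empty strings.
gaps = 'a,,b'.split(',')
print(gaps)
```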
3. Other common string methods in practice
Other string methods have more specialized roles:
stripping whitespace at the end of a line, converting case, and testing for a substring at the end.
>>> line = 'The python is running!\n'
>>> line.rstrip()
'The python is running!'
>>> line.upper()
'THE PYTHON IS RUNNING!\n'
>>> line.isalpha()
False
>>> line.endswith('ing!\n')
True
>>> line.find('ing') != -1
True
Note that string methods do not support patterns. For pattern-based text processing, use the re module in the Python standard library. For simple tasks, string methods are sometimes faster than the tools of the re module.
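The contrast between fixed substrings and patterns can be sketched as follows, in Python 3 syntax. The regular expression is an illustrative choice (any word ending in "ing"), not something taken from these notes.

```python
import re

line = 'The python is running!\n'

# Plain string methods are enough for fixed substrings.
ends_ok = line.rstrip().endswith('ing!')
has_ing = line.find('ing') != -1
print(ends_ok, has_ing)

# A pattern, here "any word ending in ing", needs the re module.
match = re.search(r'\b(\w+ing)\b', line)
word = match.group(1) if match else None
print(word)
```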
4. The original string module
The original string module is a module named string that contains functions roughly equivalent to the current set of string methods. Today you should use only the string methods, not the original string module.
V. General type categories
1. Types in the same category share a set of operations
A string is an immutable sequence: it cannot be changed in place, and it is a positionally ordered collection. In Python, all sequence types support the sequence operations: concatenation, indexing, iteration, and so on. There are three type categories (and sets of operations) in Python:
* Numbers
Support addition, multiplication, and so on.
* Sequences
Support indexing, slicing, concatenation, and so on.
* Mappings
Support indexing by key, and so on.
For example, for any sequence objects X and Y:
X+Y creates a new sequence object containing the contents of both operands
X*N creates a new sequence object containing N copies of the contents of X
In other words, these operations work the same way on any sequence object, including strings, lists, tuples, and some user-defined object types. The type of the objects tells Python which kind of task to perform.
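One function written with + and * works unchanged across the sequence types. This is a sketch in Python 3 syntax; repeat_and_extend is a hypothetical helper named only for this example.

```python
def repeat_and_extend(x, y, n):
    # Works for any two sequences of the same type: str, list, or tuple.
    return (x + y) * n

as_str = repeat_and_extend('ab', 'c', 2)
as_list = repeat_and_extend([1, 2], [3], 2)
as_tuple = repeat_and_extend((1,), (2,), 3)
print(as_str, as_list, as_tuple)
```

The result always has the same type as the operands, because the operand type decides which concatenation and repetition are performed.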
2. Mutable types can be changed in place
Immutability is a constraint that deserves special attention. If an object is immutable, its value cannot be changed in place. The workaround is to run code that creates a new object containing the new value. Immutable types provide a degree of integrity, because they guarantee that an object will not be changed by another part of the program.
Mutable types can be changed in place, so the original data can be modified as needed.
A short summary of methods and expressions:
Methods are type-specific, not generic.
Expressions are generic and work across many types. For example, slicing works on any sequence type: strings, lists, and tuples.
Python learning notes (IV): strings in Python