First, the shelve module
shelve (good to know) is a higher level of encapsulation built on top of pickle. When using it, work only with the file name you passed to shelve.open; you can ignore the extra files that different platforms generate automatically.
The intermediate format of json is str, so write the file in text mode ('w').
The intermediate format of pickle is bytes, so write the file in binary mode ('wb').
json is the more commonly used of the two for serialization.
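The difference between the two intermediate formats can be sketched as follows. The sample dictionary and the file names data.json and data.pkl are made up for illustration, and a temporary directory is used so nothing is left on disk.

```python
import json
import os
import pickle
import tempfile

data = {'age': 18, 'height': 180}  # sample data, made up for illustration

json_text = json.dumps(data)      # str
pickle_blob = pickle.dumps(data)  # bytes

with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, 'data.json')  # hypothetical file name
    pkl_path = os.path.join(tmp, 'data.pkl')    # hypothetical file name

    with open(json_path, 'w', encoding='utf-8') as f:  # text mode for str
        f.write(json_text)
    with open(pkl_path, 'wb') as f:                     # binary mode for bytes
        f.write(pickle_blob)

    with open(json_path, 'r', encoding='utf-8') as f:
        json_loaded = json.loads(f.read())
    with open(pkl_path, 'rb') as f:
        pickle_loaded = pickle.loads(f.read())

print(type(json_text).__name__, type(pickle_blob).__name__)  # str bytes
print(json_loaded == data, pickle_loaded == data)            # True True
```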
import shelve

# The original numbers for age and height were lost; the values below are placeholders.
info1 = {'age': 18, 'height': 180, 'weight': 80}
info2 = {'age': 73, 'height': 150, 'weight': 80}

# Write two records
d = shelve.open('db.shv')
d['egon'] = info1
d['alex'] = info2
d.close()

# Read them back
d = shelve.open('db.shv')
print(d['egon'])
print(d['alex'])
d.close()

# To modify a stored value in place, writeback=True must be set
d = shelve.open('db.shv', writeback=True)
d['alex']['age'] = 10000
print(d['alex'])
d.close()

# Reopen to confirm that the change was saved
d = shelve.open('db.shv', writeback=True)
print(d['alex'])
d.close()
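Why writeback=True matters can be seen by trying the same nested update with and without it. This is a minimal sketch: the file name demo.shv and the stored values are assumptions chosen for the demonstration, and the shelf lives in a temporary directory.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.shv')  # hypothetical file name

    with shelve.open(path) as d:
        d['alex'] = {'age': 73}  # placeholder value

    # Without writeback, d['alex'] hands back a fresh copy each time,
    # so changing a nested key modifies the copy and is never stored.
    with shelve.open(path) as d:
        d['alex']['age'] = 10000
    with shelve.open(path) as d:
        age_without_writeback = d['alex']['age']

    # With writeback=True, accessed entries are cached and written back on close.
    with shelve.open(path, writeback=True) as d:
        d['alex']['age'] = 10000
    with shelve.open(path) as d:
        age_with_writeback = d['alex']['age']

print(age_without_writeback)  # 73
print(age_with_writeback)     # 10000
```

An alternative that avoids writeback is to read the value, change it, and assign it back to the key (d['alex'] = updated).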
Second, the xml module
XML is a format for organizing data.
Every element in XML has three properties: tag, attrib, and text.
#==========================================> Search
import xml.etree.ElementTree as ET

tree = ET.parse('a.xml')
root = tree.getroot()

# Three ways to find nodes
res = root.iter('rank')  # searches the whole tree and finds every match
for item in res:
    print('=' * 50)
    print(item.tag)     # tag name
    print(item.attrib)  # attributes
    print(item.text)    # text content

res = root.find('country')  # searches only the direct children of the current element and stops at the first match
print(res.tag)
print(res.attrib)
print(res.text)
nh = res.find('neighbor')
print(nh.attrib)

cy = root.findall('country')  # searches only the direct children of the current element and returns every match
print([item.attrib for item in cy])
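Since the contents of a.xml are not shown, the three lookup methods can be tried on a small in-memory document instead. The XML below is invented for illustration and only assumes the country/rank/year/neighbor layout that the code above implies.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; the real a.xml may look different.
xml_text = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <neighbor name="Austria"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <neighbor name="Malaysia"/>
    </country>
</data>"""

root = ET.fromstring(xml_text)

# iter: walks the whole tree, so nested <rank> elements are found
ranks = [item.text for item in root.iter('rank')]

# find: direct children only, first match only
first_country = root.find('country')
direct_rank = root.find('rank')  # None, because <rank> is not a direct child of <data>

# findall: direct children only, every match
names = [c.attrib['name'] for c in root.findall('country')]

print(ranks)                 # ['1', '4']
print(first_country.attrib)  # {'name': 'Liechtenstein'}
print(direct_rank)           # None
print(names)                 # ['Liechtenstein', 'Singapore']
```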
#==========================================> Change
import xml.etree.ElementTree as ET

tree = ET.parse('a.xml')
root = tree.getroot()

for year in root.iter('year'):
    year.text = str(int(year.text) + 10)
    year.attrib = {'updated': 'yes'}  # the tag itself is usually left unchanged
tree.write('a.xml')
#==========================================> Add
import xml.etree.ElementTree as ET

tree = ET.parse('a.xml')
root = tree.getroot()

for country in root.iter('country'):
    year = country.find('year')
    if int(year.text) > 2020:
        print(country.attrib)
        ele = ET.Element('egon')
        ele.attrib = {'nb': 'yes'}
        ele.text = 'very handsome'
        country.append(ele)   # add a child element
        country.remove(year)  # delete a child element
tree.write('b.xml')
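The change, add, and delete steps can also be checked without touching any file. The sketch below uses a one-country document invented for the purpose and inspects the result with ET.tostring; it uses Element.set, which updates a single attribute rather than replacing the whole attrib dictionary.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for illustration.
root = ET.fromstring('<data><country name="A"><year>2015</year></country></data>')

for country in root.findall('country'):
    year = country.find('year')
    year.text = str(int(year.text) + 10)  # change: 2015 becomes 2025
    year.set('updated', 'yes')            # change one attribute, keep any others

    ele = ET.Element('egon')              # add: build a new child element
    ele.set('nb', 'yes')
    ele.text = 'very handsome'
    country.append(ele)

    if int(year.text) > 2020:
        country.remove(year)              # delete: drop the <year> child

result = ET.tostring(root, encoding='unicode')
print(result)
```

To persist the result, wrap the root in ET.ElementTree(root) and call write with a file name, as the examples above do with tree.write.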
Third, the re module (regular expressions)
Regular expressions are most commonly used in web crawlers. When writing a crawler you can also import other modules to help clean the data, and regular expressions are useful in many other areas as well.
import re

print(re.findall(r'\w', r'ab 12\+-*&_'))            # \w: letters, digits, underscore
print(re.findall(r'\W', r'ab 12\+-*&_'))            # \W: anything \w does not match
print(re.findall(r'\s', 'ab \r1\n2\t\\+-*&_'))      # \s: whitespace characters
print(re.findall(r'\S', 'ab \r1\n2\t\\+-*&_'))      # \S: non-whitespace characters
print(re.findall(r'\d', 'ab \r1\n2\t\\+-*&_'))      # \d: digits
print(re.findall(r'\D', 'ab \r1\n2\t\\+-*&_'))      # \D: non-digits
print(re.findall(r'\w_sb', 'egon alex_sb123123wxx_sb,lxx_sb'))
print(re.findall(r'\Aalex', 'abcalex is salexb'))
print(re.findall(r'\Aalex', 'alex is salexb'))
print(re.findall(r'^alex', 'alex is salexb'))
print(re.findall(r'sb\Z', 'alexsb is sbalexbsb'))
print(re.findall(r'sb$', 'alexsb is sbalexbsb'))
print(re.findall(r'^ebn$', 'ebn1'))  # ^ebn$ matches only the exact string 'ebn' (starts with ebn and ends with ebn)
print(re.findall('a\nc', 'a\nc a\tc a1c'))
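\t is a tab; it is displayed as a different number of spaces on different platforms.
\A behaves like ^ (when re.MULTILINE is not set)  # prefer ^
\Z behaves like $ (except that $ also matches just before a trailing newline)  # prefer $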
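The anchors \A and ^, and likewise \Z and $, only differ in edge cases. A small sketch of those cases, using a made-up two-line string that ends with a newline:

```python
import re

text = 'alex\nalex sb\n'  # sample text, made up for illustration

caret_default = re.findall(r'^alex', text)                 # start of string only
caret_multiline = re.findall(r'^alex', text, re.MULTILINE)  # start of every line
a_multiline = re.findall(r'\Aalex', text, re.MULTILINE)     # \A ignores MULTILINE

dollar = re.findall(r'sb$', text)   # $ also matches just before a trailing newline
z_end = re.findall(r'sb\Z', text)   # \Z matches only at the very end of the string

print(caret_default)    # ['alex']
print(caret_multiline)  # ['alex', 'alex']
print(a_multiline)      # ['alex']
print(dollar)           # ['sb']
print(z_end)            # []
```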
# Repetition matching:
# . ? * + {m,n} .* .*?
1. . (dot): matches any single character except a newline
To make . match a newline as well, pass re.DOTALL as the flags argument.
print(re.findall(r'a.c', 'abc a1c aAc aaaaaca\nc'))
print(re.findall(r'a.c', 'abc a1c aAc aaaaaca\nc', re.DOTALL))
2. ?: the character to its left occurs 0 or 1 times
? cannot be used on its own; it must follow something to repeat.
print(re.findall(r'ab?', 'a ab abb abbb abbbb abbbb'))
3. *: the character to its left occurs 0 or more times
print(re.findall(r'ab*', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))
4. +: the character to its left occurs 1 or more times
print(re.findall(r'ab+', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))
5. {m,n}: the character to its left occurs between m and n times
print(re.findall(r'ab?', 'a ab abb abbb abbbb abbbb'))
print(re.findall(r'ab{0,1}', 'a ab abb abbb abbbb abbbb'))             # same as ab?
print(re.findall(r'ab*', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))
print(re.findall(r'ab{0,}', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))    # same as ab*
print(re.findall(r'ab+', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))
print(re.findall(r'ab{1,}', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))    # same as ab+
print(re.findall(r'ab{1,3}', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))
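The quantifiers above can be compared side by side on one short input. The sample string here is shortened from the examples above to keep the results easy to read.

```python
import re

text = 'a ab abb abbb'  # shortened sample input

zero_or_one = re.findall(r'ab?', text)
zero_or_more = re.findall(r'ab*', text)
one_or_more = re.findall(r'ab+', text)
one_to_two = re.findall(r'ab{1,2}', text)

print(zero_or_one)   # ['a', 'ab', 'ab', 'ab']
print(zero_or_more)  # ['a', 'ab', 'abb', 'abbb']
print(one_or_more)   # ['ab', 'abb', 'abbb']
print(one_to_two)    # ['ab', 'abb', 'abb']

# ?, * and + are shorthand for {0,1}, {0,} and {1,}
print(zero_or_one == re.findall(r'ab{0,1}', text))  # True
print(zero_or_more == re.findall(r'ab{0,}', text))  # True
print(one_or_more == re.findall(r'ab{1,}', text))   # True
```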
6. .*: matches any characters, of any length (greedy match: as much as possible)
print(re.findall(r'a.*c', 'ac a123c aaaac a *123)()c asdfasfdsadf'))
7. .*?: non-greedy match (as little as possible)
print(re.findall(r'a.*?c', 'a123c456c'))
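Running both forms against the same string makes the difference between greedy and non-greedy matching visible:

```python
import re

text = 'a123c456c'

greedy = re.findall(r'a.*c', text)   # runs to the last 'c'
lazy = re.findall(r'a.*?c', text)    # stops at the first 'c'

print(greedy)  # ['a123c456c']
print(lazy)    # ['a123c']
```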
(): grouping. When the pattern contains a group, findall returns only the content of the group.
print(re.findall(r'(alex)_sb', 'alex_sb asdfsafdafdaalex_sb'))
print(re.findall(
    r'href="(.*?)"',
    '<li><a id="blog_nav_sitehome" class="menu" href="http://www.cnblogs.com/">Blog Park</a></li>')
)
[]: matches one character from a specified set (the set is defined inside the brackets)
Most characters written inside [] keep their literal meaning; ranges such as 0-9 and a-zA-Z are allowed.
print(re.findall(r'a[0-9][0-9]c', 'a1c a+c a2c a9c a11c a-c acc aAc'))
Inside [], a-b denotes a range, so to match a literal -, place it at the far left or far right of the set (or escape it as \-).
print(re.findall(r'a[-+*]c', 'a1c a+c a2c a9c a*c a11c a-c acc aAc'))
print(re.findall(r'a[a-zA-Z]c', 'a1c a+c a2c a9c a*c a11c a-c acc aAc'))
A ^ at the start of [] negates the set.
print(re.findall(r'a[^a-zA-Z]c', 'a c a1c a+c a2c a9c a*c a11c a-c acc aAc'))
print(re.findall(r'a[^0-9]c', 'a c a1c a+c a2c a9c a*c a11c a-c acc aAc'))
print(re.findall(r'([a-z]+)_sb', 'egon alex_sb123123wxxxxxxxxxxxxx_sb,lxx_sb'))
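A compact comparison of the character-set forms described above, run against one made-up sample string:

```python
import re

text = 'a1c a+c a-c abc aAc a c'  # sample input, made up for illustration

digits = re.findall(r'a[0-9]c', text)          # one digit between a and c
symbols = re.findall(r'a[-+]c', text)          # '-' placed first, so it is literal
letters = re.findall(r'a[a-zA-Z]c', text)      # one ASCII letter
not_letters = re.findall(r'a[^a-zA-Z]c', text)  # anything except an ASCII letter

print(digits)       # ['a1c']
print(symbols)      # ['a+c', 'a-c']
print(letters)      # ['abc', 'aAc']
print(not_letters)  # ['a1c', 'a+c', 'a-c', 'a c']
```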
|: or
print(re.findall(r'compan(ies|y)', 'Too many companies has gone bankrupt, and the next one is my company'))
(?:): a non-capturing group. findall then returns the whole matched text, not only the content inside ().
print(re.findall(r'compan(?:ies|y)', 'Too many companies has gone bankrupt, and the next one is my company'))
print(re.findall(r'alex|sb', 'alex sb sadfsadfasdfegon alex sb egon'))
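The effect of (?:...) on what findall returns is easiest to see with the two patterns next to each other:

```python
import re

text = 'Too many companies have gone bankrupt, and the next one is my company'

captured = re.findall(r'compan(ies|y)', text)      # only the group content
whole = re.findall(r'compan(?:ies|y)', text)       # the full match

print(captured)  # ['ies', 'y']
print(whole)     # ['companies', 'company']
```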
Other methods of the re module:
print(re.findall(r'alex|sb', '123123 alex sb sadfsadfasdfegon alex sb egon'))
print(re.search(r'alex|sb', '123213 alex sb sadfsadfasdfegon alex sb egon').group())
print(re.search(r'^alex', '123213 alex sb sadfsadfasdfegon alex sb egon'))
print(re.search(r'^alex', 'alex sb sadfsadfasdfegon alex sb egon').group())
re.search returns only the first match, or None if nothing matches. To see the matched text, call .group() on the result; calling .group() on None raises an AttributeError.
print(re.match(r'alex', 'alex sb sadfsadfasdfegon alex sb egon').group())
print(re.match(r'alex', '123213 alex sb sadfsadfasdfegon alex sb egon'))
re.match is equivalent to re.search with ^ at the start of the pattern.
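Because both functions may return None, it is safer to test the result before calling .group(). A minimal sketch of that pattern, reusing one of the sample strings from above:

```python
import re

text = '123213 alex sb sadfsadfasdfegon alex sb egon'

m = re.search(r'alex|sb', text)
first = m.group() if m else None          # guard against None before .group()

anchored = re.match(r'alex', text)        # None: the text does not start with 'alex'
same_as_match = re.search(r'^alex', text)  # None for the same reason

print(first)          # alex
print(anchored)       # None
print(same_as_match)  # None
```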
info = 'a:b:c:d'
print(info.split(':'))
print(re.split(':', info))
info = r'get:a.txt\3333/rwx'
print(re.split(r'[:\\/]', info))  # split on a colon, a backslash, or a forward slash
Compared with str.split, re.split accepts a regular expression as the separator.
print('egon is beutifull egon'.replace('egon', 'EGON', 1))
print(re.sub(r'(.*?)(egon)(.*?)(egon)(.*?)', r'\1\2\3EGON\5', '123 egon is beutifull egon 123'))
print(re.sub(r'(lqz)(.*?)(SB)', r'\3\2\1', r'lqz is SB'))
print(re.sub(r'([a-zA-Z]+)([^a-zA-Z]+)([a-zA-Z]+)([^a-zA-Z]+)([a-zA-Z]+)', r'\5\2\3\4\1', r'lqzzzz123+ is SB'))
Compared with str.replace, re.sub accepts a regular expression as the pattern.
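Two features of re.sub used above, limiting the number of replacements and reordering groups with backreferences, can be sketched in isolation. The count argument plays the same role as the third argument of str.replace.

```python
import re

text = 'egon is beutifull egon'

# Replace only the first occurrence, like str.replace(old, new, 1)
first_only = re.sub(r'egon', 'EGON', text, count=1)

# \1, \2, \3 refer to the captured groups, so they can be reordered
swapped = re.sub(r'(lqz)(.*?)(SB)', r'\3\2\1', 'lqz is SB')

print(first_only)  # EGON is beutifull egon
print(swapped)     # SB is lqz
```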
pattern = re.compile('alex')  # compile once, then reuse the pattern object
print(pattern.findall('alex is alex alex'))
print(pattern.findall('alexasdfsadfsadfasdfasdfasfd is alex alex'))
Python Tour. Chapter 4. Modules and Packages 4.09