The purpose of this code is to fetch tax-arrears information from HTTP://WWW.TAX.SH.GOV.CN/TYCX/TYCXQJSKNSRMDCTRL-GETQJSKNSRMD.PFV.
Capturing the traffic with Fiddler shows how the site submits its query.
The capture shows a POST request whose body carries these fields: type, NSRMC, swdjh, fzrxm, Tsnr, curpage.
We can therefore simulate the same request against the URL above.
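As a rough illustration of how such a POST body can be assembled, here is a minimal sketch using urllib.parse.urlencode. The helper name build_post_body is hypothetical, the letter case of the field names is taken from the list above and should be checked against the real capture, and the readable default for Tsnr is an assumption about what the captured, percent-encoded value stands for.

```python
from urllib.parse import urlencode


def build_post_body(page, tsnr="Tue Sep 30 00:00:00 GMT+08:00 2014"):
    """Return the URL-encoded POST body for one result page, as bytes.

    Field names follow the captured request described above; their exact
    letter case is an assumption. `tsnr` is passed in readable form and
    urlencode percent-encodes it.
    """
    fields = {
        "type": "QY",     # value used by the script in this post
        "NSRMC": "",      # left empty, as in the script
        "swdjh": "",
        "fzrxm": "",
        "Tsnr": tsnr,
        "curpage": page,  # page number of the result list
    }
    # urlencode returns str; urllib.request expects bytes for POST data.
    return urlencode(fields).encode("utf-8")


if __name__ == "__main__":
    print(build_post_body(1))
```

Passing the readable timestamp lets urlencode do the escaping once. Passing a string that is already percent-encoded would escape it a second time, which may or may not be what the server expects.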
The following code sends the request through a proxy.
It creates a ProxyHandler object, builds an opener from it, and uses that opener to open the page:
# coding: utf-8
import re
import urllib.parse
import urllib.request

fname = "C:/users/songxiaodi/desktop/tax_file.txt"

# Route HTTP requests through the proxy.
proxy_handler = urllib.request.ProxyHandler({'http': '11.1.0.10:80'})
opener = urllib.request.build_opener(proxy_handler)

# Keep the table cells whose text contains "Company".
reg = re.compile(r"<td>(.*Company.*)</td>")

with open(fname, 'w') as file:
    for line in range(1, 170):  # result pages 1 to 169
        postdata = urllib.parse.urlencode({
            'type': 'QY', 'NSRMC': '', 'Swdjh': '', 'FZRXM': '',
            'Tsnr': 'tue+sep+30+00%3a00%3a00+gmt%2b08%3a00+2014',
            'Curpage': line})
        postdata = postdata.encode('utf-8')  # POST data must be bytes
        page = opener.open('HTTP://WWW.TAX.SH.GOV.CN/TYCX/TYCXQJSKNSRMDCTRL-GETQJSKNSRMD.PFV', postdata)
        html = str(page.read(), 'utf-8')
        content = reg.findall(html)
        print(content)
        for cc in range(0, len(content)):
            # print("--:" + str(cc) + content[cc])
            file.write(str(content[cc]))
            file.write("\n")
print('OK')
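The extraction step can be tried offline before hitting the site. The sketch below is only an illustration: extract_cells is a hypothetical helper, SAMPLE_HTML is made-up markup rather than real output from the tax site, and the pattern uses [^<]* instead of .* as a swapped-in variant, since a greedy .* can span several cells when they share one line.

```python
import re


def extract_cells(html, keyword="Company"):
    """Return the text of every <td> cell that contains `keyword`.

    [^<]* stops at the next tag, so each match stays inside one cell.
    """
    pattern = re.compile(r"<td>([^<]*" + re.escape(keyword) + r"[^<]*)</td>")
    return pattern.findall(html)


# Made-up sample markup, used only to exercise the pattern.
SAMPLE_HTML = (
    "<tr><td>1</td><td>Alpha Company Ltd</td><td>310000000000000</td></tr>\n"
    "<tr><td>2</td><td>Beta Company</td></tr>"
)

if __name__ == "__main__":
    for name in extract_cells(SAMPLE_HTML):
        print(name)
```

If the real page puts each cell on its own line, the original pattern and this one should return the same cells; the difference only shows when a row is emitted on a single line.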
If you do not need a proxy, you can call urllib.request.urlopen directly:
# coding: utf-8
import re
import urllib.parse
import urllib.request

fname = "C:/users/songxiaodi/desktop/tax_file.txt"

# Keep the table cells whose text contains "Company".
reg = re.compile(r"<td>(.*Company.*)</td>")

with open(fname, 'w') as file:
    for line in range(1, 170):  # result pages 1 to 169
        # ===================================================================
        # proxy_handler = urllib.request.ProxyHandler({'http': '11.1.0.10:80'})
        # opener = urllib.request.build_opener(proxy_handler)
        # ===================================================================
        postdata = urllib.parse.urlencode({
            'type': 'QY', 'NSRMC': '', 'Swdjh': '', 'FZRXM': '',
            'Tsnr': 'tue+sep+30+00%3a00%3a00+gmt%2b08%3a00+2014',
            'Curpage': line})
        postdata = postdata.encode('utf-8')  # POST data must be bytes
        page = urllib.request.urlopen('HTTP://WWW.TAX.SH.GOV.CN/TYCX/TYCXQJSKNSRMDCTRL-GETQJSKNSRMD.PFV', postdata)
        html = str(page.read(), 'utf-8')
        content = reg.findall(html)
        print(content)
        for cc in range(0, len(content)):
            # print("--:" + str(cc) + content[cc])
            file.write(str(content[cc]))
            file.write("\n")
print('OK')
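Since the two variants differ only in how the page is opened, one possible way to avoid keeping two copies of the script is a small helper that returns an opener with or without the proxy. This is a sketch, not the author's method: make_opener is a hypothetical name, and the proxy address is simply the one used in this post.

```python
import urllib.request


def make_opener(proxy=None):
    """Return an opener that sends HTTP traffic through `proxy` if given.

    Without a proxy argument the default handlers are used, which is also
    what urllib.request.urlopen relies on.
    """
    if proxy:
        handler = urllib.request.ProxyHandler({"http": proxy})
        return urllib.request.build_opener(handler)
    return urllib.request.build_opener()


if __name__ == "__main__":
    opener = make_opener("11.1.0.10:80")  # or make_opener() for no explicit proxy
    print(type(opener).__name__)
```

The returned object is used the same way in both cases: opener.open(url, postdata).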