Reference
Liao Xuefeng's Python tutorial: http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/001386832653051fd44e44e4f9e4ed08f3e5a5ab550358d000
Code:
#!/usr/bin/python

# Import modules
import socket
import io

# Create the TCP socket object
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Connect to Sina
s.connect(('www.sina.com.cn', 80))
# Send the request (a bytes literal, so the script also runs on Python 3)
s.send(b'GET / HTTP/1.1\r\nHost: www.sina.com.cn\r\nConnection: close\r\n\r\n')
# Receive data
buffer = []
while True:
    # Receive at most 1 KB each time
    d = s.recv(1024)
    if d:
        buffer.append(d)
    else:
        break
data = b''.join(buffer)
# Close the socket
s.close()
# Split the response into the header block and the HTML body
header, html = data.split(b'\r\n\r\n', 1)
print(header.decode('iso-8859-1'))
# Write the received HTML to a file
with open('sina.html', 'wb') as f:
    f.write(html)
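The script simulates a browser: it opens a TCP connection to the web server on port 80, sends an HTTP GET request, reads the response until the server closes the connection, prints the response headers, and saves the HTML body to sina.html.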
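The response that comes back is a header block and a body separated by a blank line (\r\n\r\n). As a rough sketch of how that raw data could be broken down further, the helper below splits a response into its status line, a dictionary of headers, and the body. The function name split_http_response and the canned sample response are assumptions made for illustration; they are not part of the original script, and the sample stands in for real network data so the snippet runs offline (Python 3).

```python
def split_http_response(raw):
    """Split a raw HTTP response (bytes) into status line, headers and body.

    Hypothetical helper for illustration only; it handles simple,
    well-formed responses and does not decode chunked bodies.
    """
    # The first blank line separates the header block from the body.
    head, _, body = raw.partition(b'\r\n\r\n')
    lines = head.decode('iso-8859-1').split('\r\n')
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header line has the form "Name: value".
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# A canned response (made-up content) in place of data read from a socket.
sample = (b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: text/html\r\n'
          b'Connection: close\r\n'
          b'\r\n'
          b'<html>hello</html>')

status_line, headers, body = split_http_response(sample)
print(status_line)
print(headers['content-type'])
print(body)
```

In the crawler, the bytes joined from the receive buffer could be passed to such a helper in place of the single split call, which would make it possible to check the status code or content type before writing the file.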
A small Python example: crawling the Sina homepage