Python Learning Notes (IV) [Repost]

Source: Internet
Author: User

1. Comparing three ways to extract hyperlinks (URLs) from a page in Python (HTMLParser, pyquery, regular expressions)

2. Python provides raw strings. As the name implies, a raw string keeps its characters literal: a backslash and the character after it are not treated as an escape sequence. Declare a raw string by prefixing the string with 'r' or 'R'.
3. Can findall be used directly with a regular expression, without worrying about escaping?
4. re.X and re.I (the verbose and ignore-case flags).
5. (?i) and (?:...) -> the inline (?i) flag controls case matching: it makes the pattern case-insensitive.
6. In Python 2, the most commonly used functions for reading keyboard input are raw_input() and input(). Prefer raw_input(), which returns the input as a string.
7. print printing output can be ' preceded by
8. After urlopen(), call read(); the value returned by read() is of type str. Add a timeout when opening the URL.
9. In an except clause, it is best to catch the general Exception type so that unexpected errors do not slip through.
10. Python error "'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)": try decode(); if the data then cannot be written, try encode() to turn it into a byte stream.
11. Comparison of three methods for extracting hyperlinks (URLs) from a page in Python (HTMLParser, pyquery, regular expressions) ==> http://www.myexception.cn/html-css/639814.html
12. To test whether a value is null, use "if xx is None" or "if not xx"; the latter applies more broadly and works better.
13. Read URLs line by line and remove the trailing \n: for line in file.readlines(): line = line.strip('\n')
14. Detailed descriptions of urlsplit, urlparse and urlunparse: http://www.cnblogs.com/huangcong/archive/2011/08/31/2160633.html and http://hi.baidu.com/springemp/item/64613c7457731517d0dcb3a7
15. Getting the page status code requires the requests module: http://www.oschina.net/code/snippet_862981_23032
16. The error "local variable 'xx' referenced before assignment" is fixed by declaring the variable with global.
17. For pages where the URL stays the same but the content redirects (a kind of anti-scanning measure), open the URL directly with urllib and catch the error. Example: http://segmentfault.com/q/1010000000095769 (nginx configuration)
18. With urllib2, geturl() on the object returned by urlopen() gives the final page after the redirect (302?).
19. How to get the page status code:
    import urllib
    f = urllib.urlopen("xxxxxx")
    print f.getcode()
    ==========================
    import requests
    def getstatuscode(url):
        r = requests.get(url, allow_redirects=False)
        return r.status_code
    # requests does not seem to ship with Python 2.7 or 2.6; it has to be installed separately
    ===========================
    import httplib
    # params and headers must be defined before this point
    conn = httplib.HTTPConnection("192.168.1.212")
    # GET can also be used to submit data
    conn.request(method="POST", url="/newsadd.asp?action=newnew", body=params, headers=headers)
    # getresponse() returns the processed data
    response = conn.getresponse()
    # determine whether the submission succeeded
    if response.status == 302:
        pass  # handling is not shown in the original note
20. httplib usage: request() sends the request and getresponse() returns the response data.
21. get_header: when probing whether a remote file exists, check carefully whether the returned value is empty.
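Notes 2 to 5 can be tried together in one short script. The following is a minimal sketch in Python 3, not the method from the linked comparison article; the sample HTML, the names LINK_RE and INLINE_RE, and the helper find_links are made up for illustration.

```python
import re

# Hypothetical sample page, used only for illustration.
SAMPLE_HTML = """
<a href="http://example.com/a">A</a>
<A HREF='http://example.com/b'>B</A>
<p>no link here</p>
"""

# A raw string (r prefix) passes backslashes to the regex engine unchanged.
# re.X allows whitespace and comments inside the pattern; re.I ignores case.
LINK_RE = re.compile(r"""
    <a\s+[^>]*?          # opening tag and any attributes before href
    href\s*=\s*          # the href attribute name and the equals sign
    ["']([^"']+)["']     # the quoted URL, captured as group 1
""", re.X | re.I)

# The same idea with the inline (?i) flag instead of re.I.
INLINE_RE = re.compile(r'(?i)href\s*=\s*["\']([^"\']+)["\']')


def find_links(html):
    """Return every href value matched by LINK_RE, in document order."""
    # With exactly one capturing group, findall returns a list of strings.
    return LINK_RE.findall(html)


if __name__ == "__main__":
    print(find_links(SAMPLE_HTML))
    print(len(r"\n"), len("\n"))  # raw string keeps the backslash: 2 vs 1
```

Because the pattern is a raw string, sequences such as \s reach the re module as written, which is why findall can be called on it without any extra escaping.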
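Notes 10, 12 and 13 are easier to see side by side. This is a small sketch in Python 3, where the bytes/str split makes the decode and encode steps explicit; the helper names to_text, to_bytes, clean_lines and describe are hypothetical and exist only for this example.

```python
def to_text(data, encoding="utf-8"):
    """Decode bytes into text; text is passed through unchanged."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


def to_bytes(text, encoding="utf-8"):
    """Encode text into a byte stream before writing it to a binary sink."""
    if isinstance(text, str):
        return text.encode(encoding)
    return text


def clean_lines(lines):
    """Strip the trailing newline from each line and drop blank lines."""
    cleaned = []
    for line in lines:
        line = line.strip("\n")
        if not line:  # true for "" (and for None, 0, empty containers)
            continue
        cleaned.append(line)
    return cleaned


def describe(value):
    """Show the difference between 'is None' and 'not value'."""
    if value is None:  # matches only None
        return "missing"
    if not value:      # matches every other falsy value
        return "empty"
    return "present"


if __name__ == "__main__":
    raw = b"\xe5\xa5\xbd"  # UTF-8 bytes; 0xe5 is outside the ASCII range
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as err:
        print("ascii failed:", err)
    print(to_text(raw))
    print(clean_lines(["http://a.example/\n", "\n", "http://b.example/"]))
```

The "if not xx" test is broader because it also treats empty strings and empty lists as null, which is usually what a line-cleaning loop wants.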
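For note 14, a short sketch of how urlsplit, urlparse and urlunparse relate. It assumes Python 3, where these functions live in urllib.parse (in Python 2 they are in the urlparse module); the sample URL and the helper replace_query are invented for this example.

```python
from urllib.parse import urlsplit, urlparse, urlunparse

# Hypothetical URL, used only for illustration.
URL = "http://www.example.com/news/list.asp;v=1?action=new&page=2#top"

# urlsplit returns 5 parts: scheme, netloc, path, query, fragment.
split_result = urlsplit(URL)

# urlparse returns 6 parts: it also separates ";params" from the path.
parse_result = urlparse(URL)


def replace_query(url, new_query):
    """Return url with its query string swapped for new_query."""
    parts = urlparse(url)
    # The result is a named tuple, so _replace builds a modified copy.
    return urlunparse(parts._replace(query=new_query))


if __name__ == "__main__":
    print(split_result.path)    # path still contains ";v=1"
    print(parse_result.path, parse_result.params)
    print(replace_query(URL, "page=3"))
```

urlunparse is the inverse of urlparse, so parsing a URL and reassembling the unchanged parts gives back the original string in this example.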
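Notes 15 and 18 to 20 use Python 2 modules. As a rough Python 3 counterpart, and only as a sketch under that assumption, httplib becomes http.client and urllib2 becomes urllib.request. The function names get_status_code and open_following_redirects are hypothetical, and the timeout of 5 seconds is an arbitrary choice.

```python
import http.client
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def get_status_code(url, timeout=5):
    """Return the status code of a GET request without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        # getresponse() returns the response; a 302 is reported as 302.
        return conn.getresponse().status
    finally:
        conn.close()


def open_following_redirects(url, timeout=5):
    """Open url, following redirects; return (final_url, status_code)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.geturl(), response.getcode()
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the code is still available.
        return err.geturl(), err.code
```

The two functions answer different questions: the first reports the raw status of the URL as requested (for example 302), while the second reports where the redirects end and the status of that final page.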
