Python Challenge 2: Crawlers and regular expressions

Last Update: 2016-03-09 Source: Internet

Author: User


Topic:

Problem-solving approach:
The challenge makes it clear that the characters may be hidden in the source code of the web page. Right-click the page, view its source, and you will find a section that begins: "find rare characters in the mess below". Some people copy the long block underneath by hand and process it from there, which I find rather crude. My approach is to fetch the page with urllib2, extract the text, and process it with regular expressions.

Implementation method:

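import urllib2
import re

# Fetch the raw HTML of the challenge page (Python 2)
req = urllib2.urlopen('http://www.pythonchallenge.com/pc/def/ocr.html')
res = req.read()
# Capture everything between the first and the last '-->';
# re.S lets '.' match newlines, so the group can span many lines
mess = ''.join(re.findall('-->(.*)-->', res, re.S))
# Keep only letters and digits
chars = ''.join(re.findall(r'[a-z]|[A-Z]|[0-9]', mess))
print chars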

Method Explanation:

urllib2: a simple urllib2.urlopen(url).read() call retrieves the content of the web page.
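urllib2 exists only in Python 2. For readers on Python 3, the same fetch can be sketched with urllib.request. The helper name fetch_page and the utf-8 default are assumptions made for illustration, not part of the original solution.

```python
from urllib.request import urlopen


def fetch_page(url, encoding="utf-8"):
    """Return the body of `url` as text.

    Hypothetical helper for illustration; the encoding default is an assumption.
    """
    # urlopen returns a response object that can be used as a context manager
    with urlopen(url) as response:
        raw = response.read()  # bytes in Python 3
    # Decode to str so that str-based regular expressions can be applied
    return raw.decode(encoding, errors="replace")
```

Calling fetch_page('http://www.pythonchallenge.com/pc/def/ocr.html') would then play the role of urllib2.urlopen(url).read() above.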
To isolate the text to be processed, the fetched page content is filtered with a regular expression. Line breaks are handled in a very simple way: pass the re.S flag to re.findall, which makes '.' match any character, including newlines. Without re.S, '.' matches any character except a newline.
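The effect of re.S can be seen on a small example. The sample string below is a made-up stand-in for the page source (two HTML comments, the second one spanning several lines), and the pattern is an assumption chosen for illustration.

```python
import re

# Hypothetical miniature of the page source: a hint comment, then a
# multi-line comment holding the "mess".
sample = "<!--\nhint\n-->\n\n<!--\n%%$e@#\n^q&*u\n-->\n"

# Without re.S, '.' stops at every newline. No single line contains
# two '-->' markers, so nothing matches.
without_flag = re.findall(r"-->(.*)-->", sample)

# With re.S, '.' also matches '\n', so the group runs from the first
# '-->' to the last one and captures the multi-line block.
with_flag = re.findall(r"-->(.*)-->", sample, re.S)

print(without_flag)  # []
print(with_flag)     # one element containing the multi-line block
```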
re.findall returns a list of the matched strings. For the next step, the elements of the list are concatenated into a single string with the join method. ''.join puts no separator between the elements, whereas '.'.join separates them with '.'; the string before .join can be any separator.
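A short illustration of how the list returned by re.findall is turned back into a string; the input string is invented for the example.

```python
import re

# findall returns one list element per match
matches = re.findall(r"[a-z]", "%%e$q^u")

# An empty separator glues the elements together directly
joined_plain = "".join(matches)

# Any other string can serve as the separator
joined_dots = ".".join(matches)

print(matches)       # ['e', 'q', 'u']
print(joined_plain)  # equ
print(joined_dots)   # e.q.u
```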
Finally, match the uppercase letters, lowercase letters, and digits in the string. At first I matched only [a-z], that is, all lowercase letters. The result is the same, but the challenge does not say whether the characters are lowercase letters, uppercase letters, or digits, so adding [A-Z] and [0-9] is more rigorous.
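The difference between the narrow and the broader pattern only shows up when the input contains uppercase letters or digits. The sample string here is invented to make that visible; the single character class in the last line is an equivalent, more compact spelling of the alternation and is offered as a suggestion rather than as the author's code.

```python
import re

# Hypothetical input containing lowercase, uppercase, and a digit
sample = "%%$e@Q_7u^!"

# Lowercase only: misses 'Q' and '7'
lower_only = "".join(re.findall(r"[a-z]", sample))

# The alternation of three classes
alternation = "".join(re.findall(r"[a-z]|[A-Z]|[0-9]", sample))

# One combined class matches the same characters
single_class = "".join(re.findall(r"[a-zA-Z0-9]", sample))

print(lower_only)    # eu
print(alternation)   # eQ7u
print(single_class)  # eQ7u
```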

Output:
equality

Replace ocr in the URL with equality to enter the next level.
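The URL substitution can also be done in code. This is a minimal sketch; the variable names are illustrative.

```python
current_url = "http://www.pythonchallenge.com/pc/def/ocr.html"
answer = "equality"

# Swap only the page name so nothing else in the URL is touched
next_url = current_url.replace("ocr.html", answer + ".html")

print(next_url)  # http://www.pythonchallenge.com/pc/def/equality.html
```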

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
