Python Challenge 2: Web crawlers and regular expressions

Source: Internet
Author: User

Problem:

Approach:
The challenge hints that the characters may be in the source code of the web page. Viewing the page source reveals a comment that reads: "find rare characters in the mess below". Some people simply copy the long block that follows and process it by hand, which I find a bit crude. My approach is to fetch the page with urllib2, then extract the text and process it with regular expressions.

Implementation method:

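    import urllib2
    import re

    # Fetch the page source (Python 2).
    req = urllib2.urlopen('http://www.pythonchallenge.com/pc/def/ocr.html')
    res = req.read()

    # Collect the HTML comments; the first one is the hint, so skip it and keep the mess.
    mess = ''.join(re.findall('<!--(.*?)-->', res, re.S)[1:])

    # Keep only letters and digits.
    chars = ''.join(re.findall(r'[a-z]|[A-Z]|[0-9]', mess))
    print chars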
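
The script above targets Python 2, where urllib2 exists. As a rough sketch only, the same idea might look like this in Python 3, where urllib2 became urllib.request. The function names extract_mess, rare_chars and solve are my own, not part of the original solution, and the sample string is a made-up stand-in for the real page source so that the logic can be checked without network access.

```python
import re
from urllib.request import urlopen

URL = "http://www.pythonchallenge.com/pc/def/ocr.html"


def extract_mess(html):
    """Return the text of every HTML comment after the first one (the hint)."""
    comments = re.findall(r"<!--(.*?)-->", html, re.S)
    return "".join(comments[1:])


def rare_chars(mess):
    """Keep only ASCII letters and digits, in their original order."""
    return "".join(re.findall(r"[a-zA-Z0-9]", mess))


def solve(url=URL):
    """Fetch the page and return the rare characters (needs network access)."""
    with urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return rare_chars(extract_mess(html))


# Offline check on a tiny, invented stand-in for the real page source.
sample = "<html></html>\n<!--\nfind rare characters\n-->\n<!--\n%$@h_^i)#\n-->"
print(rare_chars(extract_mess(sample)))  # hi
```

Calling solve() performs the real download; it is left uncalled here so the sketch runs offline.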

Method Explanation:

    1. urllib2: a simple urllib2.urlopen(url).read() is enough to fetch the content of the web page.
    2. To get the text to be processed, the fetched page content is filtered with a regular expression. Line breaks are handled very simply by passing the re.S flag to findall, which makes '.' match any character, including newlines. Without re.S, '.' matches any character except a newline.
    3. findall returns a list of the matched strings. For the next step, the elements of the list are concatenated into one string with the join method. ''.join means there is no separator between the elements, while '.'.join joins them with '.' as the separator; the string before .join can be any separator.
    4. Finally, match the uppercase letters, lowercase letters, and digits in the string. At first I matched only [a-z], that is, all lowercase letters. The result is the same, but the challenge does not say whether the characters are lowercase letters, uppercase letters, or digits, so adding [A-Z] and [0-9] is more rigorous.
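
To make points 2 to 4 concrete, here is a small sketch in Python 3 that runs on short made-up strings rather than the real page. The variable names and sample data are illustrative assumptions only.

```python
import re

text = "<!--\nab\n-->"

# Without re.S, '.' stops at newlines, so the multi-line comment is not matched.
without_s = re.findall(r"<!--(.*)-->", text)
print(without_s)  # []

# With re.S, '.' also matches '\n', so the whole comment body is captured.
with_s = re.findall(r"<!--(.*)-->", text, re.S)
print(with_s)  # ['\nab\n']

# join concatenates list elements using the string it is called on as separator.
parts = ["a", "b", "c"]
no_sep = "".join(parts)
dot_sep = ".".join(parts)
print(no_sep)   # abc
print(dot_sep)  # a.b.c

# Matching only lowercase letters versus letters and digits.
mess = "%$e@1_Q^"
lower_only = re.findall(r"[a-z]", mess)
letters_digits = re.findall(r"[a-z]|[A-Z]|[0-9]", mess)
print(lower_only)      # ['e']
print(letters_digits)  # ['e', '1', 'Q']
```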

Output:
equality

Replace ocr in the URL with equality to enter the next level.
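
If you prefer to build the next address in code, a minimal sketch could look like this. It assumes the next level keeps the same directory and the .html suffix; the variable names are my own.

```python
current_url = "http://www.pythonchallenge.com/pc/def/ocr.html"
answer = "equality"

# Swap only the final path component so the rest of the URL is untouched.
base, _, page = current_url.rpartition("/")
next_url = base + "/" + answer + ".html"
print(next_url)  # http://www.pythonchallenge.com/pc/def/equality.html
```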
