Parsley: a language for describing how to extract structured data from a web page


Zheng @ playpoly Sr 20091127

Parsley is used in Scrapy.
Parsley is an interesting little tool. It uses CSS selectors, XPath, regular expressions, and JSON to describe how to extract structured data from web pages. Most crawlers/spiders presumably end up defining a similar set of templates; what parsley adds is ready-made implementations of the template language for several programming languages.

Basic Facts

A parselet is a snippet written in the parsley language.

You can think of a parselet as a set of rules describing how to extract data accurately: where the title is, how to get the title link, and how to extract the number of comments.
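To make the idea of a template concrete, here is a minimal sketch in Python. It is not parsley's own syntax and does not use the pyparsley API; it only imitates the approach with the standard library, assuming a hypothetical, well-formed page fragment and a made-up JSON template format (`item`, `fields`, `path`, `attr`, `regex` are all names invented for this illustration). Each field is located with a limited XPath expression and optionally refined with a regular expression.

```python
import json
import re
import xml.etree.ElementTree as ET

# Hypothetical page fragment; it is well-formed so ElementTree can parse it.
PAGE = """
<html><body>
  <div class="story">
    <a class="title" href="http://example.com/a">First story</a>
    <span class="comments">12 comments</span>
  </div>
  <div class="story">
    <a class="title" href="http://example.com/b">Second story</a>
    <span class="comments">3 comments</span>
  </div>
</body></html>
"""

# Made-up template format (NOT parsley syntax): "item" selects each record,
# "fields" says where each value lives and how to clean it up.
TEMPLATE = json.loads("""
{
  "item": ".//div[@class='story']",
  "fields": {
    "title":    {"path": "a[@class='title']"},
    "link":     {"path": "a[@class='title']", "attr": "href"},
    "comments": {"path": "span[@class='comments']", "regex": "([0-9]+)"}
  }
}
""")


def extract(page, template):
    """Apply the template to the page and return a list of records."""
    root = ET.fromstring(page)
    records = []
    for node in root.findall(template["item"]):
        record = {}
        for name, rule in template["fields"].items():
            target = node.find(rule["path"])
            if target is None:
                record[name] = None
                continue
            # Take an attribute if one is requested, otherwise the text.
            if "attr" in rule:
                value = target.get(rule["attr"])
            else:
                value = (target.text or "").strip()
            # Optionally narrow the value with a regular expression.
            if value is not None and "regex" in rule:
                match = re.search(rule["regex"], value)
                value = match.group(1) if match else None
            record[name] = value
        records.append(record)
    return records


records = extract(PAGE, TEMPLATE)
print(json.dumps(records, indent=2))
```

The point of the design is that the template is plain data: the same JSON could be handed to an extractor written in any language, which is the role the parsley implementations play for real parselets.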

Parsley has implementations for several languages, including Ruby, Python, and C/C++.
Pyparsley is the corresponding Python library.

Sample Code and Result

See: http://parselets.com/parselets/yc/15

The code on the left of that page is what we usually call a template, and the result on the right is the extracted structured data.

So how does it become a reality?

 

Implementation

Install parsley and then uninstall.

 

Zhengyun 20091127 Beijing
