RSS parsing with Feedparser 2018-10-02

Source: Internet
Author: User

Ref: 77323673

RSS background
    1. Introduction to RSS: https://wikipedia.org/wiki/RSS
    2. The XML format used by RSS: http://www.w3school.com.cn/rss/rss_syntax.asp
Feedparser
    1. Feedparser installation (for example: pip install feedparser)
    2. A simplified rss.xml (an Atom feed, despite the file name)
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title type="text">Blog Park _ Dictation</title>
  <subtitle type="text"></subtitle>
  <id>uuid:70a1ed00-25f2-44e5-b74e-7e9f1e384f1c;id=5134</id>
  <updated>2018-09-29T09:06:43Z</updated>
  <author>
    <name>Dictation</name>
    <uri>http://www.cnblogs.com/qiulinzhang/</uri>
  </author>
  <generator>feed.cnblogs.com</generator>
  <entry>
    <id>http://www.cnblogs.com/qiulinzhang/p/9724748.html</id>
    <title type="text">Pearson correlation coefficient 2018-09-29 - Dictation</title>
    <summary type="text">Pearson correlation coefficient: the Pearson correlation coefficient is a statistic used to reflect the linear correlation of two variables. The simple correlation coefficient of a sample is usually denoted r, where n is the sample size, and the observed and mean values of two variables are respectively. r describes the degree of linear correlation between two variables. The value of r is between -1 and +1; if r &gt; 0, it indicates two</summary>
    <published>2018-09-29T09:07:00Z</published>
    <updated>2018-09-29T09:07:00Z</updated>
    <author>
      <name>Dictation</name>
      <uri>http://www.cnblogs.com/qiulinzhang/</uri>
    </author>
    <link rel="alternate" href="http://www.cnblogs.com/qiulinzhang/p/9724748.html"/>
    <link rel="alternate" type="text/html" href="http://www.cnblogs.com/qiulinzhang/p/9724748.html"/>
    <content type="html">[Summary] Pearson correlation coefficient: the Pearson correlation coefficient is a statistic used to reflect the linear correlation of two variables. The simple correlation coefficient of a sample is usually denoted r, where n is the sample size, and the observed and mean values of two variables are respectively. r describes the degree of linear correlation between two variables. The value of r is between -1 and +1; if r &gt; 0, it indicates two &lt;a href="http://www.cnblogs.com/qiulinzhang/p/9724748.html" target="_blank"&gt;Read the full text&lt;/a&gt;</content>
  </entry>
  <entry>...</entry>
  <entry>...</entry>
  <entry>...</entry>
  <entry>...</entry>
  <entry>...</entry>
  <entry>...</entry>
  <entry>...</entry>
  <entry>...</entry>
  <entry>
    <id>http://www.cnblogs.com/qiulinzhang/p/9570867.html</id>
    <title type="text">sizeof() usage - Dictation</title>
    <summary type="text">1. Definition: sizeof is an operator, not a function; it returns the number of bytes of memory that an object or type occupies \ 2. Syntax: sizeof object; // sizeof object sizeof(object); sizeof(type_name); // for example sizeof(int) object</summary>
    <published>2018-09-01T08:53:00Z</published>
    <updated>2018-09-01T08:53:00Z</updated>
    <author>
      <name>Dictation</name>
      <uri>http://www.cnblogs.com/qiulinzhang/</uri>
    </author>
    <link rel="alternate" href="http://www.cnblogs.com/qiulinzhang/p/9570867.html"/>
    <link rel="alternate" type="text/html" href="http://www.cnblogs.com/qiulinzhang/p/9570867.html"/>
    <content type="html">[Summary] 1. Definition: sizeof is an operator, not a function; it returns the number of bytes of memory that an object or type occupies \ 2. Syntax: sizeof object; // sizeof object sizeof(object); sizeof(type_name); // for example sizeof(int) object &lt;a href="http://www.cnblogs.com/qiulinzhang/p/9570867.html" target="_blank"&gt;Read the full text&lt;/a&gt;</content>
  </entry>
</feed>

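Then use feedparser to parse it. The feed's text is Chinese in the original file (the listing above shows it in English translation), so the output below is Chinese: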

>>> import feedparser
>>> feed = feedparser.parse('rss.xml')
>>> print(feed['feed']['title'])
博客园_默写年华
>>> print(feed.feed.title)  # attribute-style access
博客园_默写年华
>>> print(feed.entries[0].id)  # id of the first entry above
http://www.cnblogs.com/qiulinzhang/p/9724748.html
>>> print(feed['entries'][-1]['summary'])  # summary of the last entry
1. 定义 sizeof 是一个操作符 operator ,不是一个函数, 其作用是返回一个对象或类型所占的内存字节数 \ 2. 语法 sizeof object; //sizeof 对象 sizeof(object); sizeof(type_name); // 例如 sizeof(int) 对象
>>> len(feed['entries'])
10
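
To see where those values sit in the XML, here is a minimal sketch that reads the same fields with the standard library's xml.etree.ElementTree instead of feedparser. It is only a cross-check, not how feedparser works internally. SAMPLE is a trimmed, illustrative copy of the feed (two entries, placeholder summaries), and summarize_atom is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this namespace, so every lookup must be qualified.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Trimmed, illustrative copy of the feed (an assumption, not the real file).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title type="text">Blog Park _ Dictation</title>
  <entry>
    <id>http://www.cnblogs.com/qiulinzhang/p/9724748.html</id>
    <summary type="text">first summary</summary>
  </entry>
  <entry>
    <id>http://www.cnblogs.com/qiulinzhang/p/9570867.html</id>
    <summary type="text">last summary</summary>
  </entry>
</feed>"""


def summarize_atom(xml_text):
    """Return the feed title plus the id and summary of every entry."""
    root = ET.fromstring(xml_text)
    entries = [
        {
            "id": entry.findtext("atom:id", default="", namespaces=ATOM),
            "summary": entry.findtext("atom:summary", default="", namespaces=ATOM),
        }
        for entry in root.findall("atom:entry", ATOM)
    ]
    return {
        "title": root.findtext("atom:title", default="", namespaces=ATOM),
        "entries": entries,
    }


feed = summarize_atom(SAMPLE)
print(feed["title"])                   # feed-level title
print(feed["entries"][0]["id"])        # id of the first entry
print(feed["entries"][-1]["summary"])  # summary of the last entry
print(len(feed["entries"]))            # number of entries
```

Compared with this sketch, feedparser hides the namespace handling and accepts RSS as well as Atom, which is the main reason to prefer it for real feeds.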

Note: garbled Chinese output
In Python 2, Unicode strings inside a container such as a tuple or dict are not shown as readable Chinese. The container is printed in its repr form, so each string appears as escape codes with a u prefix. Printing a single string on its own, as in print feed['feed']['title'], prints the text itself rather than a container repr, so the Chinese displays correctly.
Python 2 strings are byte strings with ASCII as the default encoding, while Python 3 strings are Unicode by default, so:
In Python 2, print feed['feed'] shows every string as Unicode escapes.
In Python 3, print(feed['feed']) displays the Chinese text correctly.
Reference: http://blog.51cto.com/daimalaobing/2046659
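
The difference described in the note can be reproduced in Python 3 without any feed at all. The sketch below is an approximation under stated assumptions: record is a stand-in dict, not a real feedparser result, and ascii() is used to imitate the escaped form that Python 2 printed for containers (Python 2 also added a u prefix, which ascii() does not).

```python
title = "博客园"            # sample text taken from the feed title
record = {"title": title}  # stand-in for a parsed feed dictionary

# Printing the string itself shows readable Chinese.
print(title)

# Python 3's repr of a container keeps non-ASCII characters readable.
readable = repr(record)
print(readable)

# ascii() escapes every non-ASCII character, which approximates what
# Python 2 displayed when a container of unicode strings was printed.
escaped = ascii(record)
print(escaped)
```

If escaped output like the last line shows up in practice, printing the individual field instead of the whole container is usually the simplest fix.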
