Python is one of the most popular languages in development today, and RSS files are used by many portal websites and weblogs. rssparser.py is one of the few Python tools that can work with RSS, and it provides a very liberal parser that can handle all the messy differences in the RSS world. The following is an excerpt from the rssparser.py page:

As you can see, most RSS feeds are terrible. Invalid characters, unescaped ampersands (Blogger feeds), invalid entities (Radio feeds), and unescaped or invalid HTML (most notably The Register's feed). Or just a general mix of RSS 0.9x elements and RSS 1.0 elements (Movable Type feeds). Then there are feeds that are too cutting-edge, like Aaron's feed. He puts an excerpt into the description element and the complete text into the content:encoded element (as CDATA). This is valid RSS 1.0, but no one actually uses it (except Aaron), almost no news aggregator supports it, and many parsers reject it. Other parsers are confused by the new element (guid) in RSS 0.94 (see Dave Winer's feed for an example). There is also Jon Udell's feed, with the fullitem element that he made up himself. It is ridiculous to think that XML and Web services are supposed to increase interoperability. In any case, rssparser.py is designed to handle all these absurd situations.
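To make the "unescaped ampersand" problem concrete, here is a minimal sketch in Python 3 using only the standard library. It is not how rssparser.py works internally; the sample item, the regular expression, and the helper names `parse_strict` and `parse_lenient` are assumptions made up for illustration. It shows a conforming XML parser rejecting the item, and one possible tolerant workaround that escapes bare ampersands before parsing.

```python
import re
import xml.etree.ElementTree as ET

# A deliberately broken item: the ampersand in the title is not escaped.
BROKEN_ITEM = (
    "<item><title>Tips & tricks</title>"
    "<link>http://example.com/1</link></item>"
)

# Matches an ampersand that does not start an entity or character reference.
BARE_AMPERSAND = re.compile(r"&(?!#?\w+;)")


def parse_strict(xml_text):
    """Parse with a conforming XML parser; raises ET.ParseError on bad input."""
    return ET.fromstring(xml_text)


def parse_lenient(xml_text):
    """Escape bare ampersands first, then parse (hypothetical helper)."""
    return ET.fromstring(BARE_AMPERSAND.sub("&amp;", xml_text))


try:
    parse_strict(BROKEN_ITEM)
    strict_ok = True
except ET.ParseError:
    strict_ok = False

title = parse_lenient(BROKEN_ITEM).findtext("title")
print("strict parser accepted the item:", strict_ok)
print("lenient title:", title)
```

A real tolerant parser has to cope with many more defects than this one (invalid entities, broken HTML, mixed element vocabularies), so the sketch only illustrates the general idea of repairing input before parsing it.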
Installing rssparser.py is also very simple. Download the Python file (see the reference documentation), rename "rssparser.py.txt" to "rssparser.py", and copy it to a directory on your PYTHONPATH. I also recommend that you obtain the optional timeoutsocket module, which improves the timeout behavior of socket operations in Python, so that fetching an RSS feed does not stall the application thread when something goes wrong.
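The timeout concern can be sketched without the timeoutsocket module. The example below is an assumption-laden stand-in, not the timeoutsocket API: it uses the standard library's `socket.setdefaulttimeout` in Python 3, and the 10-second value is an arbitrary choice for illustration.

```python
import socket

# Assumption: a 10-second limit is reasonable for fetching a feed.
FEED_TIMEOUT_SECONDS = 10.0

# Applies to every socket created after this call, including those
# opened internally by urllib, so a stalled server cannot block forever.
socket.setdefaulttimeout(FEED_TIMEOUT_SECONDS)

print("default socket timeout:", socket.getdefaulttimeout())
```

Because the setting is process-wide, it is usually applied once at startup, before any feed is requested.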
    import rssparser

    # Parse the data; returns a tuple: (data for channels, data for items)
    channel, items = rssparser.parse("http://www.python.org/channews.rdf")
    for item in items:
        # Each item is a dictionary mapping properties to values
        print "RSS Item:", item.get('link', "(none)")
        print "Title:", item.get('title', "(none)")
        print "Description:", item.get('description', "(none)")
As you can see, this code is very simple. RSS.py and rssparser.py cannot replace each other: the former has more functional components and maintains more syntax information from the RSS feed, while the latter is simpler and is a more fault-tolerant parser (the RSS.py parser can only accept well-formed XML).
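For comparison, the following Python 3 sketch builds the same kind of result, a tuple of channel data and item dictionaries, with a strict standard-library parser. It is neither RSS.py nor rssparser.py; the sample feed and the `parse_feed` helper are hypothetical. Like any strict parser, it raises an error as soon as the input is not well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed feed used only for illustration.
SAMPLE_FEED = """<rss version="0.91">
  <channel>
    <title>Example channel</title>
    <link>http://example.com/</link>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second post</title>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return (channel, items) as plain dictionaries.

    Raises ET.ParseError if xml_text is not well-formed XML.
    """
    channel_el = ET.fromstring(xml_text).find("channel")
    channel = {}
    items = []
    for child in channel_el:
        if child.tag == "item":
            # Each item becomes a dictionary mapping element names to text.
            items.append({field.tag: (field.text or "") for field in child})
        else:
            channel[child.tag] = child.text or ""
    return channel, items


channel, items = parse_feed(SAMPLE_FEED)
for item in items:
    print("RSS Item:", item.get("link", "(none)"))
    print("Title:", item.get("title", "(none)"))
```

Using `item.get(key, "(none)")` matters here because feeds routinely omit elements; the second sample item has no link at all.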
One difference between Python and most other languages (such as C) is how the boundary of a block is defined. In Python it is determined by the position of the first character of each line, that is, by indentation, whereas C uses a pair of curly braces {} to mark the boundary of a block explicitly, regardless of character position.
This has caused controversy. Ever since languages like C appeared, separating the syntactic meaning of a program from the arrangement of its characters has been regarded as progress in programming language design. But it is undeniable that by forcing programmers to indent everywhere a block is needed (such as if, for, and function definitions), Python makes programs clearer and more beautiful.
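A short Python 3 sketch of indentation-delimited blocks follows. The function name `classify` and the sample numbers are made up for illustration; the point is only that the bodies of the function definition, the for loop, and the if statement are each marked by their indentation level rather than by braces.

```python
def classify(numbers):
    """Split numbers into evens and odds."""
    evens, odds = [], []
    for n in numbers:        # the for body is everything indented beneath it
        if n % 2 == 0:       # the if body is indented one level further
            evens.append(n)
        else:
            odds.append(n)
    return evens, odds       # dedenting back one level ends the for block


evens, odds = classify([1, 2, 3, 4, 5])
print(evens, odds)
```

Moving the `return` line one level to the right would place it inside the loop and change the program's behavior, which is exactly the sense in which character position carries syntactic meaning.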
In addition, Python adheres to a clear and uniform style in the design of its other parts, which makes it easy to use and easy to maintain, and has helped it become a popular language used by a large number of users. In some cases, program segments written directly in Python are even reported to run more efficiently than programs written in C.