First understand the structure, then act
To extract RSS information easily, we naturally need to understand its structure first. As the saying goes, knowing what you are dealing with is how you win.
1. The structure of RSS
First, open one of the Baidu News RSS links. If you then open RSS links from a few other websites, you will find that they all share roughly the same structure. As explained in "The Revelation of RSS (Part 1)", RSS is in fact implemented as just such an XML file.
To process this kind of XML document conveniently, this article uses C# as the development language.
After analyzing the whole RSS link, we find that an RSS file has roughly the structure shown in Figure 1.
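To make the structure concrete, here is a minimal sketch that prints the element tree of a feed. The feed text, the class name RssOutline and the method name Outline are made up for illustration. The sample is hand-written rather than copied from Baidu News, so a real feed will usually carry more child nodes.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class RssOutline
{
    // Hand-written sample feed (an assumption for illustration, not a real site's output).
    public const string SampleFeed = @"<rss version=""2.0"">
  <channel>
    <title>Example Channel</title>
    <link>http://example.com/</link>
    <description>Sample feed</description>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
    </item>
  </channel>
</rss>";

    // Returns one line per element, indented two spaces per nesting level.
    public static List<string> Outline(XElement element, int depth = 0)
    {
        var lines = new List<string> { new string(' ', depth * 2) + element.Name.LocalName };
        foreach (XElement child in element.Elements())
        {
            lines.AddRange(Outline(child, depth + 1));
        }
        return lines;
    }

    public static void Main()
    {
        XElement root = XDocument.Parse(SampleFeed).Root;
        foreach (string line in Outline(root))
        {
            Console.WriteLine(line);
        }
    }
}
```

Running this against feeds from several sites is a quick way to confirm that they all follow the same rss, channel, item nesting.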
2. The principle of extraction
Knowing the structure is not enough; we also need to know what each part of it means. In Figure 1, the rss node represents the current RSS file. It consists of a channel node and that node's children, some of which describe the channel itself, such as title, the name of the channel ("Baidu Internet News").
The channel node contains multiple item child nodes, and these item nodes are the part the program needs to process, because each one corresponds to an actual news entry. Each item node gives the details of its news entry through its own child nodes, such as title, the headline of the news ("Microsoft im King"), and link, the actual link to the news.
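The roles of channel and item described above can be sketched in C# roughly as follows. This is only an illustration under assumptions: the sample feed is hand-written, and the names RssReaderSketch, NewsItem, ChannelTitle and Items are hypothetical. It relies on the fact that RSS 2.0 elements carry no XML namespace, so plain element names are enough for lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// One news entry, built from the title and link children of an item node.
public class NewsItem
{
    public string Title;
    public string Link;
}

public static class RssReaderSketch
{
    // Hand-written sample feed (an assumption for illustration).
    public const string SampleFeed = @"<rss version=""2.0"">
  <channel>
    <title>Example Channel</title>
    <link>http://example.com/</link>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>";

    // The channel's own title child holds the name of the channel.
    public static string ChannelTitle(XDocument doc)
    {
        return (string)doc.Root?.Element("channel")?.Element("title") ?? "";
    }

    // Each item node under channel becomes one NewsItem; missing children become "".
    public static List<NewsItem> Items(XDocument doc)
    {
        return doc.Descendants("item")
                  .Select(item => new NewsItem
                  {
                      Title = (string)item.Element("title") ?? "",
                      Link = (string)item.Element("link") ?? ""
                  })
                  .ToList();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(SampleFeed);
        Console.WriteLine("Channel: " + ChannelTitle(doc));
        foreach (NewsItem news in Items(doc))
        {
            Console.WriteLine(news.Title + " -> " + news.Link);
        }
    }
}
```

To read a live feed instead of the sample string, XDocument.Load accepts a URI in place of XDocument.Parse, although error handling and encoding issues would then need attention.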
The full RSS specification can be viewed at http://blogs.law.harvard.edu/tech/rss
Once we know this, the programming is not difficult. All we need to do is extract and display the information under channel and item. Now let us look at the concrete implementation.