A brief introduction to creating XML documents for the Baidu News Open Protocol

Source: Internet
Author: User
Tags cdata

Overview of the Open Protocol
With this open protocol, you will be able to bring more traffic to your site!

The "Internet News Open Protocol" is a news source inclusion standard developed by Baidu News search. A website can publish its news content as an XML page that follows the open protocol (independent of the site's original news pages) for the search engine to index, so that the site actively and promptly informs the Baidu search engine of the news it publishes.

Adopting the "Internet News Open Protocol" is equivalent to having the search engine subscribe to your site's news. Through Baidu, the world's largest Chinese search engine, users can reach your site's news across a wider range and more often, which can bring potential traffic to your site.

The open protocol is very simple, and you can easily use it with our help.

Content of the Open Protocol
An XML page that follows the Internet News Open Protocol lists information about the news published on the website in a standard format.
[Figure: example XML page]
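
As a rough illustration of the layout described in this section, the Python sketch below assembles a small page with the standard library. It assumes that <website>, <webmaster>, and <updateperi> sit directly under <document> and that the remaining tags sit inside each <item>; the text does not spell out the nesting. The function name build_document, the dictionary keys, and the example.com addresses are made up for the example.

```python
import xml.etree.ElementTree as ET


def build_document(website, webmaster, updateperi, news_items):
    """Build a <document> tree from site info and a list of news dicts.

    Assumed nesting: site-level tags directly under <document>,
    per-news tags inside each <item>.
    """
    doc = ET.Element("document")
    ET.SubElement(doc, "website").text = website
    ET.SubElement(doc, "webmaster").text = webmaster
    ET.SubElement(doc, "updateperi").text = str(updateperi)  # minutes
    for news in news_items:
        item = ET.SubElement(doc, "item")
        for tag in ("title", "link", "description", "text"):
            if tag in news:
                ET.SubElement(item, tag).text = news[tag]
        # <image> is required but may be empty; repeat it once per picture.
        for url in news.get("images") or [""]:
            ET.SubElement(item, "image").text = url
        for tag in ("keywords", "category", "author", "source", "pubdate"):
            if tag in news:
                ET.SubElement(item, tag).text = news[tag]
    return doc


# Hypothetical site and news item, for illustration only.
doc = build_document(
    "http://www.example.com",
    "webmaster@example.com",
    15,
    [{
        "title": "Sample headline",
        "link": "http://www.example.com/news/1.html",
        "text": "Full body text of the news item.",
        "images": [],
        "pubdate": "2005-11-09 10:37",
    }],
)
xml_bytes = ET.tostring(doc, encoding="utf-8")
print(xml_bytes.decode("utf-8"))
```

ElementTree escapes special characters in tag content when it serializes, so the text values can be passed in unescaped.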

XML tag description: tags marked with an asterisk are required; tags without an asterisk are optional.
*<document>--marks the beginning and end of the entire XML file content.
*<website>--the site address.
*<webmaster>--the email address of the person responsible for the site. We will contact you at this address when necessary.
*<updateperi>--the update cycle, in minutes. The search engine visits the page on this cycle so that the news on the page appears in Baidu News more promptly.

*<item>--marks the beginning and end of each news item. The tag contains a single news item, excluding news topics.
*<title>--the news headline.
*<link>--the news URL, in one-to-one correspondence with a single news item. If a paginated news article has multiple URLs, each URL counts as a separate news item.
<description>--a summary of the news content.
*<text>--the full news body (body text only, with no other characters such as HTML markup). The purpose of this item is to let the news appear more often and more accurately in search results.
*<image>--the relevant pictures in the news body, given as absolute addresses. If the news has no relevant picture, the tag can be empty; if there is more than one picture, repeat the tag. The purpose of this item is to show the news's relevant pictures in search results.
<keywords>--one or more keywords that reflect the topic of the news, separated by spaces. This item is for reference only; search results do not depend entirely on the content of this tag.
<category>--the news category. You can follow the site's own classification system; top-level categories are preferred.
<author>--the news author, which can be an institution or an individual.
<source>--the news source, i.e. the original media outlet or other institution.
*<pubdate>--the news publication time, consistent with the publication time on the news HTML page. Please be accurate to the minute; if your site does not record hours and minutes, provide the year, month, and day.

Recommended time format: year, month, day, hour, minute, second
For example: 2005-11-09 10:37 | 2005/11/09 10:37:00 | 2005.11.09 10:37:00 |
November 09, 2005 10:37 00 seconds | Fri, Nov 10:37:00 GMT
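
A minimal sketch of producing a <pubdate> value in the first format listed above, using Python's datetime module. The helper name format_pubdate and its has_time flag are hypothetical, and the date-only form YYYY-MM-DD is an assumption for sites that do not record hours and minutes.

```python
from datetime import datetime


def format_pubdate(published, has_time=True):
    """Format a publication time as 'YYYY-MM-DD HH:MM'.

    When the site records no hours and minutes, fall back to the date
    only (the exact date-only layout is an assumption).
    """
    pattern = "%Y-%m-%d %H:%M" if has_time else "%Y-%m-%d"
    return published.strftime(pattern)


print(format_pubdate(datetime(2005, 11, 9, 10, 37)))          # 2005-11-09 10:37
print(format_pubdate(datetime(2005, 11, 9), has_time=False))  # 2005-11-09
```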


Using the Open Protocol
Before you use it, you need to know the following points:

Whether your website is already a Baidu News source or has not yet been included in Baidu News search, you can use this open protocol.
The content you provide through the open protocol must fully comply with the "news source inclusion standard" below.
The Internet News Open Protocol only assists and supplements the original way of including news sources; it does not replace it completely.
News source inclusion standard:
Baidu hopes to diversify its news sources and encourages original news content. A legitimate media site that has a large amount of valuable news content, updates it in a timely manner, and runs on a stable, fast web server meets the basic principles for a Baidu news source.
Baidu News search includes Chinese-language information written or edited by professionals: news reports and media commentary on politics, entertainment, sports, finance, science, education and culture, social life, and so on; market information and reviews for digital products, real estate, automobiles, and so on; industry trends and market conditions; and the work updates of organizations. It does not include personal information, forums, blogs, advertisements, jokes, emotional stories, photos, stills, celebrity profiles, recipes, downloads, multimedia, and other such types of Internet information, or information in other languages.
You are responsible for all the content you provide and must ensure that it is authentic, lawful, and does not infringe the interests of any third party.


Here we go!
Step one: Create an XML file
Before you create the XML file, please be sure to read the Baidu News search news source inclusion standard, and pay special attention to the following:

1. News source websites included in Baidu News search must strictly comply with the national "Internet News Information Service Management Regulations" and must respect the copyright of the creators and the source sites when publishing and reprinting news.

2. Types of sites that are not suitable for inclusion in Baidu News search: forums, blogs, corporate websites, and so on.

3. Baidu News search does not include personal information, advertisements, tenders, tutorials, jokes, emotional stories, photos, stills, celebrity profiles, recipes, downloads, multimedia, and other such types of Internet information, or information in other languages.

4. Baidu News search aims to include high-quality Chinese news; English and other non-Chinese news is not included.

5. Please create the XML file according to the content of the open protocol described above.


Other notes:


The supported encodings are GB2312, GB18030, UTF-8, and BIG5; GB18030 or UTF-8 is recommended.
You can put all of a site's news for a given time period in one XML file, or split it across multiple XML files by channel or column.
Keep each XML file automatically and continuously updated according to the update cycle. You can adjust the update cycle at any time as needed.
Each XML file holds at most the latest 100 news items; older news does not need to be kept.
Please sort the published news by time, with the latest news at the top; otherwise news may be missed.
Apart from plain text, XML tag content must not contain any other code, and the special characters in the following table must be converted to XML-defined escape sequences. Otherwise, errors will prevent the search engine from reading the news on the page.

Character            Escape sequence    HTML character code
Ampersand (&)        &amp;              &#38;
Single quote (')     &apos;             &#39;
Double quote (")     &quot;             &#34;
Greater-than (>)     &gt;               &#62;
Less-than (<)        &lt;               &#60;

The "&" inside an escape sequence does not need to be escaped again.
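
One way to apply these escapes, sketched with Python's standard xml.sax.saxutils.escape. That function handles &, <, and > by default; the two quote characters are passed in as extra entities. The wrapper name escape_text is hypothetical. It should be applied once, to raw text only, since running it over text that is already escaped would escape the "&" a second time.

```python
from xml.sax.saxutils import escape


def escape_text(raw):
    """Escape &, <, > and both quote characters for use as XML tag content."""
    return escape(raw, {"'": "&apos;", '"': "&quot;"})


print(escape_text('Q&A: is 3 < 5 "true"?'))
# Q&amp;A: is 3 &lt; 5 &quot;true&quot;?
```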
 


We recommend that you use CDATA sections. A CDATA section begins with "<![CDATA[" and ends with "]]>". Placing text that contains code or special characters inside a CDATA section removes the need to escape the special characters.
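
A small sketch of wrapping text in a CDATA section by plain string building (Python's ElementTree has no direct CDATA writer). The helper name cdata is hypothetical. The one sequence that cannot appear inside a CDATA section is "]]>", so the sketch splits it across two adjacent sections.

```python
import xml.etree.ElementTree as ET


def cdata(text):
    """Wrap text in a CDATA section.

    Any ']]>' in the text is split across two sections so that it
    cannot end the section early.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


print(cdata("<b>Tom & Jerry</b>"))
# <![CDATA[<b>Tom & Jerry</b>]]>

# Round trip: the parser hands back the original, unescaped text.
element = ET.fromstring("<text>" + cdata("a ]]> b & <c>") + "</text>")
print(element.text)
# a ]]> b & <c>
```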
Step two: Validate the XML file
The following addresses provide a variety of tools to help you verify the structure of your XML file:
http://www.w3.org/XML/Schema
http://www.xml.com/pub/a/2000/12/13/schematools.html
A validated XML file lets you provide more standardized information and ensures that the news you publish is not overlooked by the search engine.
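
Before turning to those tools, a quick local check can catch the most obvious problems. The Python sketch below is not a schema validator; it only tests well-formedness, the presence of the tags marked as required in the tag description, and the limit of 100 news items per file. The function name check_feed and the assumed nesting (site-level tags directly under <document>, the rest inside <item>) are not from the protocol text.

```python
import xml.etree.ElementTree as ET

REQUIRED_TOP = ("website", "webmaster", "updateperi")
REQUIRED_ITEM = ("title", "link", "text", "image", "pubdate")
MAX_ITEMS = 100  # each XML file holds at most the latest 100 news items


def check_feed(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    problems = []
    if root.tag != "document":
        problems.append("root element is not <document>")
    for tag in REQUIRED_TOP:
        if root.find(tag) is None:
            problems.append(f"missing <{tag}>")
    items = root.findall("item")
    if len(items) > MAX_ITEMS:
        problems.append(f"{len(items)} items, more than {MAX_ITEMS}")
    for number, item in enumerate(items, start=1):
        for tag in REQUIRED_ITEM:
            if item.find(tag) is None:
                problems.append(f"item {number}: missing <{tag}>")
    return problems


# Deliberately incomplete sample to show the kind of report produced.
sample = (
    "<document><website>http://www.example.com</website>"
    "<item><title>t</title></item></document>"
)
for problem in check_feed(sample):
    print(problem)
```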

Step three: Submit an XML URL
Before submitting, upload the XML file to your web server, then enter the URL of the XML file and the other information in the corresponding boxes below. The search engine will then access that URL; if the URL changes, it needs to be resubmitted.

If your site meets the news source inclusion standard, Baidu News search will test and observe the data you submitted for one week. If the XML file was created according to the requirements of the Internet News Open Protocol but still has a problem, we will contact you at the email address provided in the XML page.


Note:

1. We will review the submitted XML file; Baidu News search does not guarantee that all the submitted content will be included.

2. The site name and address are required. The same site can submit at most 5 different XML file addresses per day.

3. After submitting the address, check the message in the pop-up window to confirm that the submission succeeded.

Step four: Query XML file status
You can enter the address of your submitted XML file in the box below to check the processing progress and feedback for the file.
Note: The address you enter must be complete, that is, exactly the same as the address you submitted.
