DotText Source Code Reading (7): Pingback/TrackBack

Source: Internet
Author: User
What sets a blog service apart from a forum or a so-called content-aggregation website is largely pingback/trackback, which lets self-published media such as blogs extend into a social network (SNS). To analyze a blog program, therefore, we need to understand these protocols and how they are implemented. In the DotText source code, pingback is supported when a post is published, and trackback is implemented as part of the web services. As for what the pingback/trackback protocols are, a Google search will find them, so I will not spend words on them here.

The following handler mapping lets us access the admin directory of each blog:

<HttpHandler pattern="/(?:admin)" type="Dottext.Web.UI.Handlers.BlogExistingPageHandler, Dottext.Web" handlerType="Factory" />

The request is URL-rewritten to the corresponding aspx file under the DottextWeb\admin directory (see the previous part). When a post is submitted, the call chain starts here:

private void UpdatePost()
{
    if (Page.IsValid)
    {
        string successMessage = Constants.RES_SUCCESSNEW;
        try
        {
            Entry entry = new Entry(EntryType);
            entry.Title = txbTitle.Text;
            entry.Body = Globals.StripRTB(FtbBody.Text, Request.Url.Host);
            ...
            entry.BlogID = Config.CurrentBlog(Context).BlogID;
            if (PostID > 0)
            {
                // Update operation
                successMessage = Constants.RES_SUCCESSEDIT;
                entry.DateUpdated = DateTime.Now; // BlogTime.CurrentBloggerTime;
                entry.EntryID = PostID;
                ...
                Entries.Update(entry);
                ...
            }
            else
            {
                // Create operation
                entry.DateCreated = DateTime.Now; // BlogTime.CurrentBloggerTime;
                PostID = Entries.Create(entry);
            }
            ...
        }
        catch (Exception ex)
        {
            ...
        }
        finally
        {
            ...
        }
    }
}

Entries.Create(entry) ends up in this overload:

public static int Create(Entry entry, int[] CategoryIDs)
{
    HandlerManager.PreCommit(entry, ProcessAction.Insert);
    int result = DTOProvider.Instance().Create(entry, CategoryIDs);
    if (result > 0)
    {
        HandlerManager.PostCommit(entry, ProcessAction.Insert);
    }
    return result;
}

The actual data storage call goes to DTOProvider, that is, DataDTOProvider, and finally lands in SqlDataProvider, which performs the database operations. What interests us, however, is the HandlerManager.PostCommit(entry, ProcessAction.Insert) call. On closer inspection, HandlerManager is a wrapper class around the processing of an Entry. PreCommit is defined as:

Process(ProcessState.PreCommit, e, pa);

Process, in turn, reads the handlers configured in web.config:

public static void Process(ProcessState ps, Entry e, ProcessAction pa)
{
    // Do we have factories? (Is the factory pattern intended here?)
    EntryHandler[] hanlers = Config.Settings.EntryHandlers; // deserialized from configuration; Config here is Dottext.Framework.Configuration.Config
    if (e != null && hanlers != null)
    {
        // walk the entries: iterate over all configured handlers
        for (int i = 0; i ...
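As an illustration of the pre-commit/post-commit dispatch described above, here is a minimal sketch. It is not the DotText implementation and does not reconstruct the missing part of the listing: IEntryHandlerSketch, LoggingHandler and HandlerManagerSketch are hypothetical names, the enum and Entry types are reduced stand-ins, and the handlers are passed in as a parameter instead of being read from web.config.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins; the real DotText types have more members.
public enum ProcessState { PreCommit, PostCommit }
public enum ProcessAction { Insert, Update }

public class Entry
{
    public string Title { get; set; }
}

// Assumed shape of a handler: it declares the stage and actions it cares about.
public interface IEntryHandlerSketch
{
    ProcessState State { get; }
    bool Handles(ProcessAction action);
    void Run(Entry entry);
}

// Example handler that only records that it was called for an insert.
public class LoggingHandler : IEntryHandlerSketch
{
    private readonly List<string> log;
    public ProcessState State { get; }

    public LoggingHandler(ProcessState state, List<string> log)
    {
        State = state;
        this.log = log;
    }

    public bool Handles(ProcessAction action) { return action == ProcessAction.Insert; }

    public void Run(Entry entry) { log.Add(State + ":" + entry.Title); }
}

public static class HandlerManagerSketch
{
    // Runs every handler whose stage and action match the current call.
    public static void Process(ProcessState ps, Entry e, ProcessAction pa,
                               IList<IEntryHandlerSketch> handlers)
    {
        if (e == null || handlers == null) return;
        foreach (IEntryHandlerSketch handler in handlers)
        {
            if (handler.State == ps && handler.Handles(pa))
            {
                handler.Run(e);
            }
        }
    }
}
```

Calling Process once with ProcessState.PreCommit before the database write and once with ProcessState.PostCommit after it mirrors the two HandlerManager calls in Entries.Create.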

TrackBackPing: the call string pageText = BlogRequest.GetPageText(link, e.Link); downloads the source of the referenced address over HTTP through BlogRequest. Here link is the address of the other blog, and e.Link is sent as the referrer to tell the recipient which page references link. After safe decoding, we have the source of link. TrackBackPing then searches that source with string sPattern = @"<rdf:\w+\s[^>]*?>(</rdf:rdf>)?"; and extracts the trackback ping address from the matched part. The next step is SendPing(string trackBackItem, string parameters), which POSTs application/x-www-form-urlencoded data to the target address. This completes one trackback.
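The discovery step can be sketched as follows. This is a minimal illustration, not the DotText code: TrackBackDiscoverySketch and FindPingUrl are hypothetical names, the first pattern is the one quoted above, and the second pattern assumes the trackback:ping attribute is written in double quotes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrackBackDiscoverySketch
{
    // Pattern quoted in the text: matches any opening <rdf:...> tag.
    private const string RdfTagPattern = @"<rdf:\w+\s[^>]*?>(</rdf:rdf>)?";

    // Assumption: the ping URL appears as trackback:ping="..." in double quotes.
    private const string PingPattern = "trackback:ping=\"([^\"]+)\"";

    // Returns the first trackback ping URL found in pageText, or null if none.
    public static string FindPingUrl(string pageText)
    {
        if (string.IsNullOrEmpty(pageText)) return null;
        foreach (Match tag in Regex.Matches(pageText, RdfTagPattern, RegexOptions.IgnoreCase))
        {
            Match ping = Regex.Match(tag.Value, PingPattern, RegexOptions.IgnoreCase);
            if (ping.Success) return ping.Groups[1].Value;
        }
        return null;
    }
}
```

A page that embeds no RDF block yields null, which is the case where automatic discovery cannot produce a ping address.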

The other EntryHandlers are likewise divided into synchronous and asynchronous ones; you can read them in the same part of the source. Aside: so-called blogs that do not even have the courtesy to implement pingback/trackback should not call themselves blog service providers (BSPs).

In CNBlogsDottext10Beta2 the trackback feature is disabled, probably because many people, after a successful installation, hit the following error when submitting a post that contains a reference link:

String or binary data would be truncated.

In fact, the cause lies in the key method for sending a trackback, SendPing(string trackBackItem, string parameters), which sends the byte stream using the length of the ASCII encoding. When parameters contains Chinese characters, this goes wrong. The fix is to encode the data as UTF-8 before sending. Below is my modified code:

// Requires: using System; using System.IO; using System.Net; using System.Text;
private void SendPing(string trackBackItem, string parameters)
{
    HttpWebRequest request = BlogRequest.CreateRequest(trackBackItem);
    request.Method = "POST";
    request.ContentType = "application/x-www-form-urlencoded";
    request.KeepAlive = false;

    // Encode as UTF-8 so that ContentLength matches the bytes actually written
    byte[] buff = Encoding.GetEncoding("UTF-8").GetBytes(parameters);
    request.ContentLength = buff.Length;

    Stream reqStream = null;
    try
    {
        reqStream = request.GetRequestStream();
        reqStream.Write(buff, 0, buff.Length);
    }
    catch (Exception e)
    {
        Logger.LogManager.CreateExceptionLog(e, "SendPing Exception");
    }
    finally
    {
        // GetRequestStream may have thrown, leaving reqStream null
        if (reqStream != null)
        {
            reqStream.Close();
        }
    }
}
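To see why the encoding matters, the following minimal sketch compares the character count of a string with its UTF-8 byte count, and shows what ASCII encoding does to Chinese text. ContentLengthSketch and its method names are hypothetical and only illustrate the mismatch; they are not part of DotText.

```csharp
using System;
using System.Text;

public static class ContentLengthSketch
{
    // Number of bytes the body occupies on the wire when sent as UTF-8.
    public static int Utf8ByteCount(string body)
    {
        return Encoding.UTF8.GetByteCount(body);
    }

    // What ASCII encoding does to the same text: characters it cannot
    // represent are replaced with '?'.
    public static string RoundTripAscii(string body)
    {
        return Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(body));
    }
}
```

For the string "title=\u535A\u5BA2" (the word for "blog" in Chinese after the equals sign), the Length property is 8 while Utf8ByteCount returns 12, and RoundTripAscii returns "title=??". A Content-Length taken from the character count therefore does not match a UTF-8 body, and an ASCII body loses the Chinese text.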

First, let's look at how a trackback is sent. The entry point is Dottext.Framework.EntryHandling.Process.

  • Check whether the article contains links to remote web pages.
  • Download the HTML of each remote link. If no HTML is obtained, the URL is invalid.
  • Check whether the obtained HTML already contains a link to this article. If it does, the link has already been pinged.
  • Extract the trackback link from the HTML as defined by the trackback standard (the link is embedded as RDF inside a comment in the XHTML). This converts a web page URL into a trackback link.
  • Send (ping) the trackback.
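The last step posts a form-encoded body. As a minimal sketch of how such a body could be built, the code below uses the four fields defined by the TrackBack specification (url, title, excerpt, blog_name). TrackBackParametersSketch is a hypothetical helper, not the DotText code, and percent-encoding with Uri.EscapeDataString is my assumption about how to keep the body safe for non-ASCII text.

```csharp
using System;
using System.Collections.Generic;

public static class TrackBackParametersSketch
{
    // Builds an application/x-www-form-urlencoded body from the four
    // fields defined by the TrackBack specification. Empty fields are skipped.
    public static string Build(string url, string title, string excerpt, string blogName)
    {
        var pairs = new List<string>();
        Add(pairs, "url", url);            // the only field the specification requires
        Add(pairs, "title", title);
        Add(pairs, "excerpt", excerpt);
        Add(pairs, "blog_name", blogName);
        return string.Join("&", pairs.ToArray());
    }

    private static void Add(List<string> pairs, string name, string value)
    {
        if (string.IsNullOrEmpty(value)) return;
        // Percent-encoding keeps the body pure ASCII even for Chinese text.
        pairs.Add(name + "=" + Uri.EscapeDataString(value));
    }
}
```

Because the percent-encoded body contains only ASCII characters, its character count and byte count agree, which sidesteps the length mismatch that raw Chinese text causes.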

    Now let's look at how a trackback is received. The entry point is Dottext.Framework.Tracking.TrackBackHandler.ProcessRequest.

  • Obtain the ID of the local article from the pinged trackback link. If no ID is obtained, the link is invalid.
  • Check that the request method is POST; if it is not, return. The trackback standard requires POST.
  • Load the data from the database by ID and build an Entry object.
  • Download the HTML of the remote page from the submitted URL. If the HTML does not contain a link to the local article, the ping is not legitimate, so return.
  • Parse the title of the remote page from the HTML. If there is none, return.
  • Create a new Entry object and assign its properties.
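The checks in the steps above can be sketched on an already-downloaded page as follows. This is a minimal illustration under stated assumptions: TrackBackReceiveSketch, PingResult and the title pattern are hypothetical, the ID lookup and database access are left out, and the response format follows the TrackBack specification, where an error value of 0 means success.

```csharp
using System;
using System.Text.RegularExpressions;

public enum PingResult { Accepted, NotPost, NoBacklink, NoTitle }

public static class TrackBackReceiveSketch
{
    // Applies the POST, backlink and title checks to an already-downloaded page.
    public static PingResult Validate(string httpMethod, string remoteHtml,
                                      string localUrl, out string title)
    {
        title = null;
        if (!string.Equals(httpMethod, "POST", StringComparison.OrdinalIgnoreCase))
            return PingResult.NotPost;

        // The remote page must actually link to the local article.
        if (string.IsNullOrEmpty(remoteHtml) ||
            remoteHtml.IndexOf(localUrl, StringComparison.OrdinalIgnoreCase) < 0)
            return PingResult.NoBacklink;

        // Assumed title pattern; Singleline lets the title span several lines.
        Match m = Regex.Match(remoteHtml, @"<title[^>]*>(.*?)</title>",
                              RegexOptions.IgnoreCase | RegexOptions.Singleline);
        if (!m.Success || m.Groups[1].Value.Trim().Length == 0)
            return PingResult.NoTitle;

        title = m.Groups[1].Value.Trim();
        return PingResult.Accepted;
    }

    // Response body in the format of the TrackBack specification.
    public static string ResponseXml(PingResult result)
    {
        const string header = "<?xml version=\"1.0\" encoding=\"utf-8\"?>";
        if (result == PingResult.Accepted)
            return header + "<response><error>0</error></response>";
        return header + "<response><error>1</error><message>" + result
               + "</message></response>";
    }
}
```

Keeping the checks free of network and database calls makes them easy to exercise with fixed strings; the download of the remote page would happen before Validate is called.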

    From this we can see that DotText sends trackbacks rather inefficiently. The reason is that it has to download the remote HTML, which is very time-consuming, not to mention extracting the trackback link from a very large HTML document.

    In addition, there is no blocking mechanism on the receiving side, so advertising attacks, that is, comment spam, cannot be prevented.

    To solve these problems, I would change the mechanism for sending trackbacks as follows:

  • The trackback link is no longer discovered automatically according to the trackback standard. That is not only very inefficient, but many websites do not support the standard (for example, www.blogchinese.com shows the trackback address directly instead of hiding it in the page markup, so the automatic trackback fails). Instead, we assume that the user enters a valid trackback link and send to it directly.
  • So that users have a valid trackback address to enter, the trackback link of each article is displayed after its content.
  • Provide a separate page where entering a web page link displays the trackback link of that page, to keep supporting websites that comply with the standard.

    When receiving a trackback, we make the following changes:

  • Look up the URL of the other party in the database to check whether it has already pinged. Because this is done locally, it is very fast.
  • Create a BLACKIP table in the database and check the IP address of the sender against it. This makes it possible to block the IP address of an attacker.
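The two proposed checks can be sketched as follows. Since the idea was not implemented, this is only an illustration under my own assumptions: TrackBackFilterSketch is a hypothetical class, and in-memory sets stand in for the BLACKIP table and for the stored trackbacks that a real version would query in the database.

```csharp
using System;
using System.Collections.Generic;

public class TrackBackFilterSketch
{
    // In-memory stand-ins for the proposed BLACKIP table and for the
    // trackbacks already stored; a real version would query the database.
    private readonly HashSet<string> blackIps = new HashSet<string>();
    private readonly HashSet<string> seenPings =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    public void BlockIp(string ip)
    {
        blackIps.Add(ip);
    }

    // Returns true if the ping should be stored, false if it is rejected.
    public bool TryAccept(string senderIp, int entryId, string sourceUrl)
    {
        // Reject senders on the blacklist before doing any other work.
        if (blackIps.Contains(senderIp)) return false;

        // One ping per (local article, remote URL) pair; Add returns
        // false when the pair has been seen before.
        string key = entryId + "|" + sourceUrl;
        return seenPings.Add(key);
    }
}
```

Both checks are local lookups, so they cost far less than downloading the remote page, and they can run before any download is attempted.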

    The above is just my idea; for lack of time, I have not implemented it. If you have better suggestions, let's discuss them.
