How to make search engines crawl AJAX content?

Last Update:2014-12-24 Source: Internet

Author: User

Author: Ruan Yifeng

More and more sites are adopting the "single-page application" model.

The entire site consists of a single web page that uses Ajax to load different content based on user input.

The benefits of this approach are a good user experience and reduced traffic; the disadvantage is that search engines cannot crawl the AJAX content. For example, suppose you have a website.

http://example.com

Users see different content through URLs that carry a pound sign (#).

http://example.com#1

http://example.com#2

http://example.com#3

However, search engines crawl only example.com. They ignore the pound sign and everything after it, so that content is not indexed.
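The reason the three addresses above look identical to a crawler is that the part after the pound sign (the fragment) is never sent to the server in the HTTP request. A minimal sketch using the standard `URL` class illustrates this; the helper names `requestTarget` and `fragmentOf` are made up for this example.

```javascript
// Returns the part of an address that is actually sent to the server:
// the path plus the query string. The fragment is excluded.
function requestTarget(address) {
  const url = new URL(address);
  return url.pathname + url.search;
}

// Returns the fragment, which stays in the browser.
function fragmentOf(address) {
  return new URL(address).hash;
}

const addresses = [
  'http://example.com#1',
  'http://example.com#2',
  'http://example.com#3',
];

// All three addresses produce the same request target.
const targets = addresses.map(requestTarget);
const fragments = addresses.map(fragmentOf);
console.log(targets, fragments);
```

Since every address maps to the same request, the server (and a crawler that does not run JavaScript) has no way to tell the three pages apart.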

To solve this problem, Google proposed the "pound sign + exclamation point" structure.

http://example.com#!1

When Google finds a URL like the one above, it automatically crawls another URL:

http://example.com/?_escaped_fragment_=1

As long as you serve the AJAX content at that URL, Google will index it. The problem is that "pound sign + exclamation point" is ugly and cumbersome. Twitter once used this structure, changing

http://twitter.com/ruanyf

into

http://twitter.com/#!/ruanyf

As a result, users complained repeatedly, and Twitter abandoned the structure after only six months.
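The mapping from a "pound sign + exclamation point" URL to the `_escaped_fragment_` URL that Google crawls can be sketched as follows. The function name `toEscapedFragmentUrl` is hypothetical, and the use of `encodeURIComponent` is an assumption: the article does not cover the exact escaping rules of Google's scheme.

```javascript
// Convert a "#!" URL into the URL a crawler would request instead.
// Returns null when the address does not use the "#!" structure.
function toEscapedFragmentUrl(address) {
  const url = new URL(address);
  if (!url.hash.startsWith('#!')) {
    return null;
  }
  const fragment = url.hash.slice(2); // drop the leading "#!"
  // Append to an existing query string if there is one.
  const separator = url.search ? '&' : '?';
  return url.origin + url.pathname + url.search + separator +
    '_escaped_fragment_=' + encodeURIComponent(fragment); // escaping is an assumption
}

console.log(toEscapedFragmentUrl('http://example.com#!1'));
console.log(toEscapedFragmentUrl('http://example.com#2'));
```

The first call yields the `_escaped_fragment_=1` address shown earlier; the second returns null because a plain pound sign URL is not part of the scheme.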

So, is there a way to let search engines crawl AJAX content while keeping more straightforward URLs?

I had always thought it could not be done, until two days ago I saw the solution of Robin Ward, one of the founding members of Discourse.

Discourse is a forum program that relies heavily on Ajax, yet it must also be indexed by Google. Its solution is to give up the pound sign structure and use the History API instead.

The so-called History API changes the URL shown in the browser's address bar without refreshing the page (to be exact, it changes the current state of the page). As an example, suppose you click a button on a page to start playing music and then click a link on the same page.

The URL in the address bar changes, but the music keeps playing without interruption!

The details of the History API are beyond the scope of this article. Put simply, its role is to add a record to the browser's History object.

window.history.pushState(stateObject, title, url);

This line makes a new URL appear in the address bar. The pushState method of the History object takes three arguments; the new URL is the third, and the first two can both be null.

window.history.pushState(null, null, newURL);

At present, all major browsers support this method: Chrome (26.0+), Firefox (20.0+), IE (10.0+), Safari (5.1+), Opera (12.1+).
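For browsers older than the versions listed, one possible approach is to detect pushState and fall back to an ordinary page load. The sketch below is an assumption, not part of the original method; `navigate` and its `loadContent` callback are hypothetical names, and the window object is passed in as `win` so the function can also be tried outside a browser.

```javascript
// Navigate to url without a refresh when pushState exists,
// otherwise fall back to a normal full page load.
// Returns true when the History API was used.
function navigate(win, url, loadContent) {
  const supported = Boolean(
    win.history && typeof win.history.pushState === 'function'
  );
  if (supported) {
    win.history.pushState(null, '', url); // update the address bar
    loadContent(url);                     // fetch and show the new content
  } else {
    win.location.assign(url);             // the browser reloads the page
  }
  return supported;
}
```

In a browser this would be called as `navigate(window, '/2', anchorClick)`, reusing whatever function loads the content for a URL.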

Here is how Robin Ward's method works.

First, replace the pound sign structure with the History API, so that each pound sign URL becomes a normal path URL and the search engine crawls every page.

example.com/1

example.com/2

example.com/3
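Old links that still use the pound sign structure can be translated into these path URLs. The following is only a sketch of that translation; the function name `hashToPath` is hypothetical, and the article itself does not describe how existing pound sign links should be handled.

```javascript
// Turn a pound sign URL (with or without "!") into a normal path URL.
// 'http://example.com#1' becomes 'http://example.com/1'.
function hashToPath(address) {
  const url = new URL(address);
  // Strip the leading "#", an optional "!", and an optional "/".
  const id = url.hash.replace(/^#!?\/?/, '');
  if (id === '') {
    return url.href; // nothing to convert
  }
  url.hash = '';
  url.pathname = '/' + id;
  return url.href;
}

console.log(hashToPath('http://example.com#1'));
console.log(hashToPath('http://twitter.com/#!/ruanyf'));
```

In a browser, one option would be to call `history.replaceState(null, '', hashToPath(location.href))` on page load when `location.hash` is not empty, so that old bookmarks end up on the new URLs.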

Then define a JavaScript function that handles the Ajax part and fetches content based on the URL (assuming jQuery is used).

function anchorClick(link) {
  // Use the last path segment of the link as the content id.
  var linkSplit = link.split('/').pop();
  // Request the content and insert it into the #content element.
  $.get('api/' + linkSplit, function (data) {
    $('#content').html(data);
  });
}
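The same function can be written without jQuery. The sketch below uses the standard `fetch` function; `apiUrlFor` and `loadContent` are hypothetical names, the `api/` prefix is taken from the code above, and the fetch function is passed as a parameter only so that the example can be tried without a real server.

```javascript
// Build the API URL for a link using the same rule as anchorClick:
// the last path segment becomes the content id.
function apiUrlFor(link) {
  const lastSegment = link.split('/').pop();
  return 'api/' + lastSegment;
}

// Fetch the content for a link and pass the HTML text to render.
// fetchFn defaults to the global fetch when no third argument is given.
async function loadContent(link, render, fetchFn = fetch) {
  const response = await fetchFn(apiUrlFor(link));
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  render(await response.text());
}
```

In a browser, the render callback could be `function (html) { document.getElementById('content').innerHTML = html; }`, which mirrors the jQuery version.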

Then define the mouse click event.

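$('#container').on('click', 'a', function (e) {
  // Show the link's URL in the address bar without reloading the page.
  window.history.pushState(null, null, $(this).attr('href'));
  // Load the matching content through Ajax.
  anchorClick($(this).attr('href'));
  // Stop the browser from following the link.
  e.preventDefault();
});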

You also need to handle the case where the user clicks the browser's "forward/back" buttons. This triggers the popstate event on the window object.

window.addEventListener('popstate', function (e) {
  // Reload the content that matches the URL now in the address bar.
  anchorClick(location.pathname);
});
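To see how pushState and the popstate event fit together, here is a small stand-in model of the browser's history list that runs outside a browser. `FakeHistory` is a hypothetical class written for this illustration, not a browser API, and it fires its listeners synchronously, which is a simplification. It reflects two facts about the real API: pushState adds a record without firing popstate, and moving back or forward fires popstate.

```javascript
class FakeHistory {
  constructor(initialUrl) {
    this.entries = [{ state: null, url: initialUrl }];
    this.index = 0;
    this.listeners = [];
  }

  get currentUrl() {
    return this.entries[this.index].url;
  }

  onPopState(listener) {
    this.listeners.push(listener);
  }

  // Adds a record after the current one; no popstate is fired.
  pushState(state, title, url) {
    this.entries = this.entries.slice(0, this.index + 1); // drop forward records
    this.entries.push({ state: state, url: url });
    this.index += 1;
  }

  // Moves through the records and notifies popstate listeners.
  go(delta) {
    const target = this.index + delta;
    if (target < 0 || target >= this.entries.length) {
      return;
    }
    this.index = target;
    const entry = this.entries[target];
    this.listeners.forEach(function (listener) {
      listener({ state: entry.state }, entry.url);
    });
  }

  back() { this.go(-1); }
  forward() { this.go(1); }
}

const demo = new FakeHistory('/1');
const reloaded = [];
demo.onPopState(function (event, url) { reloaded.push(url); });

demo.pushState(null, '', '/2'); // like clicking a link: no popstate
demo.pushState(null, '', '/3');
demo.back();                    // popstate fires for /2
demo.forward();                 // popstate fires for /3
console.log(reloaded, demo.currentUrl);
```

This is why the click handler has to call anchorClick itself, while the popstate listener covers only the forward and back buttons.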

With the three pieces of code above in place, the page shows normal path URLs and loads AJAX content without refreshing.

Finally, configure the server side.

Because the pound sign structure is no longer used, each URL is a separate request to the server. The server must therefore answer every one of these requests with a web page of the same structure, so that none of them produces a 404 error.

The key part of that page is a noscript tag; this is where the secret lies.

We put all the content that the search engine should index inside the noscript tag. This way, users can still perform AJAX operations without refreshing the page, while the search engine indexes the main content of each page!
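The page markup itself is not reproduced here, so the following is only a rough sketch of what the server side could look like in Node.js. The `container` and `content` ids come from the client code earlier; the `pages` store, its sample text, and the names `renderPage` and `handleRequest` are hypothetical.

```javascript
// Hypothetical content store; a real site would read from a database.
const pages = new Map([
  ['1', 'First page text'],
  ['2', 'Second page text'],
  ['3', 'Third page text'],
]);

// Escape text so it can be placed safely inside HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build the same page skeleton for any path, with the content for
// that path placed inside the noscript tag.
function renderPage(path) {
  const id = path.split('/').pop();
  const content = pages.has(id) ? pages.get(id) : '';
  return [
    '<html>',
    '  <body>',
    '    <section id="container"><div id="content"></div></section>',
    '    <noscript>' + escapeHtml(content) + '</noscript>',
    '  </body>',
    '</html>',
  ].join('\n');
}

// Request handler: every URL gets a 200 response, never a 404.
function handleRequest(req, res) {
  const path = req.url.split('?')[0];
  res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
  res.end(renderPage(path));
}
```

The handler has the signature expected by Node's built-in http module, so it could be started with `require('http').createServer(handleRequest).listen(8080)`; the port number is arbitrary.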

Source: http://www.anyifeng.com/blog/2013/07/how_to_make_search_engines_find_ajax_content.html
