JavaScriptMVC Tutorial 12: Ajax Apps That Google Can Crawl and Search (Searchable Ajax Apps)

Last updated: 2018-12-07. Source: Internet

Uploaded by: User


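In this tutorial we build a small plugin that listens for changes to the hash in the browser's address bar and updates the page content in response. We also show how to make an Ajax application crawlable and searchable by Google. More information is available at https://developers.google.com/webmasters/ajax-crawling/?hl=zh-CN.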

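Installation

Download and install the latest version of JavaScriptMVC. Installing just means extracting the archive into a directory and then creating an IIS website or virtual directory for that directory. Open a command line, change to the JavaScriptMVC root directory, and create the application with the following command: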

js jquery/generate/app ajaxy

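The result on my machine: [screenshot not included]

Code

Inside the generated ajaxy folder you will find ajaxy.html and ajaxy.js. We add the following code to ajaxy.html so that ajaxy.js has finished loading by the time a link on the page is clicked.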

<!DOCTYPE HTML>
<html lang="en">
    <head>
        <title>Ajaxy</title>
        <meta name="fragment" content="!">
    </head>
    <body>
        <a href='#!videos'>Videos</a>
        <a href='#!articles'>Articles</a>
        <a href='#!images'>Images</a>
        <div id='content'></div>
        <script type='text/javascript'
                src='../steal/steal.js?ajaxy,development'>
        </script>
    </body>
</html>

Note that the page contains the <meta name="fragment" content="!"> tag, which tells Google to treat ajaxy.html as a page with Ajax content. Next, add the files that the page's links point to:

ajaxy/fixtures/articles.html

<h1>Articles</h1><p>Some articles.</p>

ajaxy/fixtures/images.html

<h1>Images</h1><p>Some images.</p>

ajaxy/fixtures/videos.html

<h1>Videos</h1><p>Some videos.</p>

Here is the content of ajaxy.js:

steal('jquery/controller',
      'jquery/event/hashchange',
      'steal/html', function(){

$.Controller('Ajaxy', {
    init : function(){
        // load the initial content when the controller is created
        this.updateContent();
    },
    "{window} hashchange" : function(){
        this.updateContent();
    },
    updateContent : function(){
        var hash = window.location.hash.substr(2),
            url = "fixtures/" + (hash || "videos") + ".html";

        // postpone the snapshot until the content has arrived
        steal.html.wait();

        $.get(url, {}, this.callback('replaceContent'), "text");
    },
    replaceContent : function(html){
        this.element.html(html);

        // the page is now ready to be scraped
        steal.html.ready();
    }
});

$('#content').ajaxy();

});

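When the hashchange event fires ("{window} hashchange"), Ajaxy uses the value of window.location.hash to request content from the fixtures folder. Once the content has been fetched, it replaces the element's HTML (this.element.html(...)). Ajaxy also calls updateContent when the page initializes, so content is loaded on the first view as well.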
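To make the hash-to-URL mapping easier to follow, here is a minimal sketch of that step pulled out into a plain function. The name fixtureUrl and its defaultPage parameter are hypothetical and not part of JavaScriptMVC. Unlike the substr(2) call in ajaxy.js, this version strips the prefix only when the hash really starts with "#!", which is an assumption about the desired behavior.

```javascript
// Hypothetical helper, not part of JavaScriptMVC: maps a location hash
// such as "#!articles" to the fixture file that holds its content.
function fixtureUrl(hash, defaultPage) {
    var prefix = "#!";
    // Only treat the hash as a page name when it carries the "#!" prefix.
    var page = hash.indexOf(prefix) === 0 ? hash.slice(prefix.length) : "";
    // Fall back to the default page (videos, as in ajaxy.js) for empty hashes.
    return "fixtures/" + (page || defaultPage || "videos") + ".html";
}

var articlesUrl = fixtureUrl("#!articles");
var initialUrl = fixtureUrl("");
```

Keeping the mapping in a pure function like this would let it be checked without a browser, while the controller keeps responsibility for the Ajax request.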

Crawling and scraping

We use the following command to crawl the site and generate HTML pages that Google can crawl and search.

js ajaxy/scripts/crawl.js

The command above performs the following steps:

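1. Opens the page in a browser.
2. Waits until the content is ready.
3. Scrapes the page content.
4. Writes the content to a separate file.
5. Adds any links on the page that start with "#!" to the set of pages to be indexed.
6. Changes window.location.hash to the next page.
7. Goes back to step 2 and repeats until all pages have been loaded.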
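The steps above amount to a queue-driven traversal of "#!" links. The following is a rough sketch of that idea only, not the code in steal/html: the crawl function here and its render callback are hypothetical, and render stands in for "open the page at this hash, wait until it is ready, and return its HTML".

```javascript
// Sketch of the crawl loop; `render` is a hypothetical stand-in for loading
// a page in a headless browser and returning its HTML once it is ready.
function crawl(startHash, render) {
    var queue = [startHash];
    var seen = {};
    var output = {};
    var has = Object.prototype.hasOwnProperty;
    seen[startHash] = true;

    while (queue.length > 0) {
        var hash = queue.shift();
        var html = render(hash);   // open, wait, scrape
        output[hash] = html;       // "write the content to a file"

        // Queue every "#!" link that has not been visited yet.
        var linkPattern = /href=['"]#!([^'"]+)['"]/g;
        var match;
        while ((match = linkPattern.exec(html)) !== null) {
            var target = match[1];
            if (!has.call(seen, target)) {
                seen[target] = true;
                queue.push(target);
            }
        }
    }
    return output;
}

// Usage with an in-memory stand-in for the three fixture pages.
var pages = {
    videos: "<a href='#!articles'>Articles</a><h1>Videos</h1>",
    articles: "<a href='#!images'>Images</a><h1>Articles</h1>",
    images: "<a href='#!videos'>Videos</a><h1>Images</h1>"
};
var snapshots = crawl("videos", function (hash) { return pages[hash]; });
```

The seen set is what makes the loop terminate even though the pages link back to each other.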

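Pausing the HTML scraping

By default, the page content is scraped as soon as the page's scripts have loaded or window.location.hash has changed. Because Ajax requests are asynchronous, we need to tell steal.html to wait until the content has been fetched before scraping. Before making the Ajax request, Ajaxy calls wait: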

steal.html.wait();

Once the page is ready, Ajaxy calls ready:

steal.html.ready();
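One way to picture the wait/ready pairing is as a gate that holds the snapshot back while requests are pending. The sketch below is a conceptual model only and is not how steal.html is implemented: createReadyGate is a hypothetical name, and counting outstanding wait calls is an assumption made for illustration.

```javascript
// Conceptual model only; NOT steal.html's real implementation.
// Each wait() registers pending work; the snapshot callback runs when the
// last matching ready() arrives.
function createReadyGate(onReady) {
    var pending = 0;
    return {
        wait: function () {
            pending += 1;
        },
        ready: function () {
            if (pending > 0) {
                pending -= 1;
            }
            if (pending === 0) {
                onReady();
            }
        }
    };
}

var scraped = [];
var gate = createReadyGate(function () {
    scraped.push("snapshot taken");
});

gate.wait();   // before Ajax request 1
gate.wait();   // before Ajax request 2
gate.ready();  // request 1 finished, one request still pending
gate.ready();  // request 2 finished, the snapshot is taken now
```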

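Getting Google to crawl your site

See the Ajax crawling API for more information. When Google crawls your site, it sends requests that carry the "_escaped_fragment_=" parameter. When your site sees this parameter, it should direct Google to the generated page. For example, when the Google crawler wants to crawl http://mysite.com#!val, it requests http://mysite.com?_escaped_fragment_=val, and you need to point it to http://mysite.com/html/val.html. Yes, it really is that simple.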
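As an illustration of the redirect described above, the following sketch maps an incoming crawler request to the path of a generated snapshot. The function name snapshotPath is hypothetical, the /html/ directory simply follows the example URL in the text, and restricting fragments to letters, digits, underscores, and hyphens is an added assumption to keep arbitrary paths out. How the result is served (redirect or direct response) depends on your server and is left out.

```javascript
// Hypothetical helper: returns the snapshot path for a crawler request,
// or null when the request is not an _escaped_fragment_ request.
function snapshotPath(requestUrl) {
    var fragment = new URL(requestUrl).searchParams.get("_escaped_fragment_");
    if (fragment === null) {
        return null; // ordinary visitor, serve the normal Ajax page
    }
    // Assumption: page names are simple tokens; reject anything else.
    if (!/^[\w-]+$/.test(fragment)) {
        return null;
    }
    return "/html/" + fragment + ".html";
}

var crawlerPath = snapshotPath("http://mysite.com?_escaped_fragment_=val");
var visitorPath = snapshotPath("http://mysite.com/");
```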

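Phantom for Advanced Pages

By default, the crawl script uses EnvJS to open your pages and produce static snapshots. EnvJS sometimes fails, however. In that case, use PhantomJS (headless WebKit) to generate the snapshots; it does a better job.

How to run it:

1. Install PhantomJS from the command line (the original page links to further installation details).

2. Open scripts/crawl.js and change the second argument of steal.html.crawl, an options object, so that it includes a browser option. For example: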

steal('steal/html', function(){
    steal.html.crawl("ajaxy/ajaxy.html", {
        out: 'ajaxy/out',
        browser: 'phantomjs'
    });
});
