Modifying HTML with a Native HttpModule in IIS 7+

Original article: http://blog.csdn.net/wangjia184/article/details/17919667


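A task like this suddenly came up: depending on the geographic location of the visitor's IP address, pages should use different CDN domain names to speed up content delivery (stylesheets, images, and so on). To avoid changing much existing code, the simplest approach is an HttpModule that, before the IIS server returns the HTTP response, finds the domain names of resource URLs in the content and replaces them.

At first glance the problem looks simple, but in practice there are several complications.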



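1. Split chunks

When the server returns a response, it splits the body into multiple chunks, so the target string may end up divided between two chunks.

For example, if //static.xxxxxxxx.com is the target string, its beginning may fall at the end of one chunk and the rest at the start of the next, which makes searching much harder.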
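To make the problem concrete, here is a minimal standalone sketch in plain C++ (not IIS code). The helper names countPerChunk and countJoined and the sample strings are made up for illustration. It counts matches chunk by chunk and shows that a target straddling a chunk boundary is missed, while a search over the joined body finds it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts occurrences of `pattern`, searching each chunk
// in isolation, the way a naive per-chunk filter would.
std::size_t countPerChunk(const std::vector<std::string>& chunks,
                          const std::string& pattern) {
    assert(!pattern.empty());
    std::size_t count = 0;
    for (const std::string& chunk : chunks) {
        std::size_t pos = 0;
        while ((pos = chunk.find(pattern, pos)) != std::string::npos) {
            ++count;
            pos += pattern.size();  // continue after this match
        }
    }
    return count;
}

// Hypothetical helper: the same count, but over the joined response body.
std::size_t countJoined(const std::vector<std::string>& chunks,
                        const std::string& pattern) {
    std::string whole;
    for (const std::string& chunk : chunks) whole += chunk;
    return countPerChunk(std::vector<std::string>{whole}, pattern);
}
```

With the chunks "<img src=\"//static.xxx" and "xxxxx.com/a.png\">", the per-chunk count is 0 and the joined count is 1. Joining everything is exactly the cost the rest of the article tries to avoid.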



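2. Performance

Because this processing runs for every response, it must be implemented as efficiently as possible; otherwise the performance impact is severe.

In his article "Capturing and Transforming ASP.NET Output with Response.Filter", Rick Strahl solves the first problem with a managed HttpModule that merges all chunks together before processing them. That solves the problem, but at a considerable performance cost. In fact, a chunk in which the target string does not appear needs no processing at all.

Second, the string search can use the more efficient Boyer-Moore algorithm.

Finally, I decided to implement the module in unmanaged code, which I expected to be the most efficient option.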

Project setup

Starting with IIS7, unmanaged modules changed from the earlier ISAPI extensions to C++ modules. First, download the starter project.


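The downloaded code is a minimal HttpModule project. Note that if the server runs something like Windows 2008 R2, IIS normally runs in x64 mode (unless configured otherwise), so the project properties should be changed to x64. Also link statically when compiling, so that no dependency turns out to be missing on the server.

main.cpp contains the exported function RegisterModule: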
HRESULT
__stdcall
RegisterModule(
    DWORD                           dwServerVersion,
    IHttpModuleRegistrationInfo *   pModuleInfo,
    IHttpServer *                   pHttpServer
)
{
    HRESULT                          hr = S_OK;
    CPostProcessHttpModuleFactory *  pFactory = NULL;

    if ( pModuleInfo == NULL || pHttpServer == NULL )
    {
        hr = HRESULT_FROM_WIN32( ERROR_INVALID_PARAMETER );
        goto Finished;
    }

    // step 1: save the IHttpServer and the module context id for future use
    g_pModuleContext = pModuleInfo->GetId();
    g_pHttpServer = pHttpServer;

    // step 2: create the module factory
    pFactory = new CPostProcessHttpModuleFactory();
    if ( pFactory == NULL )
    {
        hr = HRESULT_FROM_WIN32( ERROR_NOT_ENOUGH_MEMORY );
        goto Finished;
    }

    // step 3: register for server events
    // TODO: register for more server events here
    hr = pModuleInfo->SetRequestNotifications( pFactory,          /* module factory */
                                               RQ_SEND_RESPONSE,  /* server event mask */
                                               0                  /* server post event mask */ );
    if ( FAILED( hr ) )
    {
        goto Finished;
    }

    // registration succeeded: the server now owns the factory
    pFactory = NULL;

Finished:

    if ( pFactory != NULL )
    {
        delete pFactory;
        pFactory = NULL;
    }

    return hr;
}

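The most important part is the call to SetRequestNotifications, which registers the events to handle. You need to understand what each event means in integrated pipeline mode and the order in which the events fire. Because the response has to be modified, listening for the RQ_SEND_RESPONSE notification is enough.

Then, in the class derived from CHttpModule, override the OnSendResponse method: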

class CPostProcessHttpModule : public CHttpModule
{
public:
    REQUEST_NOTIFICATION_STATUS
    OnSendResponse(
        IN IHttpContext *            pHttpContext,
        IN ISendResponseProvider *   pProvider
    );

private:
    BOOL StringStartsWith(LPCSTR szText, LPCSTR szPrefix, int nMaxLength = 1024000);
};

REQUEST_NOTIFICATION_STATUS
CPostProcessHttpModule::OnSendResponse(
    IN IHttpContext *            pHttpContext,
    IN ISendResponseProvider *   pProvider
)
{
    UNREFERENCED_PARAMETER( pHttpContext );
    UNREFERENCED_PARAMETER( pProvider );

    return RQ_NOTIFICATION_CONTINUE;
}

The project setup is now complete.


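Installation and removal

In a command prompt elevated to administrator privileges, install the HTTP module with the following command.

%systemroot%\system32\inetsrv\APPCMD.EXE install module /name:HtmlPostProcessModule /image:G:\IISPostProcessModule\bin\PostProcessModule_x64.dll /add:false

/image: absolute path of the DLL
/name: name under which the HTTP module is installed
/add: false means install only, without enabling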

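Then open inetmgr, find the site for which the module should be enabled, and go to Modules.

In Modules, click Configure Native Modules and, in the dialog that pops up, tick the module you just installed. That is all.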


To uninstall the module, use the following command:

%systemroot%\system32\inetsrv\APPCMD.EXE uninstall module HtmlPostProcessModule

Here HtmlPostProcessModule is the module name given at installation.


Getting the HTTP response

When the server sends back the response, the OnSendResponse function is called. The code below iterates over all chunks.

REQUEST_NOTIFICATION_STATUS
CPostProcessHttpModule::OnSendResponse(
    IN IHttpContext *            pHttpContext,
    IN ISendResponseProvider *   pProvider
)
{
    UNREFERENCED_PARAMETER( pProvider );

    // check the context before dereferencing it
    if( pHttpContext )
    {
        IHttpResponse * pHttpResponse = pHttpContext->GetResponse();
        HTTP_RESPONSE * pResponseStruct = pHttpResponse ? pHttpResponse->GetRawHttpResponse() : NULL;
        if( pResponseStruct )
        {
            for( int i = 0; i < pResponseStruct->EntityChunkCount; i++)
            {
                PHTTP_DATA_CHUNK pChunk = &(pResponseStruct->pEntityChunks[i]);
                if( pChunk->DataChunkType == HttpDataChunkFromMemory )
                {
                    // TODO: process the in-memory chunk
                }
            }
        }
    }

    return RQ_NOTIFICATION_CONTINUE;
}

Note that there are actually several chunk types, and only the Memory type is handled here. To handle static files or cached content, add the corresponding code.

typedef enum _HTTP_DATA_CHUNK_TYPE
{
    HttpDataChunkFromMemory,
    HttpDataChunkFromFileHandle,
    HttpDataChunkFromFragmentCache,
    HttpDataChunkFromFragmentCacheEx,
    HttpDataChunkMaximum
} HTTP_DATA_CHUNK_TYPE, *PHTTP_DATA_CHUNK_TYPE;

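Finding the target string

The target string is searched for with the efficient Boyer-Moore (BM) algorithm. The Boost libraries already contain a ready-made implementation that can be used directly.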

#include <boost/algorithm/searching/boyer_moore.hpp>

// the search target
const char * szPattern = "//static.xxxxxxxx.com";
const int PATTERN_SIZE = (int)strlen(szPattern);
boost::algorithm::boyer_moore<const char*> bm( szPattern, szPattern + PATTERN_SIZE );

char * pStart = (char *)pChunk->FromMemory.pBuffer;
char * pEnd = pStart + pChunk->FromMemory.BufferLength;

// find all occurrences
// Note: this relies on operator() returning an iterator to the match (pEnd if none).
// Newer Boost releases return a std::pair of iterators instead; take .first there.
char * pMatch = pStart;
for(;;)
{
    pMatch = bm( pMatch, pEnd );
    if( !pMatch || pMatch >= pEnd )
        break;

    // TODO: pMatch is the address of the matched string

    pMatch += PATTERN_SIZE;
    if( pMatch >= pEnd )
        break;
}
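For readers who want to try the search loop outside IIS, here is a small self-contained sketch. It swaps in the C++17 standard library searcher (std::boyer_moore_searcher) for the Boost class, so it compiles without Boost; the function name findAll is an assumption made for this example and is not part of the module.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: returns the offset of every non-overlapping
// occurrence of `pattern` in `text`, in ascending order.
std::vector<std::size_t> findAll(const std::string& text,
                                 const std::string& pattern) {
    assert(!pattern.empty());
    std::vector<std::size_t> offsets;
    // The searcher precomputes its skip tables once and is reused per search.
    const std::boyer_moore_searcher<std::string::const_iterator> searcher(
        pattern.begin(), pattern.end());
    auto it = text.begin();
    for (;;) {
        it = std::search(it, text.end(), searcher);
        if (it == text.end()) break;  // no further match
        offsets.push_back(static_cast<std::size_t>(it - text.begin()));
        it += static_cast<std::ptrdiff_t>(pattern.size());  // skip past the match
    }
    return offsets;
}
```

Calling findAll("a//s.com b//s.com", "//s.com") yields the offsets 1 and 10, which is the same list the module collects for each chunk before it decides whether the chunk must be rewritten.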

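Handling multiple chunks

When processing multiple chunks, the special case in which the target string is split across two chunks has to be considered. The strategy is as follows: for each of the first (n-1) of n chunks, check whether the end of the chunk matches some leading part of the target string. If it does, hold those bytes back and match again before the next chunk is processed.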
// detect whether an incomplete pattern may sit at the end of this chunk
int nChunkRemaingChars = 0;
if( i < pResponseStruct->EntityChunkCount - 1 )
{
    int j = PATTERN_SIZE - 1;
    // never look further back than the chunk is long
    if( j > (int)pChunk->FromMemory.BufferLength )
        j = (int)pChunk->FromMemory.BufferLength;
    for( ; j > 0; j--)
    {
        const char * pFirst = &pStart[pChunk->FromMemory.BufferLength - j];
        if( StringStartsWith( pFirst, szPattern, j) )
        {
            nChunkRemaingChars = j;
            dwNewSize -= nChunkRemaingChars; // the tail is moved to the next chunk for processing
            break;
        }
    }
}
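The boundary check can be tried in isolation with the following sketch. It is a standalone restatement of the idea using std::string, not the module's code, and the function name trailingPartialMatch is hypothetical. It returns how many bytes at the end of a chunk could be the beginning of the target string.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: length of the longest suffix of `chunk` that equals a
// proper prefix of `pattern`, or 0 if the chunk cannot end inside a match.
std::size_t trailingPartialMatch(const std::string& chunk,
                                 const std::string& pattern) {
    assert(!pattern.empty());
    // A partial match is at most pattern.size() - 1 bytes and cannot be
    // longer than the chunk itself.
    std::size_t len = std::min(chunk.size(), pattern.size() - 1);
    for (; len > 0; --len) {
        if (chunk.compare(chunk.size() - len, len, pattern, 0, len) == 0) {
            return len;  // the longest candidate wins
        }
    }
    return 0;
}
```

For the chunk "<img src=\"//stat" and the pattern "//static.xxxxxxxx.com" the result is 6, so those six bytes would be held back and prepended to the next chunk. Trying the longest candidate first matters because shorter prefixes such as "/" may also match.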

Modifying a chunk

When a chunk needs to be modified, allocate new memory with IHttpContext::AllocateRequestMemory and then change the chunk's pointer and size directly.
LPBYTE pBuffer = (LPBYTE)pHttpContext->AllocateRequestMemory(dwNewSize);
if( pBuffer ) // leave the chunk untouched if the allocation failed
{
    // TODO: fill the new buffer

    // point the chunk at the new buffer
    pChunk->FromMemory.pBuffer = pBuffer;
    pChunk->FromMemory.BufferLength = dwNewSize;
}
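The WriteEntityChunks method is not used here to write a new chunk; the chunk is modified directly instead. According to MSDN, a chunk written with that method can be at most 65534 bytes, whereas with direct modification I have tested writing 650K in one go without any problem.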
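The size calculation and the copy into the new buffer can be sketched independently of IIS. The example below uses std::string in place of memory obtained from AllocateRequestMemory, and the function name rebuild is made up for illustration. The new size is the old size plus, for every match, the difference between the replacement length and the pattern length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds the rewritten buffer from a list of match
// offsets (ascending, non-overlapping), each match being patternSize bytes.
std::string rebuild(const std::string& original,
                    const std::vector<std::size_t>& offsets,
                    std::size_t patternSize,
                    const std::string& replacement) {
    // new size = old size - matches * patternSize + matches * replacement size
    const std::size_t newSize = original.size()
                              - offsets.size() * patternSize
                              + offsets.size() * replacement.size();
    std::string out;
    out.reserve(newSize);
    std::size_t lastEnd = 0;  // end of the previous match in `original`
    for (std::size_t pos : offsets) {
        assert(pos >= lastEnd && pos + patternSize <= original.size());
        out.append(original, lastEnd, pos - lastEnd);  // text between matches
        out += replacement;
        lastEnd = pos + patternSize;
    }
    out.append(original, lastEnd, std::string::npos);  // text after the last match
    assert(out.size() == newSize);
    return out;
}
```

For example, rebuild("a//s.com b//s.com", {1, 10}, 7, "//cdn.com") returns "a//cdn.com b//cdn.com". Computing the size first is what lets the module allocate the new chunk buffer exactly once.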

Complete code for the core part

#include <string.h>
#include <vector>
#include <boost/algorithm/searching/boyer_moore.hpp>

BOOL CPostProcessHttpModule::StringStartsWith(LPCSTR szText, LPCSTR szPrefix, int nMaxLength /* = 1024000 */)
{
    // compares at most nMaxLength characters, stopping at the end of szPrefix
    for( int i = 0; i < nMaxLength; i++)
    {
        if( szPrefix[i] == 0 )
            return TRUE;
        if( szText[i] != szPrefix[i] )
            return FALSE;
    }
    return TRUE;
}

REQUEST_NOTIFICATION_STATUS
CPostProcessHttpModule::OnSendResponse(
    IN IHttpContext *            pHttpContext,
    IN ISendResponseProvider *   pProvider
)
{
    UNREFERENCED_PARAMETER( pProvider );

    if( !pHttpContext )
        return RQ_NOTIFICATION_CONTINUE;
    IHttpResponse * pHttpResponse = pHttpContext->GetResponse();
    if( !pHttpResponse )
        return RQ_NOTIFICATION_CONTINUE;
    HTTP_RESPONSE * pResponseStruct = pHttpResponse->GetRawHttpResponse();
    if( !pResponseStruct )
        return RQ_NOTIFICATION_CONTINUE;

    // only JSON and HTML responses are rewritten
    USHORT cchContentType = 0;
    PCSTR pszContentType = pHttpResponse->GetHeader( HttpHeaderContentType, &cchContentType );
    if( !pszContentType )
        return RQ_NOTIFICATION_CONTINUE;
    if( !StringStartsWith( pszContentType, "application/json" ) &&
        !StringStartsWith( pszContentType, "text/html" ) )
        return RQ_NOTIFICATION_CONTINUE;

    const char * szPattern = "//static.xxxxxxxx.com";
    const char * szReplace = "//cdn.xxxxxxxx.com";
    const int PATTERN_SIZE = (int)strlen(szPattern);
    const int REPLACE_SIZE = (int)strlen(szReplace);
    int nLastChunkChars = 0; // pattern bytes held back from the end of the previous chunk

    boost::algorithm::boyer_moore<const char*> bm( szPattern, szPattern + PATTERN_SIZE );

    for( int i = 0; i < pResponseStruct->EntityChunkCount; i++)
    {
        PHTTP_DATA_CHUNK pChunk = &(pResponseStruct->pEntityChunks[i]);
        if( pChunk->DataChunkType != HttpDataChunkFromMemory || pChunk->FromMemory.BufferLength == 0 )
            continue;

        std::vector<int> lstAppearance; // offsets of all matches in this chunk
        const int nLength = (int)pChunk->FromMemory.BufferLength;
        char * pStart = (char *)pChunk->FromMemory.pBuffer;
        char * pEnd = pStart + nLength;

        // calculate the new buffer size
        BOOL bHasUncompletedPartner = FALSE;
        DWORD dwNewSize = pChunk->FromMemory.BufferLength;
        BOOL bRequireModification = FALSE; // flag indicating whether this chunk needs to be modified

        // if an incomplete pattern was held back from the end of the last chunk
        if( nLastChunkChars > 0 )
        {
            const int nRest = PATTERN_SIZE - nLastChunkChars; // pattern bytes still expected
            // detect whether (end of last chunk + start of this chunk) matches the pattern;
            // the length check keeps the comparison inside this chunk
            if( nLength >= nRest && StringStartsWith( pStart, szPattern + nLastChunkChars, nRest ) )
            {
                bHasUncompletedPartner = TRUE;
                dwNewSize = dwNewSize - nRest + REPLACE_SIZE;
            }
            else
            {
                dwNewSize += nLastChunkChars; // the held-back bytes are written out unchanged
            }
            bRequireModification = TRUE;
        }

        // find all occurrences
        // (relies on bm() returning an iterator; newer Boost returns a pair, use .first)
        char * pMatch = pStart;
        if( bHasUncompletedPartner )
            pMatch += PATTERN_SIZE - nLastChunkChars; // skip the bytes that completed the previous match
        for(;;)
        {
            pMatch = bm( pMatch, pEnd );
            if( !pMatch || pMatch >= pEnd )
                break;
            lstAppearance.push_back( (int)(pMatch - pStart) );
            pMatch += PATTERN_SIZE;
            if( pMatch >= pEnd )
                break;
        }

        if( !lstAppearance.empty() )
        {
            dwNewSize += (int)lstAppearance.size() * ( REPLACE_SIZE - PATTERN_SIZE );
            bRequireModification = TRUE;
        }

        // detect whether an incomplete pattern may sit at the end of this chunk
        int nChunkRemaingChars = 0;
        if( i < pResponseStruct->EntityChunkCount - 1 )
        {
            // only bytes after the last consumed position can start a new match
            int nTailStart = bHasUncompletedPartner ? PATTERN_SIZE - nLastChunkChars : 0;
            if( !lstAppearance.empty() )
                nTailStart = lstAppearance.back() + PATTERN_SIZE;

            int j = PATTERN_SIZE - 1;
            if( j > nLength - nTailStart )
                j = nLength - nTailStart;
            for( ; j > 0; j--)
            {
                const char * pFirst = &pStart[nLength - j];
                if( StringStartsWith( pFirst, szPattern, j) )
                {
                    nChunkRemaingChars = j;
                    dwNewSize -= nChunkRemaingChars; // the tail is moved to the next chunk for processing
                    bRequireModification = TRUE;
                    break;
                }
            }
        }

        if( !bRequireModification )
            continue;

        LPBYTE pBuffer = (LPBYTE)pHttpContext->AllocateRequestMemory(dwNewSize);
        ATLASSERT(pBuffer);
        int nOffset = 0;  // number of bytes already written to the new buffer
        int nLastEnd = 0; // end position of the last match in the original buffer
        if( pBuffer )
        {
            if( bHasUncompletedPartner )
            {
                // (end of last chunk + start of this chunk) matched: insert the replacement text
                memcpy_s( pBuffer, dwNewSize, szReplace, REPLACE_SIZE );
                nOffset = REPLACE_SIZE;
                nLastEnd = PATTERN_SIZE - nLastChunkChars;
            }
            else if( nLastChunkChars > 0 )
            {
                // no match after all: write back the bytes held from the last chunk
                memcpy_s( pBuffer, dwNewSize, szPattern, nLastChunkChars );
                nOffset = nLastChunkChars;
            }

            nLastChunkChars = nChunkRemaingChars;

            std::vector<int>::iterator iter;
            for( iter = lstAppearance.begin(); iter != lstAppearance.end(); iter++)
            {
                int nPos = *iter;
                if( nPos > nLastEnd )
                {
                    // copy the text between the previous match and this one
                    memcpy_s( pBuffer + nOffset, dwNewSize - nOffset, pStart + nLastEnd, nPos - nLastEnd );
                    nOffset += nPos - nLastEnd;
                }
                memcpy_s( pBuffer + nOffset, dwNewSize - nOffset, szReplace, REPLACE_SIZE );
                nOffset += REPLACE_SIZE;
                nLastEnd = nPos + PATTERN_SIZE;
            }

            // copy whatever follows the last match (minus the held-back tail)
            if( (DWORD)nOffset < dwNewSize )
            {
                memcpy_s( pBuffer + nOffset, dwNewSize - nOffset, pStart + nLastEnd, dwNewSize - nOffset );
            }

            pChunk->FromMemory.pBuffer = pBuffer;
            pChunk->FromMemory.BufferLength = dwNewSize;
        }
    }

    return RQ_NOTIFICATION_CONTINUE;
}
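Because the module itself can only be exercised inside IIS, it can help to check the chunk-carrying logic against a small model first. The sketch below is such a model and makes several simplifying assumptions: chunks are std::string values, the search uses std::string::find instead of Boyer-Moore, and every chunk is copied, which the real module avoids for untouched chunks. The function name replaceAcrossChunks is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: rewrites `pattern` to `replacement` chunk by chunk,
// carrying an unfinished match from the end of one chunk into the next.
std::vector<std::string> replaceAcrossChunks(const std::vector<std::string>& chunks,
                                             const std::string& pattern,
                                             const std::string& replacement) {
    assert(!pattern.empty());
    std::vector<std::string> out;
    std::string carry;  // bytes held back from the previous chunk
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        const std::string data = carry + chunks[i];
        carry.clear();
        std::string rewritten;
        std::size_t pos = 0;  // first byte of `data` not yet emitted
        std::size_t hit = 0;
        while ((hit = data.find(pattern, pos)) != std::string::npos) {
            rewritten.append(data, pos, hit - pos);
            rewritten += replacement;
            pos = hit + pattern.size();
        }
        // Except for the last chunk, hold back a tail that could start a match.
        std::size_t keep = 0;
        if (i + 1 < chunks.size()) {
            keep = std::min(data.size() - pos, pattern.size() - 1);
            while (keep > 0 &&
                   data.compare(data.size() - keep, keep, pattern, 0, keep) != 0) {
                --keep;
            }
        }
        rewritten.append(data, pos, data.size() - pos - keep);
        carry = data.substr(data.size() - keep);
        out.push_back(rewritten);
    }
    return out;
}
```

Joining the output chunks should give the same text as a plain search-and-replace over the joined input, whatever the chunk boundaries are. That property is a convenient test oracle for the native implementation as well.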

