nrgrep Function Analysis (1): A Brief Look at the Search Flow

Last updated: 2018-12-04 Source: Internet

Uploader: User


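NR-grep (nrgrep) is a "fast and flexible pattern matching tool" developed by Gonzalo Navarro of the University of Chile. Its main strength is that it is built almost entirely on the BNDM algorithm and its extensions, so its performance degrades smoothly as the search problem becomes more complex (whereas Agrep's degrades sharply). It supports both exact search and approximate search that allows errors (grep does not support approximate search). nrgrep also treats this smooth performance behaviour as a function of pattern complexity: once it predicts that a BNDM search would be too costly, it switches to the Shift-And algorithm.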
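To make the two algorithm names concrete, here is a minimal sketch of the textbook bit-parallel forms of Shift-And and BNDM for a single plain string. It is not nrgrep's own code: the function names `shift_and_find` and `bndm_find` are made up for illustration, and the sketch assumes the pattern fits into one 64-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Shift-And: reads the text left to right. Bit i of D is set when
   pat[0..i] matches the text ending at the current position.
   Returns the offset of the first occurrence, or -1. */
long shift_and_find(const char *text, const char *pat)
{
    size_t n = strlen(text), m = strlen(pat);
    assert(m >= 1 && m <= 64);          /* pattern must fit in one word */

    uint64_t B[256] = {0};              /* B[c]: positions where pat has c */
    for (size_t i = 0; i < m; i++)
        B[(unsigned char)pat[i]] |= (uint64_t)1 << i;

    uint64_t D = 0, accept = (uint64_t)1 << (m - 1);
    for (size_t j = 0; j < n; j++) {
        D = ((D << 1) | 1) & B[(unsigned char)text[j]];
        if (D & accept)
            return (long)(j + 1 - m);
    }
    return -1;
}

/* BNDM: slides a window of length m over the text and reads each window
   right to left, tracking which pattern factors are still possible.
   Windows that cannot contain a match are skipped without being fully read.
   Returns the offset of the first occurrence, or -1. */
long bndm_find(const char *text, const char *pat)
{
    size_t n = strlen(text), m = strlen(pat);
    assert(m >= 1 && m <= 64);

    uint64_t B[256] = {0};              /* masks are built on the reversed pattern */
    for (size_t i = 0; i < m; i++)
        B[(unsigned char)pat[i]] |= (uint64_t)1 << (m - 1 - i);

    uint64_t mask = (m == 64) ? ~(uint64_t)0 : (((uint64_t)1 << m) - 1);
    uint64_t high = (uint64_t)1 << (m - 1);

    size_t pos = 0;
    while (pos + m <= n) {
        size_t j = m, last = m;         /* last: safe shift for this window */
        uint64_t D = mask;
        while (D != 0) {
            D &= B[(unsigned char)text[pos + j - 1]];
            j--;
            if (D & high) {             /* a pattern prefix ends the window suffix */
                if (j > 0)
                    last = j;
                else
                    return (long)pos;   /* whole window matched */
            }
            D = (D << 1) & mask;
        }
        pos += last;
    }
    return -1;
}
```

Both functions return the same answer; the difference is that BNDM can skip text characters, while Shift-And inspects every character once, which is why it makes a predictable fallback when skipping stops paying off.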

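One drawback of nrgrep, however, is that it does not support multi-pattern matching. The goal of our experiment is to build on nrgrep and design a multi-string matching program that uses the GPU to improve performance.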

/* The nrgrep program consists of 33 files in total */

Let us first look at the overall execution flow of nrgrep.

1. /* get the options */ and /* some consistency checks */

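nrgrep starts in main (int argc, char **argv) in Shell.c. The first step is to read the command-line options supplied by the user (for example -i). This is done with the getopt function, declared here through the getopt.h header (getopt itself is a POSIX function that is also available through unistd.h, and it makes writing programs with command-line options much easier). A switch-case then handles each option in turn (what each option does will be covered in a later article).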

2. /* get the pattern */

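Next, the search string is analyzed (is its first character '^', is its last character '$'?) and the preprocessing function searchPreproc (patt), defined in search.c, is called with the pattern passed in patt. Its purpose is "Preprocesses pat and creates a searchData structure for searching": it determines and initializes the search type of the search structure (for example SIMPLE, EXTENDED, REGULAR), and each type leads to a different preprocessing function: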

=== simplePreproc(pat,tree,pos); esimplePreproc(pat,tree,pos,OptErrors); ===

=== extendedPreproc(pat,tree,pos); eextendedPreproc(pat,tree,pos,OptErrors); ===

=== regularPreproc(pat,tree,pos); eregularPreproc(pat,tree,pos,OptErrors); ===
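The two steps just described, checking for the '^' and '$' anchors and then choosing a preprocessing routine by pattern type, can be sketched roughly as below. Everything here is a simplified stand-in: `strip_anchors`, `choose_preproc`, `struct pat_info` and the `PAT_*` enum are hypothetical names, escaping of '$' is ignored, and the rule "use the e-prefixed variant when the error count is positive" is an assumption inferred from the extra OptErrors argument in the calls listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the three pattern classes mentioned in the text. */
enum pat_type { PAT_SIMPLE, PAT_EXTENDED, PAT_REGULAR };

struct pat_info {
    int anchored_start;   /* pattern began with '^' */
    int anchored_end;     /* pattern ended with '$' */
    size_t body_off;      /* offset of the pattern body inside the string */
    size_t body_len;      /* length of the body without the anchors */
};

/* Detects a leading '^' and a trailing '$' and reports where the
   remaining pattern body lies. Escaped characters are not handled. */
struct pat_info strip_anchors(const char *pat)
{
    assert(pat != NULL);
    struct pat_info info = {0, 0, 0, strlen(pat)};

    if (info.body_len > 0 && pat[0] == '^') {
        info.anchored_start = 1;
        info.body_off = 1;
        info.body_len--;
    }
    if (info.body_len > 0 && pat[info.body_off + info.body_len - 1] == '$') {
        info.anchored_end = 1;
        info.body_len--;
    }
    return info;
}

/* Returns the NAME of the routine that would be selected. A real
   dispatcher would call the routine; returning the name keeps this
   sketch self-contained. */
const char *choose_preproc(enum pat_type type, int errors)
{
    static const char *const table[3][2] = {
        {"simplePreproc",   "esimplePreproc"},
        {"extendedPreproc", "eextendedPreproc"},
        {"regularPreproc",  "eregularPreproc"},
    };
    return table[type][errors > 0];
}
```

For instance, `strip_anchors("^abc$")` reports both anchors and a three-character body, and `choose_preproc(PAT_REGULAR, 2)` yields `"eregularPreproc"`.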

After that, the recPreproc () function is called. It is defined in record.c: "makes the preprocessing for record handling".

3. /* search the files */

Once initialization is complete, the file search can begin (standard input is also supported, via fileno(stdin)).

The main functions called are:

bufSetFile (B,f)

if (OptRecNumber || !OptRecPositive)
    newm = recSearchRecFile (*files,B,sData);
    /* searches the file handled by B for P using R as record
       separator, but scans record by record. this is our way to
       handle OptRecNumber or !OptRecPositive */
else newm = recSearchFile (*files,B,sData);
    /* searches the file handled by B for P using R as record
       separator, reports matches as appropriate and returns number
       of matches */
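The "record by record" branch can be pictured with the following sketch. It is not the recSearchRecFile implementation: `count_records` and `record_contains` are hypothetical helpers, the match test is a naive substring check, and the `positive` flag is only assumed to play the role that OptRecPositive plays above (count matching records when set, non-matching records when clear).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Naive substring test: returns 1 if pat occurs inside rec[0..len). */
static int record_contains(const char *rec, size_t len, const char *pat)
{
    size_t m = strlen(pat);
    if (m == 0)
        return 1;
    for (size_t i = 0; i + m <= len; i++)
        if (memcmp(rec + i, pat, m) == 0)
            return 1;
    return 0;
}

/* Walks buf one record at a time (records are delimited by sep) and
   counts the records whose match status equals `positive`. */
size_t count_records(const char *buf, size_t len, char sep,
                     const char *pat, int positive)
{
    assert(buf != NULL && pat != NULL);
    size_t count = 0, start = 0;

    while (start < len) {
        /* Locate the end of the current record. */
        const char *p = memchr(buf + start, sep, len - start);
        size_t end = p ? (size_t)(p - buf) : len;

        int hit = record_contains(buf + start, end - start, pat);
        if (hit == (positive != 0))
            count++;

        start = end + 1;   /* skip the separator */
    }
    return count;
}
```

Visiting every record is what makes it possible to number records or to report the ones that do not match; when neither is needed, searching the whole buffer at once and only then locating the enclosing record avoids that per-record work.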

The actual search is performed by searchScan (byte **beg, byte **end, searchData *P):

/* Searches from *beg to *end-1 for P. Returns if it could find
   it. In case of returning true, *beg and *end are set to limit
   the printable record containing the occurrence of P */
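The contract in that comment (search a range, then narrow *beg and *end to the record around the hit) can be illustrated with a small sketch. This is not searchScan itself: `scan_record` is a hypothetical function, the search is a naive memcmp loop rather than BNDM, the record separator is passed in explicitly, and the pattern is assumed not to contain the separator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Searches [*beg, *end) for pat. On success returns 1 and narrows
   *beg / *end to the record (delimited by sep) that contains the first
   occurrence. On failure returns 0 and leaves both pointers untouched. */
int scan_record(const char **beg, const char **end, const char *pat, char sep)
{
    assert(beg != NULL && end != NULL && pat != NULL);
    assert(*beg != NULL && *end != NULL && *beg <= *end);

    const char *lo = *beg, *hi = *end;
    size_t m = strlen(pat);
    if (m == 0)
        return 0;                       /* empty pattern: nothing to report */

    for (const char *p = lo; (size_t)(hi - p) >= m; p++) {
        if (memcmp(p, pat, m) != 0)
            continue;

        /* Walk left to the start of the enclosing record. */
        const char *rb = p;
        while (rb > lo && rb[-1] != sep)
            rb--;

        /* Walk right to the separator that ends the record. */
        const char *re = p + m;
        while (re < hi && *re != sep)
            re++;

        *beg = rb;
        *end = re;
        return 1;
    }
    return 0;
}
```

After a successful call, the bytes from `*beg` up to (but not including) `*end` are exactly the record that a caller would print.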

4. /* final report */

Print the final results.

5. /* clean up */

Free the memory held by each data structure.
