WEBUS2.0 In Action

Source: Internet
Author: User

Recently, for work, I needed to analyze a large amount of C# code and search for specific keywords across tens of thousands of .cs files. This is very time-consuming: a single search takes nearly half an hour in Notepad++. So I used the WEBUS2.0 SDK to build a code search program, which makes this job much more convenient.

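    private void btnOpen_Click(object sender, EventArgs e)
    {
        if (folderBrowserDialog1.ShowDialog() == DialogResult.OK)
        {
            // Build the index on a background task so the UI stays responsive.
            Task.Factory.StartNew(IndexProc);
        }
    }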

    var files = Directory.GetFiles(folderBrowserDialog1.SelectedPath, "*.cs", SearchOption.AllDirectories);
    if (files != null && files.Length > 0)
    {
        //...
        foreach (var file in files)
        {
            Document doc = ...;
            //...
        }
    }

    if (m_Index != null)
    {
        //...
    }
    m_Index = new IndexManager(...);
    m_Index.DumpDocs = 3000;
    m_Index.DumpSize = 10;
    m_Index.MinIndexSize = int.MaxValue;
    m_Index.MaxIndexSize = int.MaxValue;
    m_Index.MergeFactor = int.MaxValue;
    m_Index.New(AppDomain.CurrentDomain.BaseDirectory + ...);
    //...

Adjusting DumpDocs and DumpSize tunes the program's memory usage.

Adjusting MinIndexSize, MaxIndexSize, and MergeFactor tunes its I/O performance. Here MinIndexSize is set to its maximum value (int.MaxValue), so only one index segment is generated from start to finish. Setting MergeFactor to its maximum value means index segments are never merged.

When creating the index, we use an IAnalyzer implementation designed specifically for code analysis:

    if (field.Name == ...)
    //...

    HashSet<string> stops = new HashSet<string>(new string[] {
        "abstract",
        "event",
        "new",
        //...
        "enum",
        "namespace",
        "string"
    });
    Queue<Token> m_Buffer = new Queue<Token>();

    CodeTokenStream(...)
    {
        var matches = Regex.Matches(text, ...);
        foreach (Match m in matches)
        {
            var key = ...;
            // Skip stop words; only other tokens are queued for indexing.
            if (stops.Contains(key) == false)
            {
                m_Buffer.Enqueue(new Token(key, m.Index, m.Length));
            }
        }
    }

The analyzer's stop list contains all C# keywords. They occur so frequently that searching for them is meaningless, so they are skipped during analysis and never indexed.
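The stop-word filtering described above can be sketched without the WEBUS2.0 SDK, using only the .NET base library. This is a minimal illustration, not the SDK's analyzer: the class name CodeTokenizerSketch, the identifier pattern, the lowercasing step, and the abbreviated stop list are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CodeTokenizerSketch
{
    // Abbreviated stop list for illustration; the article's analyzer lists every C# keyword.
    private static readonly HashSet<string> Stops = new HashSet<string>
    {
        "abstract", "event", "new", "enum", "namespace", "string"
    };

    // Assumed identifier pattern: a letter or underscore, then letters, digits, or underscores.
    private static readonly Regex Identifier = new Regex(@"[A-Za-z_][A-Za-z0-9_]*");

    // Returns (term, offset, length) for every identifier that is not a stop word.
    public static List<Tuple<string, int, int>> Tokenize(string text)
    {
        var tokens = new List<Tuple<string, int, int>>();
        foreach (Match m in Identifier.Matches(text))
        {
            // Assumption: terms are lowercased so lookups can be case-insensitive.
            string key = m.Value.ToLowerInvariant();
            if (!Stops.Contains(key))
            {
                tokens.Add(Tuple.Create(key, m.Index, m.Length));
            }
        }
        return tokens;
    }
}
```

Calling CodeTokenizerSketch.Tokenize on a line of C# source returns only the identifiers worth indexing, each with its position in the text, which is the same information the article's Token(key, m.Index, m.Length) carries.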

The status is pushed to the UI while indexing runs:

    private void frmCodeSearch_Load(object sender, EventArgs e)
    {
        this.StatusChanged += new StatusChangeEventHandler(frmCodeSearch_StatusChanged);
    }

    void frmCodeSearch_StatusChanged(object sender, ...)
    {
        this.Invoke(new UpdateUI(() => { this.txtStatus.Text = status; }));
    }

The status event is raised on the indexing thread, not the UI thread, so the update must be marshaled to the UI thread with this.Invoke.
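The marshaling step can be written as a small reusable helper. The sketch below assumes a Windows Forms project; the class and method names (UiThreadSketch, SetText) are hypothetical, while Control.InvokeRequired and Control.Invoke are the standard Windows Forms members.

```csharp
using System;
using System.Windows.Forms;

public static class UiThreadSketch
{
    // Sets a control's text from any thread.
    public static void SetText(Control target, string text)
    {
        if (target.InvokeRequired)
        {
            // Called from a worker thread: run the assignment on the thread that owns the control.
            target.Invoke(new Action(() => target.Text = text));
        }
        else
        {
            // Already on the owning thread (or no handle exists yet): assign directly.
            target.Text = text;
        }
    }
}
```

With this helper, the status handler could call UiThreadSketch.SetText(txtStatus, status) without knowing which thread raised the event. Invoke blocks the caller until the UI thread has run the delegate; BeginInvoke is the non-blocking alternative if the indexing thread should not wait.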

You can start searching while indexing is still in progress:

    private void txtKeyword_TextChanged(object sender, EventArgs e)
    {
        // Look up the lowercased keyword as a single term in the "Code" field.
        TermQuery query = new TermQuery(new Term("Code", txtKeyword.Text.ToLower()));
        var hits = m_Searcher.Search(query);
        List<SearchResult> result = new List<SearchResult>();
        foreach (HitDoc hit in hits)
        {
            StandardHighlighter hl = new StandardHighlighter(hit);
            result.Add(new SearchResult(hit));
        }
        dgvResult.DataSource = result;
    }

The handler creates a TermQuery object, searches the Code field, collects the hits into a List<SearchResult>, and binds that list to the DataGridView. Enjoy!
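To make the idea of a term query concrete, here is a toy in-memory inverted index written against the .NET base library only. It is not how the WEBUS2.0 SDK stores or searches data; the class TinyTermIndex, its methods, and the tokenization rule are assumptions chosen for illustration. It shows why the keyword is lowercased before the lookup, and that a term query matches whole terms, not substrings.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public sealed class TinyTermIndex
{
    // term -> ids of the documents that contain it
    private readonly Dictionary<string, SortedSet<int>> postings =
        new Dictionary<string, SortedSet<int>>();

    // Splits the text into identifiers and records each lowercased term for the document.
    public void Add(int docId, string text)
    {
        foreach (Match m in Regex.Matches(text, @"[A-Za-z_][A-Za-z0-9_]*"))
        {
            string term = m.Value.ToLowerInvariant();
            SortedSet<int> docs;
            if (!postings.TryGetValue(term, out docs))
            {
                docs = new SortedSet<int>();
                postings[term] = docs;
            }
            docs.Add(docId);
        }
    }

    // Exact-term lookup: the keyword is lowercased to match how terms were stored.
    public List<int> Search(string keyword)
    {
        SortedSet<int> docs;
        return postings.TryGetValue(keyword.ToLowerInvariant(), out docs)
            ? new List<int>(docs)
            : new List<int>();
    }
}
```

After adding two documents that both mention IndexManager, Search("IndexManager") returns both document ids, while Search("Index") returns nothing, because "index" was never stored as a term of its own.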


The search function has since been enhanced: it now supports WEBUS2.0 SDK query expressions, which can handle a variety of complex search tasks. The syntax will be described in a later article.

Supplement:

Build: select a folder to start indexing. The index is saved automatically in the CodeSearch.Index subdirectory of that folder. For example, if we select C:\SourceCode to build the index, the index data is saved in C:\SourceCode\CodeSearch.Index.

Open: open an existing index, that is, the CodeSearch.Index folder mentioned above.

When the program is closed, the current index is closed automatically. Closing the index saves all data to disk, so you can continue using the index next time.
