ASP.NET dirty-word filtering algorithm

Source: Internet
Author: User
For the original article, see http://www.jb51.net/article/20575.htm
In my tests, however, the regex was the faster of the two approaches. I was still not satisfied: our site uses a large number of dirty words, so filtering has a noticeable impact on performance. After some thought I wrote my own algorithm. Tested on my own machine with the dirty-word list from the original article, a target string of length 0x19c (412 characters) and 1000 iterations, the plain text lookup took 1933.47 ms, the regex took 1216.719 ms, and my algorithm took only 244.125 ms.
Update: a BitArray is now used to record whether a given char begins any dirty word, so positions that cannot start a match are skipped immediately. This reduced the total time from 244 ms to 34 ms.
The main algorithm is shown in the code below:

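// Requires: using System; using System.Collections; using System.Collections.Generic;

// The dictionary is used as a set: only the keys matter.
private static Dictionary<string, object> dic = new Dictionary<string, object>();
// One bit per possible char value (0..65535), hence the + 1.
private static BitArray fastCheck = new BitArray(char.MaxValue + 1);
// Length of the longest dirty word; bounds the substring search.
private static int maxLength = 0;

static void Prepare()
{
    string[] badWords = // read from file
    foreach (string word in badWords)
    {
        if (!dic.ContainsKey(word))
        {
            dic.Add(word, null);
            maxLength = Math.Max(maxLength, word.Length);
            // Mark the first char so the scan can skip chars that start no word.
            fastCheck[word[0]] = true;
        }
    }
}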

Usage:

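// Assumes target is the input string and sb is a StringBuilder initialised with target.
int index = 0;
int offset = 0; // drift between positions in target and in sb after replacements
while (index < target.Length)
{
    // Skip ahead to the next char that can start a dirty word.
    if (!fastCheck[target[index]])
    {
        while (index < target.Length - 1 && !fastCheck[target[++index]]) ;
    }
    // Try every candidate length at this position, shortest first.
    int limit = Math.Min(maxLength, target.Length - index);
    for (int j = 1; j <= limit; j++)
    {
        string sub = target.Substring(index, j);
        if (dic.ContainsKey(sub))
        {
            sb.Replace(sub, "***", index + offset, j);
            offset += 3 - j; // "***" may be longer or shorter than the match
            index += j - 1;  // the index++ below steps past the last matched char
            break;
        }
    }
    index++;
}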
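
For reference, the two fragments above can be combined into one self-contained program. This is only a minimal sketch under a few assumptions: the class name DirtyWordFilter, the Filter method, and the sample words are hypothetical. A HashSet<string> stands in for the Dictionary, and the output is built by appending to a fresh StringBuilder rather than replacing in place, which avoids tracking an offset.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Text;

public static class DirtyWordFilter
{
    private static readonly HashSet<string> words = new HashSet<string>();
    // One bit per possible char value; set when the char begins some dirty word.
    private static readonly BitArray fastCheck = new BitArray(char.MaxValue + 1);
    private static int maxLength = 0;

    // Loads the word list (hypothetical signature); empty entries are skipped.
    public static void Prepare(IEnumerable<string> badWords)
    {
        foreach (string word in badWords)
        {
            if (string.IsNullOrEmpty(word)) continue;
            if (words.Add(word))
            {
                maxLength = Math.Max(maxLength, word.Length);
                fastCheck[word[0]] = true;
            }
        }
    }

    // Returns a copy of target with every matched dirty word replaced by "***".
    public static string Filter(string target)
    {
        var sb = new StringBuilder(target.Length);
        int index = 0;
        while (index < target.Length)
        {
            int matched = 0;
            // Only attempt substring lookups where a dirty word could start.
            if (fastCheck[target[index]])
            {
                int limit = Math.Min(maxLength, target.Length - index);
                for (int len = 1; len <= limit; len++)
                {
                    if (words.Contains(target.Substring(index, len)))
                    {
                        matched = len; // shortest match at this position wins
                        break;
                    }
                }
            }
            if (matched > 0)
            {
                sb.Append("***");
                index += matched;
            }
            else
            {
                sb.Append(target[index]);
                index++;
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Prepare(new[] { "darn", "heck" }); // sample words, for illustration only
        Console.WriteLine(Filter("Well darn, what the heck?"));
        // prints "Well ***, what the ***?"
    }
}
```

Because candidate lengths are tried shortest first, a short dirty word that is a prefix of a longer one matches first. If the longest match is preferred, the inner loop could count down from limit instead.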
