The first version was complicated to reason about and my thinking was not very clear, so some problems were never solved. During the Chinese New Year I thought about it a lot and still could not solve them. I kept wanting to abandon that solution, but I had already done so much work and thought about it for so long that I did not want to start over. In the end I still could not find a solution that was both efficient and free of obvious bugs, so I finally gave up on the first version.

Today I wanted to implement it with a hash-based search tree. I wrote some code and found that there were still problems. That does not mean a hash-based search tree cannot work, only that it was a bit difficult for me, and even if I had finished it, it would certainly not have been efficient. So I gave up on the search tree as well.

The solution I finally arrived at is the one described in this post. The idea is very simple. Using hashing, the first character of each keyword is the key and the keyword itself is the value, so all keywords are scattered across a dictionary <key, value>. Because one first character may correspond to multiple keywords, the value is actually a set of keywords. The content to be filtered is then traversed character by character, each character is looked up in the keyword dictionary, and any keywords that match are filtered out.

Because the idea is simple and clear, there are probably few bugs, and the keyword filtering is implemented in just over 90 lines of code. The efficiency is also good: with the keywords and the content to be filtered each more than 10 thousand characters long, it takes only 10 milliseconds. Both sets of data were read from Notepad files.
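The post does not show how the keyword dictionary is built, so here is a minimal sketch of one way it might be done. The class name KeywordIndex, the method Build, and the use of List<string> for the value are assumptions made for illustration; only the <first character, set of keywords> shape comes from the description above.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of the original post.
public static class KeywordIndex
{
    // Groups keywords by their first character.
    public static Dictionary<char, List<string>> Build(IEnumerable<string> keywords)
    {
        var dict = new Dictionary<char, List<string>>();
        foreach (string keyword in keywords)
        {
            // Empty entries have no first character, so they are skipped.
            if (string.IsNullOrEmpty(keyword)) continue;

            char key = keyword[0];
            if (!dict.TryGetValue(key, out List<string> bucket))
            {
                bucket = new List<string>();
                dict[key] = bucket;
            }
            // Keep each keyword only once per bucket.
            if (!bucket.Contains(keyword)) bucket.Add(keyword);
        }
        return dict;
    }
}
```

For example, building from "bad", "bug" and "ugly" would give two buckets: one under 'b' holding "bad" and "bug", and one under 'u' holding "ugly". A lookup by the current character of the text then only has to test the few keywords in that bucket.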
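public string Filter(string str)
{
    if (string.IsNullOrEmpty(str))
    {
        return string.Empty;
    }
    // keyDict (a field defined elsewhere) maps a keyword's first character
    // to the set of keywords that start with it.
    int len = str.Length;
    char[] sb = str.ToCharArray();
    for (int i = 0; i < len; i++)
    {
        if (!keyDict.ContainsKey(str[i]))
        {
            continue;
        }
        foreach (string s in keyDict[str[i]])
        {
            int j = i;
            bool isOK = true;
            foreach (char c in s)
            {
                // Stop on a mismatch, or when the text ends before the keyword does.
                if (j >= len || c != str[j++])
                {
                    isOK = false;
                    break;
                }
            }
            if (isOK)
            {
                // Mask the matched keyword with asterisks.
                for (int k = i; k < j; k++)
                {
                    sb[k] = '*';
                }
                // Continue right after the match; the loop's i++ moves i to j.
                i = j - 1;
                break;
            }
        }
    }
    return new string(sb);
}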
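To try the method in isolation, the following self-contained sketch wraps the same idea in a small class. The name KeywordFilter and its constructor are hypothetical, and the inner character loop is swapped for string.CompareOrdinal here, which is a substitution of mine and not the code from the post.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper class, for illustration only.
public class KeywordFilter
{
    // First character of a keyword -> keywords starting with that character.
    private readonly Dictionary<char, List<string>> keyDict =
        new Dictionary<char, List<string>>();

    public KeywordFilter(IEnumerable<string> keywords)
    {
        foreach (string k in keywords)
        {
            if (string.IsNullOrEmpty(k)) continue;
            if (!keyDict.TryGetValue(k[0], out List<string> bucket))
            {
                keyDict[k[0]] = bucket = new List<string>();
            }
            bucket.Add(k);
        }
    }

    public string Filter(string str)
    {
        if (string.IsNullOrEmpty(str)) return string.Empty;

        char[] sb = str.ToCharArray();
        for (int i = 0; i < str.Length; i++)
        {
            if (!keyDict.TryGetValue(str[i], out List<string> candidates)) continue;

            foreach (string s in candidates)
            {
                // A keyword matches only if it fits in the remaining text
                // and every character is equal (ordinal comparison).
                if (i + s.Length <= str.Length &&
                    string.CompareOrdinal(str, i, s, 0, s.Length) == 0)
                {
                    for (int k = i; k < i + s.Length; k++) sb[k] = '*';
                    // The loop's i++ then lands on the first unmasked character.
                    i += s.Length - 1;
                    break;
                }
            }
        }
        return new string(sb);
    }
}
```

With the keywords "bad" and "ugly", calling Filter("a bad, uglybad end") returns "a ***, ******* end", and Filter("so bad") returns "so ***". These two inputs cover back-to-back keywords and a keyword at the very end of the text, which are the cases most likely to go wrong in the index arithmetic.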
Test:
Chen taihan
Rewriting the keyword filter with hashing