Chinese characters to Pinyin, sensitive word filtering, banned word lookup
These are very common needs on the Internet.
How did you handle them when you were just starting out?
How do you handle them when you are feeling lazy?
The usual algorithm: match from the database toward the user's submission
- Load all the data from the database into one large array (the more data there is, the longer the database connection stays open)
- Loop over the large array with foreach and compare each entry against what the user submitted (the more data there is, the more server memory is used)
- Based on the result above, return and process accordingly (again, the more data there is, the longer the database connection stays open)
Advantages:
- Simple logic, easy to implement, low development cost, low skill requirements for developers
Disadvantages:
- Low program efficiency, heavy load on the database, slow
- It has bugs: for example, if the banned word in the database is shop and the user submits bookshop, the submission matches shop, which is a false hit
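The usual algorithm and its false hit can be sketched as follows. This is a minimal illustration, not the author's code: the `$bannedWords` array is a hypothetical stand-in for the rows loaded from the database, `findBannedWordNaive` is a name made up here, and the match is assumed to be a plain substring test with `strpos`.

```php
<?php
// Hypothetical stand-in for the banned words loaded from the database.
$bannedWords = array('shop', 'admin');

// Old approach: walk the whole banned-word array and test every entry
// against the user's submission (assumed here to be a substring test).
function findBannedWordNaive($input, array $bannedWords) {
    foreach ($bannedWords as $word) {
        if (strpos($input, $word) !== false) {
            return $word; // first banned word contained in the input
        }
    }
    return false; // nothing matched
}

var_dump(findBannedWordNaive('bookshop', $bannedWords)); // string(4) "shop"
var_dump(findBannedWordNaive('library', $bannedWords));  // bool(false)
```

The work grows with the size of the banned-word list, not with the length of the submission, which is the cost the list above describes.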
On this Christmas Day, http://my.oschina.net/cart/ recommends a new approach that solves the problem by thinking in reverse:
New algorithm: segment the user-submitted data, then match from the user's submission toward the database
- 1. Split the user-submitted text into candidate words
function cutWord($str) {
    $temp = array();
    $len = mb_strlen($str, 'utf-8');
    for ($i = 0; $i < $len; $i++) {          // start position
        for ($j = $len - $i; $j > 0; $j--) { // substring length, longest first
            $temp[] = mb_substr($str, $i, $j, 'utf-8');
        }
    }
    return $temp;
}

$str = '管理员'; // "administrator": 3 characters, 9 bytes in UTF-8
var_dump(cutWord($str));
After segmentation we get the data shown below. Note that the algorithm above splits all the way down to single characters.
If you want a minimum of 2 characters (a single character rarely carries real meaning and generally is not a banned word), you can set a minimum length of 2.
Banned words that are too long, for example more than 5 characters, basically do not occur, so you can also cap the length at 5 characters.
What matters is mastering the idea and the principle; the algorithm itself can be adapted flexibly.
Output for the input '管理员' (administrator):
array(6) {
  [0]=>
  string(9) "管理员"
  [1]=>
  string(6) "管理"
  [2]=>
  string(3) "管"
  [3]=>
  string(6) "理员"
  [4]=>
  string(3) "理"
  [5]=>
  string(3) "员"
}
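The length limits mentioned above could be added roughly like this. It is only a sketch: `cutWordBounded` and its `$minLen` / `$maxLen` parameters are names invented for this example, the defaults of 2 and 5 come from the limits suggested in the text, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical variant of the segmentation function: it keeps only the
// substrings whose length in characters is between $minLen and $maxLen.
function cutWordBounded($str, $minLen = 2, $maxLen = 5) {
    $words = array();
    $len = mb_strlen($str, 'utf-8');
    for ($i = 0; $i < $len; $i++) {
        // Longest candidate starting at $i: capped by $maxLen and by
        // the number of characters left in the string.
        $longest = min($maxLen, $len - $i);
        for ($j = $longest; $j >= $minLen; $j--) {
            $words[] = mb_substr($str, $i, $j, 'utf-8');
        }
    }
    return $words;
}

print_r(cutWordBounded('管理员'));
```

For '管理员' this keeps 管理员, 管理 and 理员 and drops the three single characters. Without a cap, a string of n characters produces n(n+1)/2 candidates; with a cap, the number of candidates (and therefore of lookups) grows only linearly with the length of the submission.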
- 2. Using the segmentation result above, either submit the lookups to Redis in one batched transaction, or exit as soon as the first banned word is found, to improve efficiency
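function isDisableWord($str) {
    $redis = new \Redis();
    // REDIS_HOST and REDIS_PORT are placeholders: define them as your IP and your port
    $redis->connect(REDIS_HOST, REDIS_PORT);

    // Batch alternative: call $redis->multi() here, queue one sIsMember()
    // per candidate inside the loop, then return the result of $redis->exec().
    foreach (cutWord($str) as $v) {
        // 'NameList' is the Redis set that stores the banned words
        if ($redis->sIsMember('NameList', $v)) {
            return $v; // stop at the first banned word found
        }
    }
    return false; // no candidate is a banned word
}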
$name = trim('Franklubinson');
$word = isDisableWord($name); // look up once and reuse the result
if ($word === false) {
    echo '1.ok!';
} else {
    echo '1. The banned word is: ';
    var_dump($word);
}
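The batched variant mentioned in step 2 might look like the sketch below. The function `findDisabledWords` is a name made up for this example. With phpredis you would pass a connected `\Redis` instance, whose `multi()`, `sIsMember()` and `exec()` methods are the ones used here; so that the sketch runs without a server, it uses `FakeRedisSet`, a hypothetical in-memory stand-in that imitates only those three calls.

```php
<?php
// Hypothetical in-memory stand-in for a Redis set, so the sketch runs
// without a server. It imitates only multi(), sIsMember() and exec().
class FakeRedisSet {
    private $members;
    private $queue = array();

    public function __construct(array $members) {
        $this->members = array_flip($members);
    }
    public function multi() {
        $this->queue = array();
        return $this;
    }
    public function sIsMember($key, $value) {
        // As in a transaction, the reply is queued rather than returned.
        $this->queue[] = isset($this->members[$value]);
        return $this;
    }
    public function exec() {
        $replies = $this->queue;
        $this->queue = array();
        return $replies;
    }
}

// Queue one membership test per candidate, send them as one batch,
// and return every candidate that is in the banned-word set.
function findDisabledWords($redis, array $candidates, $setKey = 'NameList') {
    $candidates = array_values($candidates);
    if (count($candidates) === 0) {
        return array();
    }
    $redis->multi();
    foreach ($candidates as $word) {
        $redis->sIsMember($setKey, $word);
    }
    $replies = $redis->exec(); // one reply per queued command, in order
    $hits = array();
    foreach ($candidates as $idx => $word) {
        if (!empty($replies[$idx])) {
            $hits[] = $word;
        }
    }
    return $hits;
}

$fake = new FakeRedisSet(array('管理', 'shop'));
print_r(findDisabledWords($fake, array('管理员', '管理', '理员')));
```

The early-exit loop stops at the first hit but pays one round trip per candidate; the batch sends everything in one go and reports every banned word it finds. Which one is cheaper depends on how often submissions actually contain a banned word.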
Advantages:
1. Lookups are committed as one transaction (a key lookup, with no large data set held in memory) and the database is hit in a single operation, so the load on it is low and the program runs fast
Disadvantages:
1. Higher skill requirements for developers