Tool Java HTML jar
Instructions for use:
1. This toolkit was developed by Zhang Jay of the computer department at Beijing Normal University and is based on multi-way tree search. For any questions, please contact:
[Email protected]
2. The toolkit ships with its own dictionary of sensitive words. The dictionary is read in on the first call, so the first call may take longer. Once the class has been loaded, filtering 5000 words of HTML on an ordinary PC takes about 80 milliseconds, and plain text about 35 milliseconds.
3. To customize the dictionary, put the jar package in the project's WEB-INF/lib directory and create a UTF-8 text file named words.dict in the WEB-INF/classes directory. Write the entries in that file in the form "keyword=level", such as:
China *gongchandang=4
Chinese =1
0 is the lowest level. After filtering, the highest level that appears in the original string is returned.
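
As an illustration of the "keyword=level" format above, here is a minimal Java sketch of how such a file might be parsed. It is not the toolkit's own loader: the class name WordsDictSketch, the one-entry-per-line reading, and the choice to skip blank or malformed lines are all assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the toolkit: parses "keyword=level" lines.
class WordsDictSketch {

    // Reads UTF-8 text and returns a keyword -> level map.
    // Assumption: one entry per line; blank or malformed lines are skipped.
    static Map<String, Integer> parse(InputStream in) {
        Map<String, Integer> levels = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Split at the last '=' so a keyword may itself contain '='.
                int eq = line.lastIndexOf('=');
                if (eq <= 0) {
                    continue; // no separator, or nothing before it
                }
                String keyword = line.substring(0, eq).trim();
                String levelText = line.substring(eq + 1).trim();
                if (keyword.isEmpty()) {
                    continue;
                }
                try {
                    levels.put(keyword, Integer.parseInt(levelText));
                } catch (NumberFormatException e) {
                    // Level is not an integer: ignore the entry in this sketch.
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return levels;
    }

    public static void main(String[] args) {
        byte[] sample = "foo=4\nbar=1\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(parse(new ByteArrayInputStream(sample)));
    }
}
```

In a web application, a file placed in WEB-INF/classes is on the classpath, so a stream for it could be obtained with WordsDictSketch.class.getResourceAsStream("/words.dict"), which returns null if the file is missing.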
Calling method: Wordfilterutil.filterhtml(str, '*');
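
To make the idea of multi-way tree search more concrete, the following Java sketch shows one way such a filter could work on plain text. It is only an illustration under stated assumptions, not the toolkit's implementation: the names TrieFilterSketch, addWord, filter and Result are hypothetical, HTML tags are not handled, and returning the masked text together with the highest level is a guess at how the two results might be combined.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of multi-way tree (trie) filtering; not the toolkit's code.
class TrieFilterSketch {

    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        int level = -1; // -1 means no keyword ends at this node
    }

    // Holds the masked text and the highest level found (-1 if nothing matched).
    static final class Result {
        final String text;
        final int maxLevel;

        Result(String text, int maxLevel) {
            this.text = text;
            this.maxLevel = maxLevel;
        }
    }

    private final Node root = new Node();

    // Inserts a keyword, one character per tree level.
    void addWord(String word, int level) {
        if (word.isEmpty()) {
            return;
        }
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.level = Math.max(node.level, level);
    }

    // Masks the longest keyword starting at each position and tracks the
    // highest level among the keywords encountered while walking the tree.
    Result filter(String text, char mask) {
        StringBuilder out = new StringBuilder(text);
        int maxLevel = -1;
        int i = 0;
        while (i < text.length()) {
            Node node = root;
            int matchEnd = -1;
            for (int j = i; j < text.length(); j++) {
                node = node.children.get(text.charAt(j));
                if (node == null) {
                    break;
                }
                if (node.level >= 0) {
                    matchEnd = j + 1; // remember the longest match so far
                    maxLevel = Math.max(maxLevel, node.level);
                }
            }
            if (matchEnd > 0) {
                for (int k = i; k < matchEnd; k++) {
                    out.setCharAt(k, mask);
                }
                i = matchEnd; // continue after the masked keyword
            } else {
                i++;
            }
        }
        return new Result(out.toString(), maxLevel);
    }

    public static void main(String[] args) {
        TrieFilterSketch filter = new TrieFilterSketch();
        filter.addWord("bad", 1);
        filter.addWord("badword", 4);
        Result r = filter.filter("a badword and a bad day", '*');
        System.out.println(r.text + " (level " + r.maxLevel + ")");
    }
}
```

Because every lookup walks the tree one character at a time, the cost of checking a position depends on the length of the longest keyword rather than on the number of keywords in the dictionary, which is the usual reason for choosing this structure.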
Download: http://download.csdn.net/user/ranjio_z
Efficient Java sensitive-word and keyword filtering toolkit: filtering illegal words