JSP Sensitive Word Filter

Source: Internet
Author: User

Most forums and websites maintain a list of sensitive words to make moderation easier. On most sites, sensitive words are those with politically sensitive (for example, anti-party) content, violent content, or unhealthy or uncivil language. Some sites also define special sensitive words of their own that apply only to that site. For example, when a post contains one of the preset words, it is either not published, or the word is automatically replaced with an asterisk (*) or a cross (X), which is colloquially called being "harmonized". In my opinion, the most important part of sensitive-word filtering is the filtering algorithm itself: how to match text against a large number of sensitive words efficiently. I think DFA is a good approach.

DFA introduction

Among the algorithms for text filtering, DFA is one of the better choices. DFA stands for Deterministic Finite Automaton: it obtains the next state from the current state and an event, that is, event + state = nextstate. The original article shows its state transitions in a figure that is not reproduced here.

In that figure, the capital letters (S, U, V, Q) are states and the lowercase letters a and b are actions. It gives the following transitions:

    a            b            b
S -----> U   S -----> V   U -----> V

When implementing sensitive-word filtering we need to keep computation to a minimum, and a DFA involves almost no computation: it only performs state transitions.
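To make event + state = nextstate concrete, here is a minimal sketch of the three transitions listed above as a lookup table. The class name `DfaTransitionDemo` and the methods `next` and `run` are hypothetical names chosen for illustration; they do not come from the original article.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only: a DFA stored as a transition table. */
class DfaTransitionDemo {
    // Transition table: current state -> (event -> next state).
    private static final Map<Character, Map<Character, Character>> TRANSITIONS = new HashMap<>();

    static {
        addTransition('S', 'a', 'U');
        addTransition('S', 'b', 'V');
        addTransition('U', 'b', 'V');
    }

    private static void addTransition(char from, char event, char to) {
        TRANSITIONS.computeIfAbsent(from, k -> new HashMap<>()).put(event, to);
    }

    /** Returns the next state, or null if no transition is defined. */
    static Character next(char state, char event) {
        Map<Character, Character> row = TRANSITIONS.get(state);
        return row == null ? null : row.get(event);
    }

    /** Feeds the events one by one from the start state; returns null if the automaton gets stuck. */
    static Character run(char start, String events) {
        Character state = start;
        for (int i = 0; i < events.length() && state != null; i++) {
            state = next(state, events.charAt(i));
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(run('S', "ab")); // S -a-> U -b-> V, prints V
        System.out.println(run('S', "ba")); // V has no transition on 'a' in this table, prints null
    }
}
```

Each step is a single map lookup, which is why the approach involves so little computation.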

Java implementation of DFA algorithm for sensitive word filtering

The key to sensitive-word filtering in Java is implementing the DFA algorithm. Let's start by analyzing the transitions above. For this purpose, a slightly different structure (shown as a figure in the original) is easier to understand.

In this structure there are no state transitions and no actions, only queries (lookups). We can think of it this way: through S we look up U and V; through U we look up V and P; through V we look up U and P. With this change, state transitions become lookups in Java collections.

Suppose our sensitive-word list contains the following entries: 日本人 ("Japanese person"), 日本鬼子 ("Japanese devil"), and Mao Zedong. What kind of structure should we build from them?

First: query 日 ---> {本}, query 本 ---> {人, 鬼}, query 人 ---> {null}, query 鬼 ---> {子}. Together these lookups form a tree-like structure, which the original illustrates, along with an expanded version, in two figures that are not reproduced here.

This lets us build the sensitive-word list into something like a set of trees, so that checking whether a word is sensitive searches a much smaller scope. For example, to check 日本人 ("Japanese person"), the first character tells us which tree to search, and we then search only within that tree.
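The tree of lookups described above can be sketched with nested maps. This is a minimal illustration under stated assumptions, not the author's implementation: the class `SensitiveWordTree`, its methods `nextChars` and `isSensitive`, and the placeholder words "spam", "spammer" and "scam" are all hypothetical. The two words sharing the prefix "spam" play the same role as the two entries that share a prefix in the example above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only: a sensitive-word list stored as a tree of nested maps. */
class SensitiveWordTree {
    /** One node per character; end marks the last character of a listed word. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean end;
    }

    private final Node root = new Node();

    SensitiveWordTree(List<String> words) {
        for (String word : words) {
            Node node = root;
            for (char c : word.toCharArray()) {
                // Reuse the branch if this prefix already exists, otherwise grow it.
                node = node.children.computeIfAbsent(c, k -> new Node());
            }
            if (node != root) {
                node.end = true; // empty strings are ignored
            }
        }
    }

    /** Follows prefix from the root; returns null if the tree has no such path. */
    private Node walk(String prefix) {
        Node node = root;
        for (int i = 0; i < prefix.length() && node != null; i++) {
            node = node.children.get(prefix.charAt(i));
        }
        return node;
    }

    /** The "query" step: the characters that may follow the given prefix. */
    Set<Character> nextChars(String prefix) {
        Node node = walk(prefix);
        return node == null ? Collections.<Character>emptySet() : node.children.keySet();
    }

    /** True only if word is exactly one of the listed words. */
    boolean isSensitive(String word) {
        Node node = walk(word);
        return node != null && node.end;
    }

    public static void main(String[] args) {
        SensitiveWordTree tree = new SensitiveWordTree(Arrays.asList("spam", "spammer", "scam"));
        System.out.println(tree.nextChars("s"));        // the set containing 'p' and 'c'
        System.out.println(tree.isSensitive("spammer")); // true
        System.out.println(tree.isSensitive("spa"));     // false: only a prefix
    }
}
```

A lookup touches at most one node per character of the word being checked, regardless of how many words are in the list.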

I will leave this idea for later. First, here is a simple way to filter words that does not involve the algorithm.

Basic idea of the Java implementation: override the getParameter method of HttpServletRequestWrapper so that user input passes through the filter. Write a class that extends the wrapper and overrides the method, and write a dictionary of filter words to compare the input against.

First, write a JSP page. Its JavaScript uses Ajax to refresh the input field; I have been learning Ajax recently and wanted to try it out, and it works well. The Ajax call requires including the jQuery file.

    <body>
        <input type="text" name="word" onblur="filter(this.value);" id="filter" />
        <input type="submit" value="Sensitive word filter" />
        <script type="text/javascript" src="js/jquery.js"></script>
        <script type="text/javascript">
            function filter(num) {
                $.ajax({
                    type: "post",              // HTTP method
                    url: "FilterWordServlet",  // servlet that filters the input
                    async: true,               // send the request asynchronously
                    dataType: "html",          // type of the returned data
                    data: { "num": num },      // value passed to the servlet
                    success: function (data, textStatus) {
                        // on success, write the filtered text back into the input
                        $("#filter").val(data);
                    },
                    error: function () {
                        // on failure, notify the user
                        alert("Error");
                    }
                });
            }
        </script>
    </body>

Next, extend HttpServletRequestWrapper and override the getParameter method:

    import java.util.List;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletRequestWrapper;

    // Main idea: extend HttpServletRequestWrapper and override getParameter
    // so that the returned value has been filtered.
    public class WordFilter extends HttpServletRequestWrapper {

        public WordFilter(HttpServletRequest request) {
            super(request);
        }

        @Override
        public String getParameter(String name) {
            // Get the raw value from the parent, then compare it with the filter dictionary.
            String word = super.getParameter(name);
            if (word == null) {
                return null; // the parameter was not sent
            }
            // Load the words in the dictionary.
            List<String> list = Words.getList();
            for (String string : list) {
                // If the input contains this dictionary word, replace it.
                if (word.contains(string)) {
                    word = word.replace(string, "**");
                }
            }
            return word;
        }
    }

Then write a servlet that reads the user's input and runs it through the filter:

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.ServletException;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    @WebServlet("/FilterWordServlet")
    public class FilterWordServlet extends HttpServlet {
        private static final long serialVersionUID = 1L;

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            // Set the request and response encodings.
            request.setCharacterEncoding("utf-8");
            response.setCharacterEncoding("utf-8");
            // Wrap the original request so that getParameter applies the filter.
            WordFilter wfilter = new WordFilter(request);
            String string = wfilter.getParameter("num");
            System.out.println("---------------");
            // Write the result to the response so the Ajax callback can use it.
            PrintWriter out = response.getWriter();
            out.println(string);
        }

        @Override
        protected void doPost(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            doGet(request, response);
        }
    }

For now I keep the filter vocabulary in a list; switching to the search algorithm later would start from here. First, build a vocabulary class:

    import java.util.ArrayList;
    import java.util.List;

    public class Words {
        // Dictionary of filter vocabulary.
        static List<String> list = new ArrayList<>();

        static {
            list.add("Your sister's.");
            list.add("SB");
            list.add("Roller");
        }

        public static List<String> getList() {
            return list;
        }

        public static void setList(List<String> list) {
            Words.list = list;
        }
    }

This is only a basic version. A more advanced implementation needs a proper algorithm; the DFA algorithm introduced earlier is worth considering, and I think it works very well.
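As a rough sketch of what that more advanced version might look like, the class below scans the text once and masks the longest listed word found at each position by walking a tree of nested maps. Everything here is an assumption made for illustration: the class `DfaWordMasker`, its `mask` method, and the placeholder words are hypothetical, and unlike the basic version above (which inserts a fixed "**") it writes one asterisk per masked character.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: mask listed words in a text with a DFA-style tree lookup. */
class DfaWordMasker {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        boolean end; // true if a listed word ends at this node
    }

    private final Node root = new Node();

    DfaWordMasker(List<String> words) {
        for (String word : words) {
            Node node = root;
            for (char c : word.toCharArray()) {
                node = node.next.computeIfAbsent(c, k -> new Node());
            }
            if (node != root) {
                node.end = true;
            }
        }
    }

    /** Length of the longest listed word starting at index from, or 0 if there is none. */
    private int longestMatch(String text, int from) {
        Node node = root;
        int longest = 0;
        for (int i = from; i < text.length(); i++) {
            node = node.next.get(text.charAt(i));
            if (node == null) {
                break; // no listed word continues with this character
            }
            if (node.end) {
                longest = i - from + 1;
            }
        }
        return longest;
    }

    /** Returns text with every matched word replaced by asterisks of the same length. */
    String mask(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int len = longestMatch(text, i);
            if (len == 0) {
                out.append(text.charAt(i++));
            } else {
                for (int k = 0; k < len; k++) {
                    out.append('*');
                }
                i += len;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        DfaWordMasker masker = new DfaWordMasker(Arrays.asList("spam", "spammer", "scam"));
        System.out.println(masker.mask("no spammer or scam here")); // no ******* or **** here
    }
}
```

In the servlet example, a class like this could take the place of the loop over Words.getList() inside getParameter, so that the cost per request depends on the length of the input rather than on the size of the word list.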
