Topic:
Given a string and an array of words (a dictionary), determine whether the string can be separated completely into words from the dictionary.
The question suggests using a dynamic programming algorithm.
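The problem statement mentions dynamic programming, while the code in this post is recursive. For reference, here is a minimal sketch of the textbook DP formulation. It is not the code written in the interview, and the names WordBreakDp and canSegment are made up for illustration. It assumes the words must appear as contiguous pieces from left to right, which is slightly stricter than deleting matches from anywhere in the string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class WordBreakDp {
    // reachable[i] is true when the prefix s[0, i) can be split into dictionary words.
    static boolean canSegment(String s, String[] dict) {
        Set<String> words = new HashSet<>(Arrays.asList(dict));
        boolean[] reachable = new boolean[s.length() + 1];
        reachable[0] = true; // the empty prefix is trivially separable
        for (int end = 1; end <= s.length(); end++) {
            for (int start = 0; start < end; start++) {
                // A prefix ending at 'end' works if a shorter separable prefix
                // is followed by exactly one dictionary word.
                if (reachable[start] && words.contains(s.substring(start, end))) {
                    reachable[end] = true;
                    break;
                }
            }
        }
        return reachable[s.length()];
    }

    public static void main(String[] args) {
        String[] dict = {"百度一", "百度", "一下", "我就", "知", "道"};
        System.out.println(canSegment("百度一下我就知道", dict));   // true
        System.out.println(canSegment("百度一下我后就知道", dict)); // false
    }
}
```

For a string of length m this performs on the order of m^2 substring lookups, and the result does not depend on the order of the words in the dictionary.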
I wrote the following code during the interview.
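    // First attempt: returns as soon as the first matching word has been removed.
    public static boolean divied2(String s, String[] dict) {
        boolean result = false;
        if (s.length() == 0)
            return true;
        for (int i = 0; i < dict.length; i++) {
            int index = s.indexOf(dict[i]);
            if (index != -1) {
                System.out.println(index);
                // Cut the matched word out of the string and recurse on what is left.
                String tmp1 = s.substring(0, index);
                String tmp2 = s.substring(index + dict[i].length(), s.length());
                return divied2(tmp1 + tmp2, dict); // no other word is tried at this level
            }
        }
        return result;
    }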
But for this test case:
    String[] dict = {"百度一", "百度", "一下", "我就", "知", "道"};
    System.out.println(divied2("百度一下我就知道", dict));
the test does not pass. Because "百度一" is matched and deleted first, the word "一下" is destroyed: what remains is "下我就知道", which cannot be separated.
Thinking it over afterwards, the cause is that the return inside the loop terminates the traversal of the dictionary at the first match. After the improvement below, the test passes.
The fix is the |= operation: the results of all recursive calls are OR-ed together, so it is enough for one of them to separate the string completely.
    public static boolean divied(String s, String[] dict) {
        boolean result = false;
        if (s.length() == 0)
            return true;
        for (int i = 0; i < dict.length; i++) {
            int index = s.indexOf(dict[i]);
            if (index != -1) {
                System.out.println(index);
                String tmp1 = s.substring(0, index);
                String tmp2 = s.substring(index + dict[i].length(), s.length());
                // OR the results together: one successful branch is enough.
                result |= divied(tmp1 + tmp2, dict);
            }
        }
        return result;
    }
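One detail of |= is worth seeing in isolation: on booleans it always evaluates its right-hand side, even when the result is already true. The small sketch below illustrates this with a hypothetical probe method standing in for the recursive call; the names OrAssignDemo, probe and combineAll are not from the interview code.

```java
class OrAssignDemo {
    static int calls = 0;

    // Hypothetical stand-in for a recursive call; counts how often it runs.
    static boolean probe(boolean value) {
        calls++;
        return value;
    }

    static boolean combineAll() {
        boolean result = false;
        result |= probe(false);
        result |= probe(true);  // result becomes true here
        result |= probe(false); // still evaluated: |= does not short-circuit
        return result;
    }

    public static void main(String[] args) {
        // Operands are evaluated left to right, so calls is read after combineAll() ran.
        System.out.println(combineAll() + " after " + calls + " calls"); // prints "true after 3 calls"
    }
}
```

This is why the recursive version keeps exploring the remaining dictionary words even after one branch has already succeeded.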
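The disadvantage is that the time complexity is too high.
Let the string length be m and the dictionary size be n.
Each call tries up to n words, and every match removes at least one character, so the recursion is at most m levels deep.
The time complexity is therefore roughly n^m.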
    public static boolean divied(String s, String[] dict) {
        boolean result = false;
        if (s.length() == 0)
            return true;
        for (int i = 0; i < dict.length; i++) {
            count++; // global counter (a static int field of the class)
            int index = s.indexOf(dict[i]);
            if (index != -1) {
                System.out.println(index);
                String tmp1 = s.substring(0, index);
                String tmp2 = s.substring(index + dict[i].length(), s.length());
                result |= divied(tmp1 + tmp2, dict);
                if (result) { // optimization point
                    return true;
                }
            }
        }
        return result;
    }
Optimization Idea One:
Terminate the loop directly as soon as result is true.
A global variable is added to count the number of executions (it is incremented once per loop iteration).
Without the early exit, the count is about 180 (it depends on the order of the words in the dictionary).
With the early exit, the count is only 21.
When the dictionary order is varied, the count stays at around 30 if the string can be separated completely. If it cannot, the count reaches 374.
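The measurement described above can be reproduced with a small harness. This is a sketch under assumptions: the class name CallCountDemo, the method names divide and measure, and the earlyExit flag are made up for illustration, and the counter is incremented once per loop iteration as in the code above. The exact numbers depend on the dictionary order, so none are claimed here.

```java
class CallCountDemo {
    static int count = 0; // loop iterations summed over all recursive calls

    static boolean divide(String s, String[] dict, boolean earlyExit) {
        if (s.length() == 0)
            return true;
        boolean result = false;
        for (int i = 0; i < dict.length; i++) {
            count++;
            int index = s.indexOf(dict[i]);
            if (index != -1) {
                // Remove the first occurrence of the word and recurse on the rest.
                String rest = s.substring(0, index) + s.substring(index + dict[i].length());
                result |= divide(rest, dict, earlyExit);
                if (earlyExit && result)
                    return true; // stop as soon as one branch succeeds
            }
        }
        return result;
    }

    // Resets the counter, runs one search, and returns the resulting count.
    static int measure(String s, String[] dict, boolean earlyExit) {
        count = 0;
        divide(s, dict, earlyExit);
        return count;
    }

    public static void main(String[] args) {
        String[] dict = {"百度一", "百度", "一下", "我就", "知", "道"};
        String s = "百度一下我就知道";
        System.out.println("without early exit: " + measure(s, dict, false));
        System.out.println("with early exit:    " + measure(s, dict, true));
    }
}
```

Shuffling dict before each call to measure is one way to observe how strongly the count depends on the word order.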
Optimization Idea Two:
If each word appears only once in the string, the word can be removed from the dictionary as soon as it has been found and deleted from the string, which avoids unnecessary loop iterations.
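A possible sketch of this idea follows. It only applies under the stated assumption that no dictionary word is needed more than once. The class name SingleUseDictionary and the use of a List instead of an array are choices made for this illustration, not part of the interview code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SingleUseDictionary {
    // Assumes each dictionary word is used at most once in the string.
    static boolean divide(String s, List<String> dict) {
        if (s.isEmpty())
            return true;
        for (int i = 0; i < dict.size(); i++) {
            String word = dict.get(i);
            int index = s.indexOf(word);
            if (index == -1)
                continue;
            String rest = s.substring(0, index) + s.substring(index + word.length());
            // Copy the dictionary without the word just used, so deeper calls
            // have fewer candidates to loop over.
            List<String> remaining = new ArrayList<>(dict);
            remaining.remove(i); // removes by position (List.remove(int))
            if (divide(rest, remaining))
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> dict = Arrays.asList("百度一", "百度", "一下", "我就", "知", "道");
        System.out.println(divide("百度一下我就知道", dict));   // true
        System.out.println(divide("百度一下我后就知道", dict)); // false
    }
}
```

Because the dictionary shrinks by one word per level, the recursion depth is bounded by the dictionary size rather than by the string length.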
Optimization Idea Three:
The OR operation is really only needed for dictionary words that begin with the same character but have different lengths (for example "百度" and "百度一"). The program can target this case specifically.
After this improvement the effect is very good. The cost is essentially the same whether or not the string can be separated completely.
The time complexity is reduced to roughly the factorial of the dictionary length.
For the following test cases, the count is 44 when the string can be separated and 60 when it cannot:
    String[] dict = {"百度一", "一下", "知", "我就", "百度", "道"};
    // can be separated
    System.out.println(divied("百度一下我就知道", dict));
    // cannot be separated (extra character "后")
    System.out.println(divied("百度一下我后就知道", dict));
    public static boolean divied(String s, String[] dict) {
        boolean result = false;
        if (s.length() == 0)
            return true;
        char start = '+'; // sentinel: no word has been matched at this level yet
        for (int i = 0; i < dict.length; i++) {
            count++;
            int index = s.indexOf(dict[i]);
            // Recurse for the first match, and afterwards only for words that
            // begin with the same character as that first match.
            if (start == '+' && index != -1 || index != -1 && dict[i].charAt(0) == start) {
                System.out.println(index);
                String tmp1 = s.substring(0, index);
                String tmp2 = s.substring(index + dict[i].length(), s.length());
                start = dict[i].charAt(0);
                result |= divied(tmp1 + tmp2, dict);
                if (result) {
                    return true;
                }
            }
        }
        return result;
    }
Optimization Idea Four:
This refines idea three: recurse only for words that share their first character with the word already matched. For all other matches, just delete the word from the string and continue the loop without recursion.
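    public class Divide {
        static int count = 0;

        public static boolean divied(String s, String[] dict) {
            boolean result = false;
            if (s.length() == 0)
                return true;
            char start = '+'; // sentinel: no word has been matched at this level yet
            for (int i = 0; i < dict.length; i++) {
                count++;
                int index = s.indexOf(dict[i]);
                // First match at this level: delete it in place and remember its first character.
                if (start == '+' && index != -1) {
                    String tmp1 = s.substring(0, index);
                    String tmp2 = s.substring(index + dict[i].length(), s.length());
                    s = tmp1 + tmp2;
                    start = dict[i].charAt(0);
                }
                // Caution: right after the branch above fires, this condition is also true
                // in the same iteration, and index then refers to the already shortened s,
                // so a second stretch of characters is removed. The counts reported below
                // were obtained with this behavior.
                if (index != -1 && dict[i].charAt(0) == start) {
                    String tmp1 = s.substring(0, index);
                    String tmp2 = s.substring(index + dict[i].length(), s.length());
                    s = tmp1 + tmp2;
                    result |= divied(tmp1 + tmp2, dict);
                    if (result) {
                        return true;
                    }
                }
            }
            return result;
        }

        public static void main(String[] args) {
            String[] dict = {"百度一", "百度", "我就", "一下", "知", "道"};
            System.out.println(divied("百度一下我就知道", dict));
            System.out.println(count);
        }
    }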
Final result: when the string can be separated completely, the loop count is 6.
When it cannot be separated completely, the loop count is 18.
Chinese word segmentation algorithm: a Baidu interview question