PE458 (Project Euler 458: Permutations of Project)
This article reviews the difficulties I ran into while solving PE458, and along the way introduces the trie, the Aho-Corasick (AC) automaton, and automaton minimization.
Question: Take the seven letters of the word "project" (equivalently, any seven distinct letters) and consider the strings of length 10^12 made of these letters. Count the strings in which no seven consecutive letters contain all seven letters (i.e., no seven consecutive letters form a permutation of "project"). Give the answer modulo 1e9.
First attempt: AC automaton plus matrix exponentiation by squaring:
The first idea is to build an automaton that decides whether a string meets the requirement. Assuming it has been built, we can construct the transfer matrix tran[i][j]: whenever some input letter moves the automaton from state j to state i, tran[i][j] is increased by 1. Take the column vector x[0] = (1, 0, 0, ...)', whose first component corresponds to the state of the empty string, and let x[i] = tran * x[i-1]. Then x[i][j] is the number of strings of length i that end in state j. Adding up the components of the desired (valid) states gives the answer.
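The recurrence x[i] = tran * x[i-1] can be tried on a small scale. This is a minimal sketch, not the code used for the problem: the names buildTran and countByState are made up for illustration, states are numbered from 0, and the transition table is assumed to be complete.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Matrix = std::vector<std::vector<std::uint64_t>>;
const std::uint64_t MOD = 1000000000ULL;  // the answer is wanted modulo 1e9

// Build tran from a complete transition table: table[j][c] is the state
// reached from state j on letter c. tran[i][j] counts the letters that
// lead from state j to state i.
Matrix buildTran(const std::vector<std::vector<int>>& table) {
    assert(!table.empty());
    std::size_t n = table.size();
    Matrix tran(n, std::vector<std::uint64_t>(n, 0));
    for (std::size_t j = 0; j < n; ++j)
        for (int to : table[j]) tran[to][j] += 1;
    return tran;
}

// Apply x[i] = tran * x[i-1] `steps` times, starting from
// x[0] = (1, 0, 0, ...)'. Component j of the result is the number of
// strings of length `steps` that end in state j (modulo MOD).
std::vector<std::uint64_t> countByState(const Matrix& tran, int steps) {
    std::size_t n = tran.size();
    std::vector<std::uint64_t> x(n, 0);
    x[0] = 1;  // state 0 is the empty string
    for (int s = 0; s < steps; ++s) {
        std::vector<std::uint64_t> y(n, 0);
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                y[i] = (y[i] + tran[i][j] * x[j]) % MOD;
        x = y;
    }
    return x;
}
```

As a toy check, take a two-letter alphabet where the substring "11" is forbidden: state 0 means the last letter is not 1, state 1 means the last letter is 1, and state 2 is invalid, so the table is {{0, 1}, {0, 2}, {2, 2}}. After 3 steps the vector is (3, 2, 3), i.e. 3 + 2 = 5 valid strings out of 8.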
How do we construct this automaton? With the AC automaton, of course. We first build a trie and insert the 7! invalid strings into it. In the resulting trie, some states have no next state for some input letters, so the next step is to fill in those transitions.
Consider an invalid state: whatever the input, the next state is the state itself, because the string stays invalid no matter what follows.
Consider any other non-root state, which corresponds to a string s with |s| > 0. We only need the string t with |t| < |s| such that t is a suffix of s, t also corresponds to a state of the trie (that is, t is a prefix of some inserted string), and |t| is as large as possible. Obviously t exists, because the empty string meets the requirements, and t is unique. When s receives the letter l and has no next state for it, the transition is the same as t's transition on l. t can be constructed recursively.
int p = 0, q = 0;
sfx[1] = 1;
// The children of the root (state 1) have the root as their suffix state.
for (int i = 0; i < 7; ++i)
  if (trie[1][i]) {
    que[q++] = trie[1][i];
    sfx[trie[1][i]] = 1;
  } else {
    trie[1][i] = 1;
  }
while (p < q) {
  int curr = que[p++];
  assert(danger[sfx[curr]] == 0);
  danger[curr] = danger[curr] || danger[sfx[curr]];
  if (danger[curr]) {
    // An invalid state loops to itself on every input.
    for (int i = 0; i < 7; ++i) trie[curr][i] = curr;
  } else {
    for (int i = 0; i < 7; ++i)
      if (trie[curr][i]) {
        que[q++] = trie[curr][i];
        sfx[trie[curr][i]] = trie[sfx[curr]][i];
      } else {
        // Missing edge: borrow the transition of the suffix state.
        trie[curr][i] = trie[sfx[curr]][i];
      }
  }
}
State 1 represents the empty string. If the string corresponding to state i is s, then state sfx[i] corresponds to t. danger[i] marks state i as invalid. trie[i][j] = 0 means the trie has no edge from state i on input j. As the code shows, this is a simple BFS; looking further, the AC automaton is the extension of the KMP algorithm to multiple strings.
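The construction can also be tried end to end in a self-contained form. This is a sketch under a few assumptions of my own: the alphabet size k is a parameter so that small cases can be checked, the root is state 0 rather than state 1, a missing edge is -1 rather than 0, and the names Automaton, buildPermutationAutomaton and countValid are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <queue>
#include <vector>

struct Automaton {
    std::vector<std::vector<int>> next;  // next[state][letter], complete after build
    std::vector<char> danger;            // 1 if the state is invalid
};

// Automaton over letters 0..k-1 that becomes invalid as soon as the input
// contains a permutation of all k letters as a substring.
Automaton buildPermutationAutomaton(int k) {
    Automaton a;
    a.next.assign(1, std::vector<int>(k, -1));  // state 0: empty string
    a.danger.assign(1, 0);

    // 1. Insert all k! permutations into the trie.
    std::vector<int> perm(k);
    std::iota(perm.begin(), perm.end(), 0);
    do {
        int cur = 0;
        for (int c : perm) {
            if (a.next[cur][c] == -1) {
                a.next[cur][c] = (int)a.next.size();
                a.next.push_back(std::vector<int>(k, -1));
                a.danger.push_back(0);
            }
            cur = a.next[cur][c];
        }
        a.danger[cur] = 1;
    } while (std::next_permutation(perm.begin(), perm.end()));

    // 2. BFS: fill the missing transitions through the suffix states.
    std::vector<int> sfx(a.next.size(), 0);
    std::queue<int> que;
    for (int c = 0; c < k; ++c) {
        if (a.next[0][c] == -1) a.next[0][c] = 0;
        else que.push(a.next[0][c]);  // suffix state of a root child is the root
    }
    while (!que.empty()) {
        int cur = que.front();
        que.pop();
        assert(!a.danger[sfx[cur]]);  // all patterns have the same length
        if (a.danger[cur]) {
            for (int c = 0; c < k; ++c) a.next[cur][c] = cur;  // absorbing
            continue;
        }
        for (int c = 0; c < k; ++c) {
            int child = a.next[cur][c];
            if (child == -1) {
                a.next[cur][c] = a.next[sfx[cur]][c];
            } else {
                sfx[child] = a.next[sfx[cur]][c];
                que.push(child);
            }
        }
    }
    return a;
}

// Number of strings of length n over k letters that never reach an invalid
// state. Plain step-by-step counting, only meant for small n.
unsigned long long countValid(const Automaton& a, int k, int n) {
    std::vector<unsigned long long> x(a.next.size(), 0), y;
    x[0] = 1;
    for (int step = 0; step < n; ++step) {
        y.assign(a.next.size(), 0);
        for (std::size_t s = 0; s < a.next.size(); ++s)
            if (x[s] != 0 && !a.danger[s])
                for (int c = 0; c < k; ++c) y[a.next[s][c]] += x[s];
        x.swap(y);
    }
    unsigned long long total = 0;
    for (std::size_t s = 0; s < a.next.size(); ++s)
        if (!a.danger[s]) total += x[s];
    return total;
}
```

For k = 7 the automaton has 13700 states, and countValid(a, 7, 7) gives 7^7 - 7! = 818503, since the only invalid strings of length 7 are the permutations themselves.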
Then construct the tran matrix. Matrix exponentiation by squaring yields x[1e12]; add up its valid components.
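Exponentiation by squaring for matrices can be sketched as follows. The names matMul and matPow are my own, and the sketch assumes every entry of the input matrix is already reduced modulo 1e9 so that the 64-bit products cannot overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Matrix = std::vector<std::vector<std::uint64_t>>;
const std::uint64_t MOD = 1000000000ULL;

// Product of two n x n matrices modulo MOD; this is the O(n^3) step.
Matrix matMul(const Matrix& a, const Matrix& b) {
    assert(a.size() == b.size());
    std::size_t n = a.size();
    Matrix c(n, std::vector<std::uint64_t>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            if (a[i][k] == 0) continue;  // skip zero entries
            for (std::size_t j = 0; j < n; ++j)
                c[i][j] = (c[i][j] + a[i][k] * b[k][j]) % MOD;
        }
    return c;
}

// base^e modulo MOD, using about log2(e) squarings.
Matrix matPow(Matrix base, std::uint64_t e) {
    std::size_t n = base.size();
    Matrix result(n, std::vector<std::uint64_t>(n, 0));
    for (std::size_t i = 0; i < n; ++i) result[i][i] = 1;  // identity
    while (e > 0) {
        if (e & 1) result = matMul(result, base);
        base = matMul(base, base);
        e >>= 1;
    }
    return result;
}
```

Because x[0] = (1, 0, 0, ...)', the vector x[n] is simply the first column of matPow(tran, n). As a sanity check, matPow({{1, 1}, {1, 0}}, 10) is {{89, 55}, {55, 34}}, the usual Fibonacci matrix identity.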
However, the automaton has 13700 states, and O(n^3) matrix multiplication at that size is not feasible on an ordinary machine. (For reference, a computation of 6.5204*10^12 basic operations takes 5 hours on my machine. Assume each basic operation of the matrix multiplication costs as much as one basic operation of that reference computation. A single matrix multiplication, 13700^3 or about 2.57*10^12 operations, is then about 40% of the reference, say 2 hours. Since 1e12 has about 40 binary digits, the total comes to roughly 80 hours. Taking into account the differences between the two kinds of basic operation, as well as cache behavior, the real time would be even worse.)
Because the directly constructed automaton has too many states, this approach is not feasible.
Trying to reduce the number of states of the automaton:
I tried three methods to reduce the number of states.
The first is the classic automaton minimization algorithm.
belong[i] is the state that state i is mapped to after minimization. Initially, belong is 2 for invalid states and 1 for valid states.
Then loop indefinitely:
map<vector<int>, vector<int> > mem; records the transitions of the current round.
For state i, the key is the vector whose first component is belong[i] and whose other components are belong[trie[i][0]], belong[trie[i][1]], ..., belong[trie[i][6]]. The value is a vector containing i. In this way all states are distinguished by their keys.
Then traverse the map. For each key, assign the same belong value to all states in its value.
If the number of states is the same before and after the assignment, the algorithm terminates.
for (int i = 1; i < top; ++i)
  if (danger[i] == 0) {
    belong[i] = 1;
  } else {
    belong[i] = 2;
  }
int size = 2;
for (;;) {
  map<vector<int>, vector<int> > mem;
  for (int i = 1; i < top; ++i) {
    vector<int> key;  // own class, then the classes of the 7 successors
    key.push_back(belong[i]);
    for (int j = 0; j < 7; ++j) key.push_back(belong[trie[i][j]]);
    mem[key].push_back(i);
  }
  int id = 1;
  for (auto& it : mem) {
    for (auto& s : it.second) belong[s] = id;
    ++id;
  }
  if ((int)mem.size() == size) break;  // no class was split in this round
  size = mem.size();
}
The algorithm is actually very fast. The DFA course on Coursera also presents an algorithm, whose complexity is the square of the number of input states. Its advantage is a definite complexity bound; its disadvantage is that quadratic time is too slow here, so the iteration above works better in practice.
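The same refinement loop can be written as a self-contained function and checked on a toy automaton. This is only a sketch: the name minimize and the 0-based numbering are my own, and it assumes a complete transition table in which every state is reachable.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Partition refinement: returns, for each state, the index of its class.
// next[s][c] is a complete transition table; valid[s] marks valid states.
std::vector<int> minimize(const std::vector<std::vector<int>>& next,
                          const std::vector<bool>& valid) {
    assert(next.size() == valid.size());
    std::size_t n = next.size();
    std::vector<int> belong(n);
    for (std::size_t s = 0; s < n; ++s) belong[s] = valid[s] ? 0 : 1;
    std::size_t classes = 0;
    for (;;) {
        // Key: own class followed by the classes of all successors.
        std::map<std::vector<int>, std::vector<int>> mem;
        for (std::size_t s = 0; s < n; ++s) {
            std::vector<int> key{belong[s]};
            for (int to : next[s]) key.push_back(belong[to]);
            mem[key].push_back((int)s);
        }
        int id = 0;
        for (const auto& entry : mem) {
            for (int s : entry.second) belong[s] = id;
            ++id;
        }
        if (mem.size() == classes) break;  // no class was split
        classes = mem.size();
    }
    return belong;
}
```

For example, with the table {{3, 1}, {3, 2}, {2, 4}, {0, 1}, {4, 2}} and states 2 and 4 invalid, states 0 and 3 end up in one class and states 2 and 4 in another, leaving 3 classes in total. Because each key starts with the state's own class, a round can only split classes, never merge them, so an unchanged class count means the partition is stable.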
After minimization the number of states is 8661, which is still not feasible.
The second method is to merge some states while constructing the automaton.
Since all leaf nodes of the trie are invalid, they can be treated as a single node when building the trie, after which the minimization algorithm is applied. The number of states after minimization is still 8661. If anything, this confirms that 8661 is the minimum number of states for this automaton, and to some extent it suggests that the minimization algorithm I wrote is probably correct.
Finally, we construct the automaton directly, without going through the trie, so that the conditions of the problem can be fully exploited to reduce the states.
Consider a state corresponding to a string s of length l. On input i, if i does not appear in s, append i to s and move to the state of the new string. If i appears in s at position pos, take the part of s that starts at position pos + 1, append i to it, and move to the state of that string.
struct Pt {
  int s;          // the state in the automaton
  vector<int> v;  // the string corresponding to this state
};
queue<Pt> last;
map<vector<int>, int> mem;  // key: a string, value: the corresponding state
// put the initial state into last: Pt.s = 1, Pt.v is empty
// mark state 1 as visited
// add the empty string to mem
for (int len = 0; len <= 5; ++len) {
  queue<Pt> next;
  // for each state now in last, enumerate the input letter i:
  //   if i does not appear in now.v, the transition leads to a string B;
  //     if B already exists, look up its state; otherwise allocate a new
  //     state and push it into next; then record the transition
  //   if i appears in now.v, a string B can be constructed as well;
  //     B must already exist; record the transition
  last.swap(next);
}
// last now holds the states of length 6; handle the transitions from the
// states of length 6 to the state of length 7
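A runnable version of this direct construction might look as follows. It is a sketch, not the original code: it uses one queue instead of the level-by-level last/next pair, keeps the alphabet size k as a parameter, reserves state 0 for the single invalid state and state 1 for the empty string, and the names DirectAutomaton, buildDirect and accepts are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <queue>
#include <vector>

struct DirectAutomaton {
    std::vector<std::vector<int>> next;  // next[state][letter]
    int invalid;                         // index of the single invalid state
};

// A state is the longest suffix of the input that has no repeated letter.
// Once that suffix would reach k letters, the automaton goes to `invalid`.
DirectAutomaton buildDirect(int k) {
    assert(k >= 1);
    DirectAutomaton a;
    a.invalid = 0;
    a.next.push_back(std::vector<int>(k, 0));  // state 0 absorbs every letter

    std::map<std::vector<int>, int> mem;  // string -> state index
    std::queue<std::vector<int>> todo;    // strings whose edges are not built yet
    auto stateOf = [&](const std::vector<int>& s) {
        auto it = mem.find(s);
        if (it != mem.end()) return it->second;
        int id = (int)a.next.size();
        mem[s] = id;
        a.next.push_back(std::vector<int>(k, -1));
        todo.push(s);
        return id;
    };
    stateOf({});  // state 1: the empty string

    while (!todo.empty()) {
        std::vector<int> s = todo.front();
        todo.pop();
        int cur = mem[s];
        for (int c = 0; c < k; ++c) {
            auto pos = std::find(s.begin(), s.end(), c);
            std::vector<int> b;
            if (pos == s.end()) b = s;        // c is new: keep all of s
            else b.assign(pos + 1, s.end());  // keep what follows the old c
            b.push_back(c);
            int to = ((int)b.size() == k) ? a.invalid : stateOf(b);
            a.next[cur][c] = to;  // assigned after stateOf may have grown a.next
        }
    }
    return a;
}

// True if no k consecutive letters of `word` are pairwise distinct.
bool accepts(const DirectAutomaton& a, const std::vector<int>& word) {
    int cur = 1;  // start in the empty-string state
    for (int c : word) cur = a.next[cur][c];
    return cur != a.invalid;
}
```

For k = 7 this yields 8661 states, the invalid state included. With k = 3, the word 0 0 1 1 2 2 is accepted, while 0 0 1 2 is rejected because it ends in three distinct letters.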
Now let's look at the number of states for each string length:
0: 1
1: 1! * C(7,1) = P(7,1) = 7
2: 2! * C(7,2) = P(7,2) = 42
3: 3! * C(7,3) = P(7,3) = 210
4: 4! * C(7,4) = P(7,4) = 840
5: 5! * C(7,5) = P(7,5) = 2520
6: 6! * C(7,6) = P(7,6) = 5040
7: 1
1 + 7 + 42 + 210 + 840 + 2520 + 5040 + 1 = 8661
If the strings of length 7 are not given any special treatment:
1 + 7 + 42 + 210 + 840 + 2520 + 5040 + 5040 = 13700
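The two totals are easy to recompute. A small sketch (the function names are mine, and n stands for the alphabet size, 7 in this problem):

```cpp
#include <cassert>

// P(n, k) = n! / (n - k)!: the number of strings of k distinct letters
// chosen from n letters.
long long fallingFactorial(int n, int k) {
    assert(0 <= k && k <= n);
    long long p = 1;
    for (int i = 0; i < k; ++i) p *= (n - i);
    return p;
}

// Number of automaton states over n letters. With mergeInvalid, all strings
// of length n collapse into one invalid state; otherwise each of them is a
// separate state.
long long stateCount(int n, bool mergeInvalid) {
    long long total = 0;
    for (int len = 0; len < n; ++len) total += fallingFactorial(n, len);
    return total + (mergeInvalid ? 1 : fallingFactorial(n, n));
}
```

stateCount(7, true) is 8661 and stateCount(7, false) is 13700, matching the two sums.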
All three methods yield the same result, which also confirms that 8661 is a lower bound on the number of states of this automaton. It shows that minimization only merges the invalid nodes. The third method, which gives the automaton directly, further illustrates what the AC automaton built from the trie really represents.
[In fact, before implementing this third method I had misestimated the number of states by leaving out the combination factor: I counted only 720 states of length 6, which made the total look small enough for an n^3 matrix multiplication to be feasible.]
The correct solution:
After hitting the wall above, the idea suddenly came to me: I deleted most of the code, kept a small part of it, and solved the problem with fewer than 10 additional lines. In keeping with Project Euler's rules, the details are not given here.