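If you have ever been interested in writing a Sudoku solver, you have probably heard of the exact cover problem. Given a set X and a collection Y of subsets of X, the task is to find a subcollection Y* of Y that forms a partition of X, that is, one in which every element of X belongs to exactly one member of Y*.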
Here is an example of the problem written in Python.
X = {1, 2, 3, 4, 5, 6, 7}
Y = {
    'A': [1, 4, 7],
    'B': [1, 4],
    'C': [4, 5, 7],
    'D': [3, 5, 6],
    'E': [2, 3, 6, 7],
    'F': [2, 7]}
The unique solution in this case is ['B', 'D', 'F'].
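To make the definition concrete, here is a minimal sketch of a checker that tests whether a chosen list of row names partitions X. The function name is_exact_cover is hypothetical and not part of the solver; it assumes every row in Y is non-empty.

```python
def is_exact_cover(X, Y, chosen):
    """Return True if the rows named in `chosen` partition X.

    Hypothetical helper for illustration; assumes rows in Y are non-empty.
    """
    covered = []
    for name in chosen:
        covered.extend(Y[name])
    # A partition covers every element exactly once:
    # no element appears twice, and no element is missing.
    return len(covered) == len(set(covered)) and set(covered) == set(X)


X = {1, 2, 3, 4, 5, 6, 7}
Y = {
    'A': [1, 4, 7],
    'B': [1, 4],
    'C': [4, 5, 7],
    'D': [3, 5, 6],
    'E': [2, 3, 6, 7],
    'F': [2, 7]}

print(is_exact_cover(X, Y, ['B', 'D', 'F']))  # True
print(is_exact_cover(X, Y, ['A', 'D', 'F']))  # False: 7 is covered twice
print(is_exact_cover(X, Y, ['B', 'D']))       # False: 2 and 7 are missing
```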
The exact cover problem is NP-complete (meaning that no polynomial-time algorithm for it is known). Algorithm X was invented by Donald Knuth to solve it. He also proposed an efficient implementation technique called Dancing Links, which uses doubly linked lists to represent the matrix of the problem.
However, Dancing Links can be quite tedious to implement and is hard to get right. That is where the magic of Python comes in! One day I decided to write Algorithm X in Python, and I came up with an interesting variation on Dancing Links.
The algorithm
The main idea is to use dictionaries instead of doubly linked lists to represent the matrix. We already have Y, which gives quick access to the columns in each row. We still need the reverse lookup, that is, quick access from each column to the rows that contain it. To get it, we convert X into a dictionary. For the example above, it is written as
X = {
    1: {'A', 'B'},
    2: {'E', 'F'},
    3: {'D', 'E'},
    4: {'A', 'B', 'C'},
    5: {'C', 'D'},
    6: {'D', 'E'},
    7: {'A', 'C', 'E', 'F'}}
A sharp-eyed reader will notice that this is slightly different from how we represented Y. Indeed, we need to be able to quickly remove rows from and add rows to each column, which is why we use sets. The columns in Y, on the other hand, never change (Knuth does not mention this, but in fact all rows remain unchanged throughout the algorithm), so plain lists are enough there.
The following is the code for the algorithm.
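def solve(X, Y, solution=[]):  # shared default list: run the generator to exhaustion
    if not X:
        yield list(solution)  # every column is covered: report a copy
    else:
        c = min(X, key=lambda c: len(X[c]))  # column with the fewest candidate rows
        for r in list(X[c]):
            solution.append(r)
            cols = select(X, Y, r)
            for s in solve(X, Y, solution):
                yield s
            deselect(X, Y, r, cols)  # undo the choice before trying the next row
            solution.pop()

def select(X, Y, r):
    cols = []
    for j in Y[r]:
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].remove(i)  # row i conflicts with r: hide it
        cols.append(X.pop(j))  # column j is now covered
    return cols

def deselect(X, Y, r, cols):
    for j in reversed(Y[r]):  # restore in the reverse order of select
        X[j] = cols.pop()
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].add(i)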
That is only 30 lines!
Formatting the input
Before solving an actual problem, we need to convert the input into the format described above. We could simply use
X = {j: set(filter(lambda i: j in Y[i], Y)) for j in X}
but that is too slow. If we denote the size of X by m and the size of Y by n, the number of iterations is m * n. For Sudoku with a grid of size N, this would require on the order of N^5 operations. We can do better.
X = {j: set() for j in X}
for i in Y:
    for j in Y[i]:
        X[j].add(i)
This still has O(m * n) complexity, but only in the worst case. On average it performs much better, because it does not have to iterate over all the empty cells of the matrix. In the case of Sudoku, there are exactly 4 entries in each row of the matrix, regardless of the size, so it has N^3 complexity.
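The two conversions can be compared side by side on the small example. In this sketch, invert and invert_slow are hypothetical names chosen for illustration; both take the set X and the dictionary Y and return the column-to-rows dictionary.

```python
def invert_slow(X, Y):
    """Test every (column, row) pair: always m * n membership checks."""
    return {j: {i for i in Y if j in Y[i]} for j in X}


def invert(X, Y):
    """Visit only the entries that exist: one step per item in each row."""
    columns = {j: set() for j in X}
    for i, row in Y.items():
        for j in row:
            columns[j].add(i)
    return columns


X = {1, 2, 3, 4, 5, 6, 7}
Y = {
    'A': [1, 4, 7],
    'B': [1, 4],
    'C': [4, 5, 7],
    'D': [3, 5, 6],
    'E': [2, 3, 6, 7],
    'F': [2, 7]}

columns = invert(X, Y)
print(columns == invert_slow(X, Y))  # True: both give the same dictionary
print(sorted(columns[7]))            # ['A', 'C', 'E', 'F']
```

Here the fast version performs 17 insertions, one per entry in Y, while the slow version checks all 7 * 6 = 42 pairs.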
Advantages
- Simple: there is no need to construct complex data structures; everything used is provided by Python.
- Readable: the first example above is transcribed directly from the Wikipedia example!
- Flexible: it can easily be extended to solve Sudoku.
Solving Sudoku
All we need to do is formulate Sudoku as an exact cover problem. Here is the complete code of the Sudoku solver, which works with any size, 3x3, 5x5, even 2x3, in fewer than 100 lines, doctests included! (Thanks to Winfried Plappert and David Goodger for their comments and suggestions.)