Problem Overview
There are six types of operations: copy, replace, delete, insert, twiddle (swap), and kill.
Any combination of these six operations, with repetition allowed, forms an operation sequence.
The input to an operation sequence is a string, and its output is another string.
For example, a suitable sequence of operations transforms the string algorithm into the string altruistic.
There can be many ways to transform a string src into another string dst.
The problem asks, for given src and dst, for an optimal operation sequence: one that transforms src into dst using the fewest operations. Output this sequence. The number of operations in it is the edit distance from src to dst.
The six operations are defined below in terms of two indices, i and j:
i is the index into src, and j is the index into dst. Before any operation is applied, i = j = 0.
Let x be an arbitrary character.
copy: dst[j] = src[i], i++, j++
replace: dst[j] = x, i++, j++
delete: i++
insert: dst[j] = x, j++
twiddle: dst[j] = src[i+1], dst[j+1] = src[i], i += 2, j += 2
kill: discards the remaining characters of src. If this operation is performed, it must be the last operation.
Approach
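Before designing the DP, it helps to pin down what each operation does. The definitions above can be simulated with a small Python sketch. The function name apply_operations and the tuple encoding of operations are assumptions made for illustration, and EXAMPLE_OPS is just one sequence that turns algorithm into altruistic, not necessarily the one the original example showed.

```python
def apply_operations(src, ops):
    """Apply a list of operations to src and return the resulting string.

    Hypothetical encoding: ("copy",), ("replace", x), ("delete",),
    ("insert", x), ("twiddle",), ("kill",).
    """
    i = 0      # index into src
    dst = []   # output characters; j is implicitly len(dst)
    for k, op in enumerate(ops):
        name = op[0]
        if name == "copy":
            dst.append(src[i])
            i += 1
        elif name == "replace":
            dst.append(op[1])
            i += 1
        elif name == "delete":
            i += 1
        elif name == "insert":
            dst.append(op[1])
        elif name == "twiddle":
            # Emit the next two source characters in swapped order.
            dst.append(src[i + 1])
            dst.append(src[i])
            i += 2
        elif name == "kill":
            if k != len(ops) - 1:
                raise ValueError("kill must be the last operation")
            i = len(src)  # discard whatever is left of src
        else:
            raise ValueError("unknown operation: %r" % (name,))
    if i != len(src):
        raise ValueError("src was not fully consumed")
    return "".join(dst)


# One sequence (an assumption, chosen for illustration) for the example pair.
EXAMPLE_OPS = [
    ("copy",), ("copy",), ("replace", "t"), ("delete",), ("copy",),
    ("insert", "u"), ("insert", "i"), ("insert", "s"),
    ("twiddle",), ("insert", "c"), ("kill",),
]
```

Calling apply_operations("algorithm", EXAMPLE_OPS) returns "altruistic" using 11 operations; whether 11 is optimal is exactly what the DP below has to decide.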
For a DP problem, the first thing to get clear is the state and its initialization. The way i and j are incremented in the problem statement gives a good hint.
Define c[i,j] as the edit distance for transforming the prefix src[0,i) (the first i characters of src) into the prefix dst[0,j) (the first j characters of dst).
Then c[src.length(), dst.length()] is the edit distance from src to dst, before the kill operation is taken into account (see step (4)).
Here we only want the fewest operations; among sequences with the same number of operations, it makes no difference which operations are used. In other words, every operation has the same cost: cost(operation) = 1.
(1) Initialization
c[0,0] = 0
c[i,0] = i * cost(delete)
c[0,j] = j * cost(insert)
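The initialization above can be written as a short Python sketch. The name init_table and its parameters are hypothetical; the costs default to 1 as stated in the text.

```python
def init_table(m, n, cost_delete=1, cost_insert=1):
    """Build an (m+1) x (n+1) table with only the base cases filled in.

    m and n are the lengths of src and dst. Interior cells are left as 0
    here and are meant to be overwritten by the recurrence.
    """
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        c[i][0] = i * cost_delete   # delete the first i characters of src
    for j in range(1, n + 1):
        c[0][j] = j * cost_insert   # insert the first j characters of dst
    return c
```

The table has one more row and column than the string lengths because c[i,j] refers to prefixes of length i and j, including the empty prefix.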
(2) Recurrence
Let c[i,j] = a. According to the rule for each operation, we can reason forward as follows:
copy: dst[j] = src[i], i++, j++ ==> c[i+1, j+1] = a + cost(copy)
replace: dst[j] = x, i++, j++ ==> c[i+1, j+1] = a + cost(replace)
delete: i++ ==> c[i+1, j] = a + cost(delete)
insert: dst[j] = x, j++ ==> c[i, j+1] = a + cost(insert)
twiddle: dst[j] = src[i+1], dst[j+1] = src[i], i += 2, j += 2 ==> c[i+2, j+2] = a + cost(twiddle)
kill: discards the remaining characters of src. If this operation is performed, it must be the last operation, so it is handled separately in step (4).
(3) Working backward
The reasoning above uses the current result to predict later results.
In DP, however, the current result is not used to determine later results; instead, the current result must be derived from earlier results.
So the direction of thinking has to be reversed and the recurrence derived backward. This step is the hard part of DP.
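Reversing the forward rules means that c[i,j] is the minimum over every earlier state that can reach (i,j) in one operation. A minimal Python sketch of that backward recurrence follows; the function name edit_table and the cost dictionary are assumptions, and kill is deliberately left out because it is treated as a separate final step.

```python
def edit_table(src, dst, cost=None):
    """Fill c[i][j] for all prefixes using the backward form of the rules.

    cost maps operation names to costs; by default every cost is 1.
    """
    if cost is None:
        cost = {op: 1 for op in
                ("copy", "replace", "delete", "insert", "twiddle")}
    m, n = len(src), len(dst)
    inf = float("inf")
    c = [[inf] * (n + 1) for _ in range(m + 1)]
    c[0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            best = inf
            if i >= 1 and j >= 1:
                # copy is only possible when the characters agree
                if src[i - 1] == dst[j - 1]:
                    best = min(best, c[i - 1][j - 1] + cost["copy"])
                # replace can write any character x, so it always applies
                best = min(best, c[i - 1][j - 1] + cost["replace"])
            if i >= 1:
                best = min(best, c[i - 1][j] + cost["delete"])
            if j >= 1:
                best = min(best, c[i][j - 1] + cost["insert"])
            # twiddle: the last two characters appear swapped in dst
            if (i >= 2 and j >= 2
                    and src[i - 2] == dst[j - 1]
                    and src[i - 1] == dst[j - 2]):
                best = min(best, c[i - 2][j - 2] + cost["twiddle"])
            c[i][j] = best
    return c
```

The base cases c[i,0] and c[0,j] fall out of the same loop, since only delete (or only insert) is applicable along the first column (or row). Note that with every cost equal to 1, copying a character also counts as one operation.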
(4) The final operation: kill
edit distance = min(c[m,n], min over 0 <= i < m of (c[i,n] + cost(kill))), where m = src.length() and n = dst.length()
Code
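Putting steps (1) to (4) together, the following Python sketch returns both the edit distance and one optimal operation sequence. It is an illustration under the unit-cost assumption stated earlier, not the author's original source; the name edit_sequence, the back-pointer table, and the tuple encoding of operations are all hypothetical.

```python
def edit_sequence(src, dst):
    """Return (distance, ops) for unit-cost operations, including kill."""
    m, n = len(src), len(dst)
    inf = float("inf")
    c = [[inf] * (n + 1) for _ in range(m + 1)]
    back = [[None] * (n + 1) for _ in range(m + 1)]  # (prev_i, prev_j, op)
    c[0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            candidates = []
            if i >= 1 and j >= 1:
                if src[i - 1] == dst[j - 1]:
                    candidates.append((i - 1, j - 1, ("copy",)))
                else:
                    candidates.append(
                        (i - 1, j - 1, ("replace", dst[j - 1])))
            if i >= 1:
                candidates.append((i - 1, j, ("delete",)))
            if j >= 1:
                candidates.append((i, j - 1, ("insert", dst[j - 1])))
            if (i >= 2 and j >= 2
                    and src[i - 2] == dst[j - 1]
                    and src[i - 1] == dst[j - 2]):
                candidates.append((i - 2, j - 2, ("twiddle",)))
            for pi, pj, op in candidates:
                if c[pi][pj] + 1 < c[i][j]:
                    c[i][j] = c[pi][pj] + 1
                    back[i][j] = (pi, pj, op)

    # Step (4): either finish normally at (m, n), or finish dst at some
    # (i, n) with i < m and spend one kill on the rest of src.
    best, end_i, last = c[m][n], m, None
    for i in range(m):
        if c[i][n] + 1 < best:
            best, end_i, last = c[i][n] + 1, i, ("kill",)

    # Follow the back pointers from the chosen end state to (0, 0).
    ops = [] if last is None else [last]
    i, j = end_i, n
    while (i, j) != (0, 0):
        pi, pj, op = back[i][j]
        ops.append(op)
        i, j = pi, pj
    ops.reverse()
    return best, ops
```

For instance, edit_sequence("ab", "ba") returns (1, [("twiddle",)]). Because copy and replace cost the same here, the sketch only offers replace when the characters differ; with non-uniform costs both candidates would need to be considered.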
Source code
DNA alignment problem
The DNA alignment problem can be regarded as a weighted version of the edit distance problem.
In the edit distance problem, all operations are treated as equal, i.e. cost(operation) = 1.
If each operation is instead given its own weight, and the minimum is changed to a maximum, the result is the DNA alignment problem. The weights are:
copy: 1
replace: -1
insert: -2
delete: removed
twiddle: removed
kill: removed
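The weighted variant can be sketched in Python by switching min to max and reading the weights from a table. The function name alignment_score and the score dictionary are assumptions for illustration. By default the sketch follows the list above literally, so delete, twiddle and kill are unavailable; passing a "delete" entry in score is an extra assumption that is not taken from the list.

```python
def alignment_score(src, dst, score=None):
    """Return the maximum total score for turning src into dst.

    Operations missing from score are treated as not allowed. The result
    is float("-inf") when no allowed sequence can produce dst.
    """
    if score is None:
        score = {"copy": 1, "replace": -1, "insert": -2}
    m, n = len(src), len(dst)
    neg = float("-inf")
    s = [[neg] * (n + 1) for _ in range(m + 1)]
    s[0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            best = neg
            if i >= 1 and j >= 1:
                same = src[i - 1] == dst[j - 1]
                if same and "copy" in score:
                    best = max(best, s[i - 1][j - 1] + score["copy"])
                if not same and "replace" in score:
                    best = max(best, s[i - 1][j - 1] + score["replace"])
            if i >= 1 and "delete" in score:
                best = max(best, s[i - 1][j] + score["delete"])
            if j >= 1 and "insert" in score:
                best = max(best, s[i][j - 1] + score["insert"])
            s[i][j] = best
    return s[m][n]
```

With the default table, alignment_score("AC", "AGC") is 0 (copy, insert, copy), while alignment_score("AGC", "AC") is negative infinity because nothing can consume the extra source character without delete.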
Original Question