Introduction to Algorithms 15-3: Edit Distance

Source: Internet
Author: User
Problem Overview

There are six kinds of operations: copy, replace, delete, insert, twiddle (swap), and kill.
Any combination of these six operations (with or without repetition) forms an operation sequence.
The input to an operation sequence is a string, and its output is another string.
For example, a suitable sequence of these operations turns the string algorithm into the string altruistic.
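
To make the example concrete, here is a minimal Python sketch that replays one operation sequence on algorithm. The helper name apply_ops and the (name, character) encoding of operations are illustrative assumptions, not part of the problem statement; the operation semantics follow the index-based rules the problem gives for each operation. The sequence shown is one that works, not necessarily the only one.

```python
def apply_ops(src, ops):
    """Replay (name, char) operations on src and return the string produced."""
    out = []
    i = 0  # index into src; the index into the output is len(out)
    for k, (name, ch) in enumerate(ops):
        if name == "copy":
            out.append(src[i])
            i += 1
        elif name == "replace":
            out.append(ch)
            i += 1
        elif name == "delete":
            i += 1
        elif name == "insert":
            out.append(ch)
        elif name == "twiddle":
            # emit the next two source characters in swapped order
            out.append(src[i + 1])
            out.append(src[i])
            i += 2
        elif name == "kill":
            if k != len(ops) - 1:
                raise ValueError("kill must be the last operation")
            i = len(src)  # discard whatever is left of src
        else:
            raise ValueError("unknown operation: " + name)
    if i != len(src):
        raise ValueError("source string was not fully consumed")
    return "".join(out)


ops = [
    ("copy", None), ("copy", None), ("replace", "t"), ("delete", None),
    ("copy", None), ("insert", "u"), ("insert", "i"), ("insert", "s"),
    ("twiddle", None), ("insert", "c"), ("kill", None),
]
assert apply_ops("algorithm", ops) == "altruistic"
```

This sequence uses 11 operations; whether a shorter one exists is exactly what the rest of the problem is about.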

There can be many ways to turn a string src into another string dst.
The problem asks, for a given src and dst, to find an optimal operation sequence, that is, one that turns src into dst with the fewest operations, and to output that sequence. The number of operations in this sequence is the edit distance from src to dst.

The six operations are defined in detail below in terms of two indices, i and j:
i is the index into src, and j is the index into dst. Before any operation is performed, i = j = 0.
Let x be an arbitrary character.
copy: dst[j] = src[i], i++, j++
replace: dst[j] = x, i++, j++
delete: i++
insert: dst[j] = x, j++
twiddle: dst[j] = src[i+1], dst[j+1] = src[i], i += 2, j += 2
kill: discards the remaining characters of src. If this operation is performed, it must be the last operation.

Approach

For a DP problem, the first things to settle are the initialization and the recurrence. The way i and j are incremented in the problem statement gives a good hint.
Define c[i,j] as the edit distance of turning the first i characters of src, src[0..i-1], into the first j characters of dst, dst[0..j-1].
So c[src.length(), dst.length()] is the edit distance from src to dst (the kill operation is handled separately in step (4)).
Here we only want the fewest operations; for the same number of operations it makes no difference which operations are used. In other words, every operation has the same cost: cost(operation) = 1.

(1) Initialization

c[0,0] = 0
c[i,0] = i * cost(delete)
c[0,j] = j * cost(insert)

(2) Recurrence

Let c[i,j] = a. From the rule of each operation we get the following candidate values:

copy: dst[j] = src[i], i++, j++ ==> c[i+1, j+1] = a + cost(copy)
replace: dst[j] = x, i++, j++ ==> c[i+1, j+1] = a + cost(replace)
delete: i++ ==> c[i+1, j] = a + cost(delete)
insert: dst[j] = x, j++ ==> c[i, j+1] = a + cost(insert)
twiddle: dst[j] = src[i+1], dst[j+1] = src[i], i += 2, j += 2 ==> c[i+2, j+2] = a + cost(twiddle)
kill: discards the remaining characters of src. If this operation is performed, it must be the last operation.
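
The forward rules above can be run almost literally: visit the cells row by row and let each finished cell push a candidate value into the cells it can reach. The following is a minimal sketch under stated assumptions. The function name forward_table is illustrative; copy is allowed only when src[i] equals dst[j], and twiddle only when the two swapped characters match dst, since otherwise the output would not be dst. Kill is left out here.

```python
INF = float("inf")


def forward_table(src, dst, cost=1):
    """Fill c by pushing each cell's value forward (kill is not handled)."""
    m, n = len(src), len(dst)
    c = [[INF] * (n + 1) for _ in range(m + 1)]
    c[0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            a = c[i][j]  # final by now: all of its predecessors were visited
            if i < m and j < n:
                if src[i] == dst[j]:  # copy needs matching characters
                    c[i + 1][j + 1] = min(c[i + 1][j + 1], a + cost)
                # replace is always possible
                c[i + 1][j + 1] = min(c[i + 1][j + 1], a + cost)
            if i < m:  # delete
                c[i + 1][j] = min(c[i + 1][j], a + cost)
            if j < n:  # insert
                c[i][j + 1] = min(c[i][j + 1], a + cost)
            if (i + 1 < m and j + 1 < n
                    and src[i] == dst[j + 1] and src[i + 1] == dst[j]):
                # twiddle: two source characters appear swapped in dst
                c[i + 2][j + 2] = min(c[i + 2][j + 2], a + cost)
    return c


assert forward_table("ab", "ba")[2][2] == 1    # a single twiddle
assert forward_table("abc", "abc")[3][3] == 3  # three copies, one unit each
```

Row-major order is enough because every rule moves to a cell with a larger i, or the same i and a larger j.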
(3) Working backwards

The reasoning above uses the current result to predict later results.
But the current result does not determine the later results; rather, the current result must come from earlier results.
So we have to reverse the direction of thinking and derive the recurrence backwards. This step is the hard part of DP.
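
Written out, the backward form takes each forward rule and shifts its indices so that c[i,j] sits on the left-hand side. The block below is a sketch of that recurrence (it uses the cases environment and \text from amsmath). The side conditions on copy and twiddle are an assumption read off from the operation definitions: the characters written must agree with dst.

```latex
\[
c[i,j] = \min
\begin{cases}
c[i-1,j-1] + \mathrm{cost}(\text{copy})
  & \text{if } i,j \ge 1 \text{ and } \mathit{src}[i-1] = \mathit{dst}[j-1] \\
c[i-1,j-1] + \mathrm{cost}(\text{replace})
  & \text{if } i,j \ge 1 \\
c[i-1,j] + \mathrm{cost}(\text{delete})
  & \text{if } i \ge 1 \\
c[i,j-1] + \mathrm{cost}(\text{insert})
  & \text{if } j \ge 1 \\
c[i-2,j-2] + \mathrm{cost}(\text{twiddle})
  & \text{if } i,j \ge 2,\ \mathit{src}[i-2] = \mathit{dst}[j-1],\ \mathit{src}[i-1] = \mathit{dst}[j-2]
\end{cases}
\]
```

For example, the twiddle case comes from substituting i-2 for i and j-2 for j in the forward rule dst[j] = src[i+1], dst[j+1] = src[i].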
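(4) The final operation: kill

Kill discards whatever remains of src, so it can finish the transformation from any c[i,n] with i < m. With m = src.length() and n = dst.length():

edit distance = min(c[m,n], min(c[i,n] + cost(kill))), where 0 <= i < m

Code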
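
The original source listing is not reproduced on this page. As a stand-in, here is a minimal sketch that puts the four steps together: it fills the table with the backward recurrence, applies the kill step at the end, and walks back-pointers to recover one optimal sequence. The name edit_distance, the cost dictionary, and the (name, character) encoding are illustrative assumptions.

```python
OPS = ("copy", "replace", "delete", "insert", "twiddle", "kill")


def edit_distance(src, dst, cost=None):
    """Return (distance, ops) for turning src into dst."""
    cost = cost or {op: 1 for op in OPS}
    m, n = len(src), len(dst)
    inf = float("inf")
    c = [[inf] * (n + 1) for _ in range(m + 1)]
    back = [[None] * (n + 1) for _ in range(m + 1)]  # (op, prev_i, prev_j)
    c[0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            cands = []
            if i and j:
                if src[i - 1] == dst[j - 1]:
                    cands.append((c[i - 1][j - 1] + cost["copy"],
                                  ("copy", None), i - 1, j - 1))
                cands.append((c[i - 1][j - 1] + cost["replace"],
                              ("replace", dst[j - 1]), i - 1, j - 1))
            if i:
                cands.append((c[i - 1][j] + cost["delete"],
                              ("delete", None), i - 1, j))
            if j:
                cands.append((c[i][j - 1] + cost["insert"],
                              ("insert", dst[j - 1]), i, j - 1))
            if (i >= 2 and j >= 2
                    and src[i - 2] == dst[j - 1] and src[i - 1] == dst[j - 2]):
                cands.append((c[i - 2][j - 2] + cost["twiddle"],
                              ("twiddle", None), i - 2, j - 2))
            if cands:  # empty only at (0, 0), which stays 0
                best = min(cands, key=lambda t: t[0])
                c[i][j], back[i][j] = best[0], best[1:]
    # Step (4): kill may end the sequence from any row i < m.
    dist, end_i, killed = c[m][n], m, False
    for i in range(m):
        if c[i][n] + cost["kill"] < dist:
            dist, end_i, killed = c[i][n] + cost["kill"], i, True
    # Follow the back-pointers to recover the operations in order.
    ops = [("kill", None)] if killed else []
    i, j = end_i, n
    while (i, j) != (0, 0):
        op, i, j = back[i][j]
        ops.append(op)
    ops.reverse()
    return dist, ops


print(edit_distance("algorithm", "altruistic"))
```

With unit costs, edit_distance("ab", "ba") returns (1, [("twiddle", None)]), and edit_distance("abcdef", "ab") returns a distance of 3 with kill as the last operation. The base cases c[i,0] and c[0,j] fall out of the delete and insert candidates, so they need no separate loop.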

DNA alignment problem

The DNA alignment problem can be viewed as an edit distance problem with weights.
In the edit distance problem all operations are treated alike, i.e. cost(operation) = 1.
If we now give each operation its own weight and take the maximum instead of the minimum, we get the DNA alignment problem.

copy: 1
replace: -1
insert: -2
delete: -2
twiddle: not used
kill: not used
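
With these weights the same table can be filled by maximizing instead of minimizing. The sketch below assumes that delete is scored like insert (-2), as in the textbook's formulation where any gap costs -2, and that twiddle and kill are simply unavailable. The function name alignment_score and its parameter names are illustrative.

```python
def alignment_score(x, y, match=1, mismatch=-1, gap=-2):
    """Best total score for aligning x with y under the given weights."""
    m, n = len(x), len(y)
    s = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        s[i][0] = i * gap  # delete every character of x[:i]
    for j in range(1, n + 1):
        s[0][j] = j * gap  # insert every character of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # copy when the characters agree, replace otherwise
            diag = s[i - 1][j - 1] + (match if x[i - 1] == y[j - 1] else mismatch)
            s[i][j] = max(diag,
                          s[i - 1][j] + gap,   # delete x[i-1]
                          s[i][j - 1] + gap)   # insert y[j-1]
    return s[m][n]


assert alignment_score("GATTACA", "GATTACA") == 7  # seven copies
assert alignment_score("AC", "A") == -1            # one copy, one delete
```

Apart from the weights and the switch from min to max, the only change from the edit distance table is that copy and replace are merged into a single diagonal step.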