Edit distance between strings

Source: Internet
Author: User

The similarity of two strings can be measured by the cost of converting one string into the other. Three edit operations are allowed: inserting a character, deleting a character, and replacing a character. The cost of a conversion is the number of edits it uses, and the edit distance is the smallest such cost.

For comparison, two approaches are shown: a plain recursive algorithm and a dynamic programming algorithm.

Plain recursive approach: the recursion is clear and very concise, but its time complexity is very high, because every mismatch branches into three further recursive calls.

// Edit distance between srcStr[srcStart..] and destSrc[descStart..]
public static int editDistance(String srcStr, String destSrc, int srcStart, int descStart) {
    // If either string has ended, the rest of the other must be inserted or deleted
    if ((srcStr.length() - srcStart) == 0 || (destSrc.length() - descStart) == 0) {
        return Math.abs((srcStr.length() - srcStart) - (destSrc.length() - descStart));
    }
    // Equal characters cost nothing: advance in both strings
    if (srcStr.charAt(srcStart) == destSrc.charAt(descStart)) {
        return editDistance(srcStr, destSrc, srcStart + 1, descStart + 1);
    }
    int edIns = editDistance(srcStr, destSrc, srcStart, descStart + 1) + 1;     // insert a character
    int edDel = editDistance(srcStr, destSrc, srcStart + 1, descStart) + 1;     // delete a character
    int edRep = editDistance(srcStr, destSrc, srcStart + 1, descStart + 1) + 1; // replace a character
    int min1 = edIns > edDel ? edDel : edIns;
    return min1 > edRep ? edRep : min1;
}

Now consider improving on this with dynamic programming. First define the state of the problem, then derive from the state transitions the recurrence that links each stage to its subproblems.

Suppose the problem is to find the minimum number of edits (the edit distance) needed to convert the characters source[1..n] into target[1..m]. A subproblem can then be defined as the minimum number of edits needed to convert source[1..i] into target[1..j]. The optimal answer to the full problem is built from optimal answers to these subproblems, which is the optimal substructure of this problem, so we define the state as the edit distance from the substring source[1..i] to the substring target[1..j].
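The state definition above can also be filled in bottom-up, which is not the form the article's own code takes. The following is only a minimal sketch under that assumption: the class name `EditDistanceTable` and the array name `dist` are hypothetical, and `dist[i][j]` holds the edit distance between the first i characters of the source and the first j characters of the target.

```java
class EditDistanceTable {
    // Bottom-up table fill: dist[i][j] is the edit distance between
    // source[1..i] and target[1..j] (1-based, as in the text).
    static int editDistance(String source, String target) {
        int n = source.length();
        int m = target.length();
        int[][] dist = new int[n + 1][m + 1];

        // Converting a prefix to the empty string takes one deletion per character,
        // and building a prefix from the empty string takes one insertion per character.
        for (int i = 0; i <= n; i++) dist[i][0] = i;
        for (int j = 0; j <= m; j++) dist[0][j] = j;

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                if (source.charAt(i - 1) == target.charAt(j - 1)) {
                    // Last characters match: no edit needed for them
                    dist[i][j] = dist[i - 1][j - 1];
                } else {
                    int insertion = dist[i][j - 1] + 1;
                    int deletion = dist[i - 1][j] + 1;
                    int substitution = dist[i - 1][j - 1] + 1;
                    dist[i][j] = Math.min(insertion, Math.min(deletion, substitution));
                }
            }
        }
        return dist[n][m];
    }

    public static void main(String[] args) {
        System.out.println(editDistance("kitten", "sitting")); // prints 3
    }
}
```

Each cell is computed once from three neighbours, so the running time is proportional to n × m.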

The plain recursive approach is so slow because a large number of states are computed over and over. To avoid this, we introduce a memo: the value of each state is recorded in a two-dimensional table, and the recursion looks in the table first before computing anything.

/* Improved recursive method with memo.
   tRecords must have (srcStr.length() + 1) x (destSrc.length() + 1) cells,
   each initialized with new TagMemoRecord(). */
public static int editDistanceBetter(String srcStr, String destSrc, int srcStart, int descStart,
                                     TagMemoRecord[][] tRecords) {
    // First look up the table
    if (tRecords[srcStart][descStart].refCount != 0) {
        tRecords[srcStart][descStart].refCount++;
        return tRecords[srcStart][descStart].distance;
    }
    int distance = 0;
    // If either string has ended, the rest of the other must be inserted or deleted
    if ((srcStr.length() - srcStart) == 0 || (destSrc.length() - descStart) == 0) {
        distance = Math.abs((srcStr.length() - srcStart) - (destSrc.length() - descStart));
    } else if (srcStr.charAt(srcStart) == destSrc.charAt(descStart)) {
        // Recurse through the memoized method so that the table is actually used
        distance = editDistanceBetter(srcStr, destSrc, srcStart + 1, descStart + 1, tRecords);
    } else {
        int edIns = editDistanceBetter(srcStr, destSrc, srcStart, descStart + 1, tRecords) + 1;     // insert a character
        int edDel = editDistanceBetter(srcStr, destSrc, srcStart + 1, descStart, tRecords) + 1;     // delete a character
        int edRep = editDistanceBetter(srcStr, destSrc, srcStart + 1, descStart + 1, tRecords) + 1; // replace a character
        int min1 = edIns > edDel ? edDel : edIns;
        distance = min1 > edRep ? edRep : min1;
    }
    tRecords[srcStart][descStart].distance = distance;
    tRecords[srcStart][descStart].refCount = 1;
    return distance;
}

class TagMemoRecord {
    int distance;
    int refCount; // 0 means this state has not been computed yet

    public TagMemoRecord() {
        this.distance = -1;
        this.refCount = 0;
    }
}
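To see what the memo saves, the two approaches can be run side by side while counting calls. This is a self-contained sketch, not the article's code: the class `MemoComparison`, the `calls` counter, and the `newMemo` helper are hypothetical, and a plain `int[][]` with -1 meaning "not computed" stands in for the record table.

```java
import java.util.Arrays;

class MemoComparison {
    static long calls; // hypothetical counter; reset it before each run

    // Plain recursion over the suffixes s[i..] and t[j..]
    static int plain(String s, String t, int i, int j) {
        calls++;
        if (i == s.length() || j == t.length()) {
            return (s.length() - i) + (t.length() - j); // at least one term is 0
        }
        if (s.charAt(i) == t.charAt(j)) {
            return plain(s, t, i + 1, j + 1);
        }
        int ins = plain(s, t, i, j + 1) + 1;
        int del = plain(s, t, i + 1, j) + 1;
        int rep = plain(s, t, i + 1, j + 1) + 1;
        return Math.min(ins, Math.min(del, rep));
    }

    // Same recursion, but each state (i, j) is computed only once
    static int memoized(String s, String t, int i, int j, int[][] memo) {
        calls++;
        if (memo[i][j] >= 0) {
            return memo[i][j]; // table hit: no further recursion
        }
        int d;
        if (i == s.length() || j == t.length()) {
            d = (s.length() - i) + (t.length() - j);
        } else if (s.charAt(i) == t.charAt(j)) {
            d = memoized(s, t, i + 1, j + 1, memo);
        } else {
            int ins = memoized(s, t, i, j + 1, memo) + 1;
            int del = memoized(s, t, i + 1, j, memo) + 1;
            int rep = memoized(s, t, i + 1, j + 1, memo) + 1;
            d = Math.min(ins, Math.min(del, rep));
        }
        memo[i][j] = d;
        return d;
    }

    // The table needs one extra row and column for the "string ended" states
    static int[][] newMemo(String s, String t) {
        int[][] memo = new int[s.length() + 1][t.length() + 1];
        for (int[] row : memo) {
            Arrays.fill(row, -1);
        }
        return memo;
    }

    public static void main(String[] args) {
        String s = "intention", t = "execution";
        calls = 0;
        int d1 = plain(s, t, 0, 0);
        long plainCalls = calls;
        calls = 0;
        int d2 = memoized(s, t, 0, 0, newMemo(s, t));
        long memoCalls = calls;
        System.out.println("distance " + d1 + " / " + d2
                + ", calls " + plainCalls + " vs " + memoCalls);
    }
}
```

Both methods return the same distance. The memoized run never makes more calls than the plain one, since a table hit cuts off the whole subtree of calls below it.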
