Algorithm for the longest common subsequence for small character sets

Source: Internet
Author: User

For two strings of lengths n and m, the standard dynamic programming algorithm for the longest common subsequence (LCS) runs in O(nm) time.
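For comparison, the textbook O(nm) dynamic program can be sketched as follows. This is only a baseline and is not part of the original article; the function name lcs_classic is chosen here for illustration, and two rolling rows are used to keep memory small.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Baseline O(n*m) LCS length with two rolling rows.
// lcs_classic is an illustrative name, not taken from the article.
int lcs_classic(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            if (a[i - 1] == b[j - 1])
                cur[j] = prev[j - 1] + 1;                 // extend a match
            else
                cur[j] = std::max(prev[j], cur[j - 1]);   // skip one character
        }
        std::swap(prev, cur);  // the row just computed becomes the previous row
    }
    return prev[b.size()];
}
```

Every cell of the n by m table is visited once, which is where the O(nm) bound comes from.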

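For a small character set, however, the complexity can be reduced to O(n^2 + km), where n is the length of the shorter string, m is the length of the longer string, and k is the size of the character set. This is much better than O(nm) when the two lengths differ greatly: with n = 1000, m = 10^6 and k = 26, nm is 10^9 while n^2 + km is about 2.7 * 10^7.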

Assume that all characters are lowercase letters, which satisfies the small-character-set premise. Let S1 be the shorter string and S2 the longer one. String indices start at 1.

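For every position in S2, the position of the first occurrence of each character to its right can be precomputed in O(km) time, where k is the number of characters in the character set and m is the length of the longer string.

Let Next[i][j] denote the position of the first occurrence of the character (char)('a' + j) to the right of position i in S2, that is, the smallest p > i with S2[p] = 'a' + j, or length(S2) + 1 if there is no such position.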
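A minimal sketch of this preprocessing, filling the table from right to left. It assumes the input is a 0-indexed std::string of lowercase letters while keeping the 1-indexed positions used in the text; the name build_next and the extra sentinel row at index length(S2) + 1 are choices made here for illustration.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// nxt[i][c] is the smallest 1-indexed position p > i whose character in s2
// equals 'a' + c, or m + 1 if there is none (m = length of s2).
// build_next is an illustrative name; input must consist of a-z only.
std::vector<std::array<int, 26>> build_next(const std::string& s2) {
    const int m = static_cast<int>(s2.size());
    std::vector<std::array<int, 26>> nxt(m + 2);
    nxt[m].fill(m + 1);      // nothing lies to the right of position m
    nxt[m + 1].fill(m + 1);  // sentinel row so lookups from "no position" stay valid
    for (int i = m - 1; i >= 0; --i) {
        assert(s2[i] >= 'a' && s2[i] <= 'z');
        nxt[i] = nxt[i + 1];          // inherit everything further to the right
        nxt[i][s2[i] - 'a'] = i + 1;  // 1-indexed position i + 1 holds s2[i]
    }
    return nxt;
}
```

Each of the m + 1 rows copies k = 26 entries, which gives the O(km) bound.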

Let dp[i][j] be the leftmost position in S2 at which a common subsequence of length j, taken from the first i characters of S1, can end. If no such subsequence exists, dp[i][j] = length(S2) + 1.

dp[i][0] = 0 for every i >= 0, including i = 0.

If the character S1[i] occurs in S2 to the right of position dp[i-1][j-1], that is, if Next[dp[i-1][j-1]][S1[i] - 'a'] is not length(S2) + 1, then dp[i][j] = min{dp[i-1][j], Next[dp[i-1][j-1]][S1[i] - 'a']}.

Otherwise dp[i][j] = dp[i-1][j].

For every dp[i][j] that is not length(S2) + 1, record j; the largest such j is the length of the longest common subsequence.
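Since row i of the table only reads row i-1, one possible variation, which the original article does not use, keeps a single row and updates j from high to low. The sketch below assumes lowercase input; the name lcs_small_alphabet is chosen for illustration. Passing the shorter string as s1 matters only for the running time, not for correctness.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <vector>

// LCS length in O(n^2 + k*m) time for lowercase strings, k = 26,
// where n = |s1| and m = |s2|.
// lcs_small_alphabet is an illustrative name, not from the article.
int lcs_small_alphabet(const std::string& s1, const std::string& s2) {
    const int n = static_cast<int>(s1.size());
    const int m = static_cast<int>(s2.size());

    // nxt[i][c]: first 1-indexed position > i in s2 holding 'a' + c, else m + 1.
    std::vector<std::array<int, 26>> nxt(m + 2);
    nxt[m].fill(m + 1);
    nxt[m + 1].fill(m + 1);
    for (int i = m - 1; i >= 0; --i) {
        assert(s2[i] >= 'a' && s2[i] <= 'z');
        nxt[i] = nxt[i + 1];
        nxt[i][s2[i] - 'a'] = i + 1;
    }

    // dp[j]: leftmost end position in s2 of a common subsequence of length j
    // using the prefix of s1 processed so far; m + 1 means "does not exist".
    std::vector<int> dp(n + 1, m + 1);
    dp[0] = 0;
    for (int i = 0; i < n; ++i) {
        assert(s1[i] >= 'a' && s1[i] <= 'z');
        const int c = s1[i] - 'a';
        // Descending j keeps dp[j - 1] at its value from the previous row.
        for (int j = i + 1; j >= 1; --j)
            dp[j] = std::min(dp[j], nxt[dp[j - 1]][c]);
    }

    int ans = 0;
    for (int j = 1; j <= n; ++j)
        if (dp[j] != m + 1) ans = j;  // keep the largest reachable length
    return ans;
}
```

The min covers both cases of the recurrence, because a failed lookup returns m + 1 and therefore never lowers dp[j].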

The preprocessing takes O(km) time, and the DP has O(n^2) states with O(1) work per transition, so the total complexity is O(n^2 + km).

Implementation:

#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;

const int maxn = 1005;     // maximum length of the shorter string s1
const int maxm = 1000005;  // maximum length of the longer string s2

char s1[maxn], s2[maxm];
int dp[maxn][maxn];  // dp[i][j]: leftmost end position in s2, or l2+1 if none
int nxt[maxm][26];   // nxt[i][c]: first position > i in s2 holding 'a'+c, or l2+1

int main() {
    scanf("%s%s", s1 + 1, s2 + 1);  // both strings are stored 1-indexed
    int l1 = strlen(s1 + 1);
    int l2 = strlen(s2 + 1);

    // Nothing lies to the right of positions l2 and l2+1.
    for (int j = 0; j < 26; j++) nxt[l2][j] = nxt[l2 + 1][j] = l2 + 1;
    // Fill the table from right to left.
    for (int i = l2 - 1; i >= 0; i--)
        for (int j = 0; j < 26; j++)
            nxt[i][j] = (s2[i + 1] == 'a' + j) ? i + 1 : nxt[i + 1][j];

    for (int i = 0; i <= l1; i++) {
        dp[i][0] = 0;  // the empty subsequence ends at position 0, also for i = 0
        for (int j = 1; j <= l1; j++) dp[i][j] = l2 + 1;
    }

    int ans = 0;
    for (int i = 1; i <= l1; i++) {
        for (int j = 1; j <= i; j++) {
            int p = nxt[dp[i - 1][j - 1]][s1[i] - 'a'];  // l2+1 if there is no match
            dp[i][j] = min(dp[i - 1][j], p);
            if (dp[i][j] != l2 + 1) ans = max(ans, j);
        }
    }
    printf("%d\n", ans);
    return 0;
}
