Suffix array: the substring with the most consecutive repetitions

Source: Internet
Author: User

http://blog.csdn.net/ysu108/article/details/7795479

 

The problem: find the substring that repeats the most times consecutively in a string. For example, in the string "abababc" the substring "ab" appears three times in a row, which is the maximum. This is different from the longest repeated substring problem: for the same string, the longest repeated substring is "abab". The two problems have similar solutions, and both use the suffix array data structure. To find the substring with the most consecutive repetitions, first generate the suffix array (here, simply the suffixes in order of their starting position). For the string above, the suffixes are:
abababc
bababc
ababc
babc
abc
bc
c
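As a small illustration of the list above, the suffixes can be enumerated with a few lines of C++. This is only a sketch: the helper name list_suffixes is an assumption made for this example, and it copies each suffix into a std::string, whereas an implementation would normally keep only pointers or indices into the original string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original article).
// Returns every suffix of `s`, ordered by starting position (not sorted).
// Entry i is the suffix that starts at index i.
std::vector<std::string> list_suffixes(const std::string& s) {
    std::vector<std::string> out;
    out.reserve(s.size());
    for (std::size_t i = 0; i < s.size(); ++i) {
        out.push_back(s.substr(i));  // copy of s[i..end)
    }
    return out;
}
```

Calling list_suffixes("abababc") gives the seven strings listed above, in the same order.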
We can see that the first and third suffixes both start with "ab", and so does the fifth. In general, if a string s first appears at the start of suffix i and is repeated immediately afterwards, its next occurrence must be at the start of suffix i + len(s). With this rule the answer can be found by a simple scan: for each candidate length and each starting position, count how many times the block of that length repeats back to back. The following code does this:

#include <cstdio>
#include <cstdlib>
#include <cstring>

int con_sub(const char *str, char **ret);

int main() {
    char str[] = "abcabcabcabcabcabbbb";
    char *ret = NULL;
    int time = con_sub(str, &ret);
    printf("%s occurs %d times\n", ret, time);
    delete[] ret;  // con_sub allocates the result with new[]
    return 0;
}

int con_sub(const char *str, char **ret) {
    int len = (int)strlen(str);
    // A non-empty string trivially contains its first character once.
    int max_time = (len > 0) ? 1 : 0;  // the maximum number of consecutive occurrences
    int ret_len = (len > 0) ? 1 : 0;   // the length of the repeated string
    const char *addr = str;            // the starting address of the repeated string

    // generate the suffix array: a[i] points to the suffix starting at index i
    const char **a = (const char **)malloc(sizeof(char *) * (len > 0 ? len : 1));
    for (int i = 0; i < len; i++)
        a[i] = &str[i];

    // the length i of the repeated string ranges from 1 to len / 2
    for (int i = 1; i <= len / 2; i++) {
        // if a string of length i repeats, suffixes j and j + i share a prefix of length i;
        // j is the starting point and advances by 1 so that every offset is tried
        for (int j = 0; j + i <= len - 1; j++) {
            int k = j;
            int temp_time = 1;
            while (k + i <= len - 1 && strncmp(a[k], a[k + i], i) == 0) {
                temp_time++;
                k += i;
            }
            if (temp_time > max_time) {
                max_time = temp_time;
                ret_len = i;
                addr = a[j];  // start of the first repetition
            }
        }
    }

    *ret = new char[ret_len + 1];
    strncpy(*ret, addr, ret_len);
    (*ret)[ret_len] = '\0';  // strncpy does not terminate the copy here
    free(a);
    return max_time;
}
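For comparison, the longest repeated substring problem mentioned at the start can be sketched with the same suffixes, this time sorted so that suffixes sharing a long prefix end up next to each other. This is a minimal sketch under stated assumptions, not the original author's code: the function name longest_repeated_substring is made up for this example, occurrences are allowed to overlap, and the suffixes are sorted with a plain comparison sort, which is fine for short strings but slow for long ones.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original article).
// Returns the longest substring that occurs at least twice in `s`;
// the two occurrences may overlap. Returns "" if nothing repeats.
std::string longest_repeated_substring(const std::string& s) {
    // Suffix start positions, sorted by the suffix each one denotes.
    std::vector<std::size_t> starts(s.size());
    for (std::size_t i = 0; i < s.size(); ++i) starts[i] = i;
    std::sort(starts.begin(), starts.end(),
              [&s](std::size_t a, std::size_t b) {
                  return s.compare(a, std::string::npos,
                                   s, b, std::string::npos) < 0;
              });

    // The longest common prefix of two neighbouring sorted suffixes
    // is a substring that occurs at least twice; keep the longest one.
    std::size_t best_len = 0, best_pos = 0;
    for (std::size_t r = 1; r < starts.size(); ++r) {
        std::size_t a = starts[r - 1], b = starts[r], len = 0;
        while (a + len < s.size() && b + len < s.size() &&
               s[a + len] == s[b + len]) {
            ++len;
        }
        if (len > best_len) {
            best_len = len;
            best_pos = a;
        }
    }
    return s.substr(best_pos, best_len);
}
```

For "abababc" this returns "abab", while the program above reports "ab" repeated three times for the same input, which shows how the two problems differ.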
