Notes for excellent courseware

Source: Internet
Author: User
Chapter 4 Strings
4.1 Definition of the string type
4.2 Representation and implementation of strings
4.2.1 Fixed-length sequential storage representation
4.2.2 Heap allocation storage representation
4.2.3 Block-chained storage representation of strings
4.3 String pattern matching algorithms
4.1 Definition of the string type
I. Strings and their basic concepts
String: a string is a finite sequence of zero or more characters, generally written as s = "$a_1 a_2 a_3 \cdots a_n$". Here s is the string name, and the character sequence enclosed in double quotation marks is the string value. Each $a_i$ ($1 \le i \le n$) can be a letter, a digit, or another character.
Length: the number of characters in a string is the length of the string.
Empty string: a string of length zero is called an empty string. It contains no characters.
Blank string: a string composed only of one or more spaces is called a blank string. Note the difference between an empty string and a blank string: for example, " " and "" denote a blank string of length 1 and an empty string of length 0, respectively.
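The difference between the empty string and a blank string can be checked in C. This is a minimal sketch; the helper names is_empty and is_blank are illustrative and do not come from the notes.

```c
#include <string.h>

/* Returns 1 if s has length 0 (the empty string), otherwise 0. */
int is_empty(const char *s)
{
    return strlen(s) == 0;
}

/* Returns 1 if s is a blank string: at least one character, all of them spaces. */
int is_blank(const char *s)
{
    if (*s == '\0')
        return 0;           /* the empty string is not a blank string */
    while (*s == ' ')
        ++s;
    return *s == '\0';      /* reached the end without meeting a non-space */
}
```

With these helpers, " " is blank but not empty, and "" is empty but not blank.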
Substring and main string: a subsequence composed of any consecutive characters of a string is called a substring of that string, and the string that contains the substring is called the main string. The position of a substring in the main string is defined as the index, in the main string, of the first character of the substring at its first occurrence.
For example, let a = "This is a string" and b = "is". Then b is a substring of a, and a is the main string. b appears twice in a. At its first occurrence the corresponding position in the main string is 3, so the position of b in a is 3.
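The position in the example can be reproduced with the standard library function strstr. This is a sketch only: the function name first_position is an assumption, and the result is converted to the 1-based position used in these notes.

```c
#include <string.h>

/* Returns the 1-based position of the first occurrence of sub in main_str,
   or 0 if sub does not occur. */
int first_position(const char *main_str, const char *sub)
{
    const char *hit = strstr(main_str, sub);    /* NULL when there is no match */
    return hit ? (int)(hit - main_str) + 1 : 0;
}
```

For a = "This is a string" and b = "is", first_position(a, b) returns 3: the first occurrence is the "is" inside "This".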
Equality: two strings are equal if and only if they have the same length and the characters at every corresponding position are equal.
For example, if a = "Beijing" and b = "Beijing", then string a is equal to string b.
String constants and string variables: like integer constants and real constants, a string constant can be referenced in a program but its value cannot be changed; it can only be read, not written. The value of a string variable can be changed in the program; it can be both read and written.
II. Abstract data type definition of strings
For the abstract data type definition of strings, see p. 71.
III. Basic operations on strings
Many high-level languages provide string operations or standard library functions for them. Only a few string operations commonly used in C are described here.
Define the following variables:
char s1[20] = "dirtreeformat", s2[20] = "file.txt";
char s3[30], *p;
int result;
1. String length (length)
size_t strlen(const char *s);  // returns the length of string s
Example: printf("%d", (int)strlen(s1));  // prints 13
2. String copy (copy)
char *strcpy(char *to, const char *from);
This function copies the string from into to and returns a pointer to the beginning of to.
Example: strcpy(s3, s1);  // s3 = "dirtreeformat"
3. String concatenation (concatenation)
char *strcat(char *to, const char *from);
This function appends a copy of the string from to the end of the string to and returns a pointer to the beginning of to.
Example: strcat(s3, "/");
strcat(s3, s2);  // s3 = "dirtreeformat/file.txt"
4. String comparison (compare)
int strcmp(const char *s1, const char *s2);
This function compares strings s1 and s2. The return value is less than 0, equal to 0, or greater than 0 when s1 < s2, s1 = s2, or s1 > s2, respectively.
Example: result = strcmp("baker", "Baker");  // result > 0
result = strcmp("12", "12");  // result = 0
result = strcmp("Jos", "Joseph");  // result < 0
5. Character location (index)
char *strchr(const char *s, int c);
This function finds the first occurrence of the character c in the string s and returns a pointer to that position. If c does not occur in s, NULL is returned.
Example: p = strchr(s2, '.');  // p points to the position just after "file"
if (p) strcpy(p, ".cpp");  // s2 = "file.cpp"
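The copy, concatenation, and character-location calls above can be combined into two small helpers. This is a sketch under stated assumptions: join_path and set_extension are hypothetical names, and both rely on the caller to provide a large enough buffer, as the examples in the notes do with s3[30] and s2[20].

```c
#include <string.h>

/* Builds "dir/file" in dst. The caller must make dst large enough for
   strlen(dir) + 1 + strlen(file) + 1 bytes. */
char *join_path(char *dst, const char *dir, const char *file)
{
    strcpy(dst, dir);       /* dst = dir */
    strcat(dst, "/");       /* dst = dir + "/" */
    strcat(dst, file);      /* dst = dir + "/" + file */
    return dst;
}

/* Replaces everything from the first '.' in name with ext (ext includes the
   dot). Returns 1 if a '.' was found, 0 otherwise. The caller must make sure
   name has room for the new extension. */
int set_extension(char *name, const char *ext)
{
    char *p = strchr(name, '.');    /* first '.', or NULL */
    if (p == NULL)
        return 0;
    strcpy(p, ext);
    return 1;
}
```

Called with the variables defined earlier, join_path(s3, s1, s2) leaves "dirtreeformat/file.txt" in s3, and set_extension(s2, ".cpp") turns s2 into "file.cpp".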
Example 1: extracting a substring
Extracting a substring is a process of copying a character sequence: the len characters of string s that start at index pos (counted from 0) are copied into the string sub.
void substr(char *sub, char *s, int pos, int len)
{
    if (pos < 0 || pos > (int)strlen(s) - 1 || len < 0)
        error("parameter error");
    strncpy(sub, s + pos, len);
    sub[len] = '\0';  // strncpy does not add the terminator when it copies len characters
}
Example 2: string location, index(char *s, char *t, int pos)
Find the position of string t in the main string s at or after the pos-th character. Method: starting at the pos-th character of s, take the substring whose length equals the length of t and compare it with t. If they are equal, the function value is that position. Otherwise the start position of the substring increases by 1 and the comparison with t is repeated, and so on, until a substring equal to t is found or no substring of that length remains.
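int index(char *s, char *t, int pos)
{
    char sub[256];  // current candidate substring; assumes strlen(t) <= 255
    int n, m, i;
    if (pos > 0)
    {
        n = strlen(s);
        m = strlen(t);
        i = pos;  // positions are counted from 1
        while (i <= n - m + 1)
        {
            substr(sub, s, i - 1, m);  // substr counts from 0
            if (strcmp(sub, t) != 0)
                ++i;
            else
                return i;
        }
    }
    return 0;  // t does not occur at or after position pos
}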
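The same search can be written without the temporary substring by comparing in place with strncmp. This is a minimal sketch, not the version from the notes: the name find_from is hypothetical (it also avoids the POSIX function index declared by some C libraries), and positions are 1-based with 0 meaning "not found".

```c
#include <string.h>

/* Returns the 1-based position of the first occurrence of t in s at or after
   position pos, or 0 if there is none or pos is less than 1. */
int find_from(const char *s, const char *t, int pos)
{
    int n = (int)strlen(s);
    int m = (int)strlen(t);
    int i;

    if (pos < 1)
        return 0;
    for (i = pos; i <= n - m + 1; ++i) {
        /* Compare the m characters of s that start at position i with t. */
        if (strncmp(s + i - 1, t, (size_t)m) == 0)
            return i;
    }
    return 0;
}
```

For s = "This is a string" and t = "is", a search from position 1 returns 3 and a search from position 4 returns 6, the second occurrence.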
4.2 Representation and implementation of strings
4.2.1 Fixed-length sequential storage representation
The fixed-length sequential storage representation, also called the sequential list with static storage allocation, stores the character sequence of a string in a group of consecutive storage units. The fixed-length sequential storage structure is defined directly with a fixed-length character array whose upper bound is given in advance:
#define MAXSTRLEN 255
typedef unsigned char SString[MAXSTRLEN + 1];  // element 0 stores the string length
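Status SubString(SString Sub, SString S, int pos, int len)
{
    // Sub receives the len characters of S that start at position pos (counted from 1)
    int k;
    if (pos < 1 || pos > S[0] || len < 0 || len > S[0] - pos + 1)
        return ERROR;
    for (k = 1; k <= len; ++k)
        Sub[k] = S[pos + k - 1];
    Sub[0] = len;
    return OK;
}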
Status Concat(SString T, SString S1, SString S2)
{
    // T receives S1 followed by S2; returns FALSE if the result had to be truncated
    int k;
    if (S1[0] + S2[0] <= MAXSTRLEN)
    {   // nothing is truncated
        for (k = 1; k <= S1[0]; ++k) T[k] = S1[k];
        for (k = 1; k <= S2[0]; ++k) T[S1[0] + k] = S2[k];
        T[0] = S1[0] + S2[0];
        return TRUE;
    }
    else if (S1[0] < MAXSTRLEN)
    {   // S2 is truncated
        for (k = 1; k <= S1[0]; ++k) T[k] = S1[k];
        for (k = 1; k <= MAXSTRLEN - S1[0]; ++k) T[S1[0] + k] = S2[k];
        T[0] = MAXSTRLEN;
        return FALSE;
    }
    else
    {   // S1 alone fills T
        for (k = 0; k <= MAXSTRLEN; ++k) T[k] = S1[k];
        return FALSE;
    }
}
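The three cases of the concatenation collapse into one if the number of characters of S2 that still fit is computed first. The sketch below is one possible self-contained C version, not the routine from the notes: sstring_from_cstr and sstring_concat are hypothetical names, and an int result (1 for complete, 0 for truncated) stands in for the Status type.

```c
#include <string.h>

#define MAXSTRLEN 255
typedef unsigned char SString[MAXSTRLEN + 1];   /* element 0 stores the length */

/* Hypothetical helper: loads a C string into T, keeping at most MAXSTRLEN
   characters. */
void sstring_from_cstr(SString T, const char *chars)
{
    size_t n = strlen(chars);
    if (n > MAXSTRLEN)
        n = MAXSTRLEN;
    memcpy(T + 1, chars, n);
    T[0] = (unsigned char)n;
}

/* Concatenates S1 and S2 into T. Returns 1 if nothing was cut off, 0 if the
   result was truncated to MAXSTRLEN characters. T must not overlap S1 or S2. */
int sstring_concat(SString T, const SString S1, const SString S2)
{
    int len1 = S1[0];
    int len2 = S2[0];
    int room = MAXSTRLEN - len1;                /* space left after S1 */
    int copy2 = len2 <= room ? len2 : room;     /* characters of S2 that fit */

    memcpy(T + 1, S1 + 1, (size_t)len1);
    memcpy(T + 1 + len1, S2 + 1, (size_t)copy2);
    T[0] = (unsigned char)(len1 + copy2);
    return copy2 == len2;
}
```

Concatenating a 254-character string with "de" keeps only the 'd' and reports truncation, which is the main drawback of the fixed-length representation.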
4.2.2 Heap allocation storage representation
This representation still stores the character sequence of the string value in a group of storage units with consecutive addresses, but the storage space is allocated dynamically during program execution, so it is also called the sequential list with dynamic storage allocation. In C, the dynamic storage management functions malloc() and free() are used to allocate and release the character array space.
typedef struct {
    char *ch;    // base address of the character array; NULL for an empty string
    int length;  // string length
} HString;
Status StrAssign(HString *T, char *chars)
{
    // Generate a string T whose value equals the string constant chars.
    // T must already be initialized: T->ch is NULL or points to allocated memory.
    int i, k;
    char *c;
    if (T->ch) free(T->ch);
    for (i = 0, c = chars; *c; ++i, ++c);  // compute the length of chars
    if (!i) { T->ch = NULL; T->length = 0; }
    else
    {
        if (!(T->ch = (char *)malloc(i * sizeof(char))))
            exit(OVERFLOW);
        for (k = 0; k < i; ++k) T->ch[k] = chars[k];
        T->length = i;
    }
    return OK;
}
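A self-contained C sketch of assignment for the heap representation follows. It makes two assumptions that differ from the notes: the string is passed by pointer so that the caller sees the change, and an allocation failure is reported through the return value instead of ending the program. The names hstr_init, hstr_assign, and hstr_clear are illustrative.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *ch;       /* heap block with the characters (no '\0'), or NULL */
    int length;     /* number of characters */
} HString;

/* Puts T into a valid empty state; call once before any other operation. */
void hstr_init(HString *T)
{
    T->ch = NULL;
    T->length = 0;
}

/* Makes T a copy of the C string chars. Returns 1 on success, 0 if malloc
   fails (T is then left empty). */
int hstr_assign(HString *T, const char *chars)
{
    size_t n = strlen(chars);

    free(T->ch);                /* free(NULL) is a no-op */
    T->ch = NULL;
    T->length = 0;
    if (n == 0)
        return 1;
    T->ch = malloc(n);
    if (T->ch == NULL)
        return 0;
    memcpy(T->ch, chars, n);
    T->length = (int)n;
    return 1;
}

/* Releases the storage of T and makes it the empty string. */
void hstr_clear(HString *T)
{
    free(T->ch);
    hstr_init(T);
}
```

A typical sequence is hstr_init(&t), then any number of hstr_assign(&t, ...) calls, then hstr_clear(&t) when the string is no longer needed.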
int StrLength(HString S) { return S.length; }
Status ClearString(HString *S)
{
    if (S->ch) { free(S->ch); S->ch = NULL; }
    S->length = 0;
    return OK;
}
int StrCompare(HString S, HString T)
{
    // Returns a value < 0, = 0, or > 0 when S < T, S = T, or S > T
    int i;
    for (i = 0; i < S.length && i < T.length; ++i)
        if (S.ch[i] != T.ch[i])
            return S.ch[i] - T.ch[i];
    return S.length - T.length;
}
Status Concat(HString *T, HString S1, HString S2)
{
    // T receives S1 followed by S2; T must not share storage with S1 or S2
    int k;
    if (T->ch) free(T->ch);
    if (S1.length + S2.length == 0) { T->ch = NULL; T->length = 0; return OK; }
    if (!(T->ch = (char *)malloc((S1.length + S2.length) * sizeof(char))))
        exit(OVERFLOW);
    for (k = 0; k < S1.length; ++k) T->ch[k] = S1.ch[k];
    T->length = S1.length + S2.length;
    for (k = 0; k < S2.length; ++k) T->ch[S1.length + k] = S2.ch[k];
    return OK;
}
Status SubString(HString *Sub, HString S, int pos, int len)
{
    // Sub receives the len characters of S that start at position pos (counted from 1)
    int k;
    if (pos < 1 || pos > S.length || len < 0 || len > S.length - pos + 1)
        return ERROR;
    if (Sub->ch) free(Sub->ch);
    if (!len) { Sub->ch = NULL; Sub->length = 0; }
    else
    {
        if (!(Sub->ch = (char *)malloc(len * sizeof(char))))
            exit(OVERFLOW);
        for (k = 0; k < len; ++k) Sub->ch[k] = S.ch[pos - 1 + k];
        Sub->length = len;
    }
    return OK;
}
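Concatenation and substring extraction on the heap representation can be sketched in plain C as follows. The names hstr_concat and hstr_substring are hypothetical, the result is passed by pointer, and failures are reported through the return value; these are assumptions of the sketch, not the conventions of the notes. The new block is filled before the old one is released, so the destination may be the same string as one of the sources.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *ch;       /* heap block with the characters, or NULL when empty */
    int length;     /* number of characters */
} HString;

/* Stores S1 followed by S2 in *T. Returns 1 on success, 0 if malloc fails
   (*T is then unchanged). */
int hstr_concat(HString *T, HString S1, HString S2)
{
    int total = S1.length + S2.length;
    char *buf = NULL;

    if (total > 0) {
        buf = malloc((size_t)total);
        if (buf == NULL)
            return 0;
        if (S1.length > 0) memcpy(buf, S1.ch, (size_t)S1.length);
        if (S2.length > 0) memcpy(buf + S1.length, S2.ch, (size_t)S2.length);
    }
    free(T->ch);    /* released only after copying, so T may alias S1 or S2 */
    T->ch = buf;
    T->length = total;
    return 1;
}

/* Stores in *Sub the len characters of S that start at 1-based position pos.
   Returns 1 on success, 0 on bad arguments or allocation failure. */
int hstr_substring(HString *Sub, HString S, int pos, int len)
{
    char *buf = NULL;

    if (pos < 1 || pos > S.length || len < 0 || len > S.length - pos + 1)
        return 0;
    if (len > 0) {
        buf = malloc((size_t)len);
        if (buf == NULL)
            return 0;
        memcpy(buf, S.ch + pos - 1, (size_t)len);
    }
    free(Sub->ch);
    Sub->ch = buf;
    Sub->length = len;
    return 1;
}
```

Because the length is stored separately, the character block carries no terminating '\0' and cannot be passed directly to functions such as printf("%s").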
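4.2.3 Block-chained storage representation of strings
Insertion and deletion are inconvenient for sequentially stored strings because a large number of characters must be moved. A string value can therefore be stored in a singly linked list. This structure is called a chained string for short.
A chained string is uniquely identified by its head pointer. This structure makes insertion and deletion convenient, but its storage space utilization is low.
[Figure: two chained strings, each reached through a head pointer. In the first, every node holds four characters: A B C D, E F G H, I # # #, followed by ^. In the second, every node holds one character: A, B, C, ..., I, followed by ^. The symbol ^ marks the null pointer and # fills the unused positions of the last node.]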
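A block-chained string with several characters per node can be sketched in C as follows. All names here (CHUNKSIZE, Chunk, LString, lstr_assign, lstr_clear) are assumptions of this sketch, and the node size of 4 is chosen only to reproduce the first chain in the figure.

```c
#include <stdlib.h>
#include <string.h>

#define CHUNKSIZE 4             /* characters per node; 4 matches the figure */

typedef struct Chunk {
    char ch[CHUNKSIZE];         /* unused slots in the last node hold '#' */
    struct Chunk *next;
} Chunk;

typedef struct {
    Chunk *head, *tail;         /* first and last node, NULL when empty */
    int curlen;                 /* number of characters stored */
} LString;

/* Releases every node of s and resets it to the empty string. */
void lstr_clear(LString *s)
{
    Chunk *p = s->head;
    while (p != NULL) {
        Chunk *next = p->next;
        free(p);
        p = next;
    }
    s->head = s->tail = NULL;
    s->curlen = 0;
}

/* Builds a chained string from the C string chars. s must be zero-initialized
   or previously built by this function. Returns 1 on success, 0 if malloc
   fails (s is then empty). */
int lstr_assign(LString *s, const char *chars)
{
    size_t n = strlen(chars), i;

    lstr_clear(s);
    for (i = 0; i < n; i += CHUNKSIZE) {
        size_t k = n - i;                       /* characters still to store */
        Chunk *c = malloc(sizeof *c);
        if (c == NULL) {
            lstr_clear(s);
            return 0;
        }
        if (k > CHUNKSIZE)
            k = CHUNKSIZE;
        memset(c->ch, '#', CHUNKSIZE);          /* pad first, then overwrite */
        memcpy(c->ch, chars + i, k);
        c->next = NULL;
        if (s->tail != NULL)
            s->tail->next = c;
        else
            s->head = c;
        s->tail = c;
    }
    s->curlen = (int)n;
    return 1;
}
```

A larger CHUNKSIZE improves storage utilization because fewer pointers are stored per character, while a CHUNKSIZE of 1 makes insertion and deletion simplest.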
4.3 String pattern matching algorithms
4.3.1 Index(), the substring location function
Pattern matching: locating a substring (the pattern) in the main string.
Basic method: starting from the specified position, compare the characters of the main string and the pattern one by one. If a pair of characters does not match, shift the whole pattern one position to the right relative to its previous position and compare again from the first character of the pattern.
[Figure: the pattern $P_1 P_2 \cdots P_{j-1} P_j \cdots$ is aligned under $S_{i-j+1} S_{i-j+2} \cdots S_{i-1} S_i S_{i+1} \cdots$ of the main string, and the mismatch location is $S_i$ against $P_j$. At the next match location the pattern has moved one position to the right, so that $P_1$ lies under $S_{i-j+2}$.]
E.g.: S = a b c a b c d
      P = a b c a b c d
            a b c a b c d    (the pattern shifts one position to the right)
The most basic pattern matching program:
int Index(SString S, SString T, int pos)
// Returns the position of the first match of pattern T in main string S at or after the pos-th character, or 0 if there is none
{
    int i = pos, j = 1;
    while (i <= S[0] && j <= T[0])  // element 0 stores the string length
    {
        if (S[i] == T[j]) { ++i; ++j; }   // go on to the next pair of characters
        else { i = i - j + 2; j = 1; }    // shift the pattern right by one and restart
    }
    if (j > T[0]) return i - T[0];  // start position of the match of T in S
    else return 0;
}
Demo: S = "aaaaaaaaab", P = "aab"
At each of the first seven alignments the first two characters match and the third comparison (a against b) fails, so the pattern shifts one position to the right. At the eighth alignment all three characters match. The matching start position is n-m+1 = 8.
4.3.2 The Knuth-Morris-Pratt pattern matching algorithm (KMP algorithm)
The demonstration shows that the worst-case time complexity of the basic pattern matching algorithm is O(n*m), where n is the length of the main string and m is the length of the pattern. In that example each alignment takes m comparisons before the pattern moves once, and the match is found only at position n-m+1 of the main string, so the total is (n-m+1)*m comparisons.
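The count can be checked by instrumenting the basic algorithm. The sketch below works on ordinary C strings with 1-based positions; the function name count_comparisons and its output parameter are assumptions made for this illustration.

```c
#include <string.h>

/* Runs the basic matching algorithm on main string s and pattern t, starting
   at the first character. Stores the 1-based match position (0 if none) in
   *position and returns the number of character comparisons performed. */
long count_comparisons(const char *s, const char *t, int *position)
{
    int n = (int)strlen(s);
    int m = (int)strlen(t);
    int i = 1, j = 1;           /* 1-based positions, as in the notes */
    long comparisons = 0;

    while (i <= n && j <= m) {
        ++comparisons;
        if (s[i - 1] == t[j - 1]) {
            ++i;
            ++j;
        } else {
            i = i - j + 2;      /* shift the pattern right by one */
            j = 1;
        }
    }
    *position = (j > m) ? i - m : 0;
    return comparisons;
}
```

For S = "aaaaaaaaab" (n = 10) and P = "aab" (m = 3) the function reports position 8 after 24 comparisons, which equals (n-m+1)*m = 8*3.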
Example: S = "abcabcabcd", P = "abcabcd".
1. With P aligned at S1, the first six characters match, and S7 against P7 is the mismatch point.
2. The pattern shifts one position to the right: S2 against P1 does not match.
3. The pattern shifts one more position to the right: S3 against P1 does not match.
4. The pattern shifts one more position to the right: after three comparisons (S4 S5 S6 against P1 P2 P3) the comparison is back at the mismatch point, continues from S7 against P4, and the match succeeds.
Question: can the five comparisons in steps 2 to 4 (one, one, and three) be skipped, so that S7 is compared with P4 directly?
Description of the KMP algorithm with an example:
For S = "abcabcabcd" and P = "abcabcd", the mismatch point is i = 7, j = 7. The new matching location is found directly: i stays at 7 and j becomes 4.
Analysis: when $S_i$ and $P_j$ mismatch, the preceding $j-1$ characters have already matched:
$S_{i-j+1} S_{i-j+2} \cdots S_{i-1} = P_1 P_2 \cdots P_{j-1}$
If, for some $k < j$,
$P_1 P_2 \cdots P_{k-1} = P_{j-k+1} P_{j-k+2} \cdots P_{j-1}$,
then $S_{i-k+1} \cdots S_{i-1} = P_1 \cdots P_{k-1}$ as well, so $S_i$ can be compared directly with $P_k$.
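The value k for each pattern position j is what the next function of the KMP algorithm records. The notes only state that its values are known; the definition below is the usual one and is given here as an assumption, written for 1-based pattern positions.

```latex
\[
next[j] =
\begin{cases}
0, & j = 1 \\
\max\{\, k \mid 1 < k < j,\ P_1 P_2 \cdots P_{k-1} = P_{j-k+1} P_{j-k+2} \cdots P_{j-1} \,\}, & \text{if this set is not empty} \\
1, & \text{otherwise}
\end{cases}
\]
```

For P = "abcabcd" this gives next[1..7] = 0, 1, 1, 1, 2, 3, 4. In particular next[7] = 4, which agrees with comparing S7 directly with P4 in the example.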
int Index_KMP(SString S, SString T, int pos)
// Returns the position of the first match of pattern T in main string S at or after the pos-th character, or 0 if there is none.
// The values of the next function are known (array next[]), T is not empty, and 1 <= pos <= StrLength(S).
{
    int i = pos, j = 1;
    while (i <= S[0] && j <= T[0])
    {
        if (j == 0 || S[i] == T[j]) { ++i; ++j; }  // go on to the next pair of characters
        else j = next[j];                           // slide the pattern right; i does not move back
    }
    if (j > T[0]) return i - T[0];  // start position of the match of T in S
    else return 0;
}  // Index_KMP
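Index_KMP relies on a next array that the notes do not construct. The sketch below adds one common way of filling it and wraps both steps for ordinary C strings. The names get_next and kmp_index, the 1-based indexing over 0-based C strings, and the -1 result for allocation failure are assumptions of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Fills next[1..m] for the pattern t of length m (m >= 1); next[0] is unused.
   next[j] is the pattern position to compare when position j mismatches. */
void get_next(const char *t, int m, int *next)
{
    int i = 1, j = 0;

    next[1] = 0;
    while (i < m) {
        if (j == 0 || t[i - 1] == t[j - 1]) {
            ++i;
            ++j;
            next[i] = j;
        } else {
            j = next[j];
        }
    }
}

/* Returns the 1-based position of the first occurrence of t in s at or after
   position pos, 0 if there is none, or -1 if memory cannot be allocated. */
int kmp_index(const char *s, const char *t, int pos)
{
    int n = (int)strlen(s);
    int m = (int)strlen(t);
    int i = pos, j = 1, result;
    int *next;

    if (m == 0 || pos < 1 || pos > n)
        return 0;
    next = malloc((size_t)(m + 1) * sizeof *next);
    if (next == NULL)
        return -1;
    get_next(t, m, next);

    while (i <= n && j <= m) {
        if (j == 0 || s[i - 1] == t[j - 1]) {
            ++i;
            ++j;
        } else {
            j = next[j];        /* i stays where it is */
        }
    }
    result = (j > m) ? i - m : 0;
    free(next);
    return result;
}
```

For S = "abcabcabcd" and P = "abcabcd", kmp_index(S, P, 1) returns 4. After the mismatch at i = 7, j = 7 the search continues with j = next[7] = 4 and i never moves backward.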
