Introduction to PHP from Scratch (5): String Operations and POSIX Regular Expressions
I. String operations
1. String formatting
1.1 Removing whitespace
The trim() function removes whitespace from the start and end of a string and returns the resulting string.
The ltrim() function removes whitespace from the start of the string only.
The rtrim() function removes whitespace from the end of the string only.
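As a minimal sketch of the three functions side by side (the input string is a made-up example):

```php
<?php
// Hypothetical input padded with spaces, a tab, and a newline.
$raw = "  \tJane Doe \n";

$both  = trim($raw);   // whitespace removed from both ends
$left  = ltrim($raw);  // whitespace removed from the start only
$right = rtrim($raw);  // whitespace removed from the end only

var_dump($both, $left, $right);
```

By default all three strip spaces, tabs, newlines, carriage returns, vertical tabs, and NUL bytes, not only the space character.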
1.2 Formatting strings for display
The nl2br() function takes a string as input and inserts an HTML <br /> tag before each line break in it.
The printf() function outputs a formatted string to the browser.
The sprintf() function returns a formatted string instead of printing it.
In the conversion specifications of printf(), you can use argument numbering. This means the arguments do not have to appear in the same order as the conversion specifications that use them.
printf("Total amount of order is %2\$.2f (with shipping %1\$.2f)", $total_shipping, $total);
The strtoupper() function converts a string to uppercase.
The strtolower() function converts a string to lowercase.
The ucfirst() function converts the first character to uppercase (if it is a letter).
The ucwords() function converts the first letter of each word in a string to uppercase.
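The display functions above can be sketched together as follows. The feedback text and the amounts are invented for illustration:

```php
<?php
// Hypothetical customer feedback containing a line break.
$feedback = "great service\nwill order again";

$html  = nl2br($feedback);            // adds <br /> before the newline
$price = sprintf("%.2f", 12.5);       // two decimal places
// Numbered arguments: %2$ refers to the second argument, %1$ to the first.
$line  = sprintf('Total: %2$.2f (shipping: %1$.2f)', 5.0, 45.0);

$upper = strtoupper("feedback");
$first = ucfirst($feedback);
$words = ucwords("great service");

echo $html, "\n", $price, "\n", $line, "\n", $upper, "\n", $words, "\n";
```

The format string is single-quoted here so that the $ in %2$ does not need a backslash.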
1.3 Formatting strings for storage
Some characters cause problems when data is inserted into a database, because the database interprets them as control characters. The problematic characters are quotation marks (single and double), the backslash, and the NULL character. To escape them, add a backslash in front of each one. PHP provides two functions for working with these escapes: addslashes() and stripslashes().
PHP can also be configured to add the backslashes automatically. This behavior is controlled by the magic_quotes_gpc configuration directive, which was on by default in older versions of PHP. It was deprecated in PHP 5.3.0 and removed in PHP 5.4.0.
If it is enabled, call stripslashes() to remove the backslashes before displaying the data. If it is disabled, use addslashes() to escape the data before storing it, so that every quotation mark gets a backslash in front of it.
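A minimal sketch of the round trip between the two functions, using an invented name that contains a single quotation mark:

```php
<?php
// Hypothetical user input containing a single quotation mark.
$name = "O'Reilly";

$escaped  = addslashes($name);       // a backslash is added before the quote
$restored = stripslashes($escaped);  // the backslash is removed again

echo $escaped, "\n", $restored, "\n";
```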
2. Joining and splitting strings with string functions
2.1 Splitting and joining
The explode() function splits a string into an array.
explode(separator, string, limit)
separator: the string to split on
string: the string to be split
limit: the maximum number of array elements to return (optional)
explode() is case-sensitive, and so is any comparison made on its output. For example, if you split an email address at "@" and compare the domain name against a lowercase value, a domain written in mixed case will not match. Converting the string to all uppercase or all lowercase first avoids this problem.
The implode() function joins array elements into a string. The join() function is an alias of it.
implode(separator, array)
separator: the content placed between the array elements
array: the array to be joined into a string
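A short sketch of splitting and rejoining, assuming an invented mixed-case email address as input:

```php
<?php
// Hypothetical address written in mixed case.
$email = "User@Example.COM";

// Lowercase first so that later comparisons on the domain are reliable.
$parts  = explode("@", strtolower($email));
$domain = $parts[1];

// With a limit of 2, the last element holds the rest of the string.
$labels = explode(".", "www.example.com", 2);

// implode() is the inverse operation.
$rejoined = implode("@", $parts);

print_r($parts);
print_r($labels);
echo $rejoined, "\n";
```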
2.2 Splitting a string piece by piece
The strtok() function splits a string into smaller strings (tokens), returning one token per call.
strtok(string, split)
string: the string to be split
split: one or more delimiter characters
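The calling pattern for strtok() is a little unusual: the first call takes the string and the delimiters, and each later call takes only the delimiters. A minimal sketch with an invented feedback string:

```php
<?php
// Hypothetical feedback text; space and comma both act as delimiters.
$feedback = "fast delivery, friendly staff";

$tokens = [];
$token  = strtok($feedback, " ,");   // first call: string plus delimiters
while ($token !== false) {
    $tokens[] = $token;
    $token = strtok(" ,");           // later calls: delimiters only
}

print_r($tokens);
```

The loop compares against false with !== so that a token such as "0" does not end it early.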
2.3 Extracting substrings
The substr() function returns part of a string.
substr(string, start, length)
string: Required. The string to extract a part from.
start: Required. The position at which to start.
Positive number: start at the specified position, counting from the start of the string
Negative number: start at the specified position, counting back from the end of the string
0: start at the first character of the string
length: Optional. The length of the string to return. By default, everything up to the end of the string is returned.
Positive number: return that many characters, counting from the start position
Negative number: stop that many characters before the end of the string
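The combinations of positive and negative start and length values can be sketched as follows, using an invented sentence:

```php
<?php
// Hypothetical sentence, 34 characters long.
$s = "Your customer service is excellent";

$fromFive = substr($s, 5);       // from position 5 to the end
$lastNine = substr($s, -9);      // the last 9 characters
$firstFour = substr($s, 0, 4);   // 4 characters from the start
$middle   = substr($s, 5, -13);  // from position 5, stopping 13 before the end

echo $fromFive, "\n", $lastNine, "\n", $firstFour, "\n", $middle, "\n";
```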
3. String comparison
The prototype of strcmp() is as follows:
int strcmp(string str1, string str2);
If the two strings are equal, the function returns 0. If str1 comes after str2 in dictionary order (is greater than str2), it returns a positive number. If str1 is less than str2, it returns a negative number. This function is case-sensitive.
strcasecmp() is the same as strcmp() except that it is case-insensitive.
strnatcmp() and its case-insensitive counterpart strnatcasecmp() compare strings in "natural order", the way a person would order them, so that "2" comes before "12".
The strlen() function returns the length of a string.
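A brief sketch of how the comparison functions differ. The file names are invented, and only the sign of each result is relied on, since the exact non-zero value varies between PHP versions:

```php
<?php
$equal   = strcmp("apple", "apple");      // 0
$case    = strcmp("Hello", "hello");      // negative: "H" sorts before "h"
$nocase  = strcasecmp("Hello", "hello");  // 0
$plain   = strcmp("12", "2");             // negative: "1" sorts before "2"
$natural = strnatcmp("12", "2");          // positive: 12 is greater than 2
$length  = strlen("hello");               // 5

// Hypothetical file names sorted in natural order.
$files = ["img12.png", "img2.png", "img1.png"];
usort($files, 'strnatcmp');

print_r($files);
```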
4. Matching and replacing substrings with string functions
4.1 Searching for a string within a string
The strstr() function searches a longer string (the haystack) for a matching string or character (the needle).
string strstr(string haystack, string needle);
If an exact match of the needle is found, the function returns the haystack from the needle onward. Otherwise it returns false. If the needle occurs more than once, the returned string starts at its first occurrence.
strstr() has two variants. stristr() is the same but not case-sensitive. strrchr() returns the haystack starting from the last occurrence of the needle.
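The three search functions can be compared in a short sketch. The strings are invented, and the third argument of strstr() (before_needle, available since PHP 5.3.0) is shown as an extra that the text above does not cover:

```php
<?php
$email = "user@example.com";   // hypothetical address

$fromAt   = strstr($email, "@");                      // from the needle onward
$beforeAt = strstr($email, "@", true);                // part before the needle
$noCase   = stristr("Customer Service", "service");   // case-insensitive search
$lastDot  = strrchr("www.example.com", ".");          // from the last "."
$missing  = strstr($email, "#");                      // needle not present

var_dump($fromAt, $beforeAt, $noCase, $lastDot, $missing);
```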
4.2 Locating a substring
The strpos() function returns the position of the needle within the haystack. strpos() runs faster than strstr().
int strpos(string haystack, string needle, int [offset]);
The optional offset parameter specifies the position in the haystack at which the search starts.
The strrpos() function is almost identical, but returns the position of the last occurrence of the needle in the haystack.
In either case, if the needle is not in the string, strpos() or strrpos() returns false. This can cause a problem, because in a weakly typed language such as PHP, false is equal to 0, which is also the position of the first character of the string.
You can avoid this problem by testing the return value with the === operator:
$result = strpos($test, "H");
if ($result === false) {
    echo "Not found";
} else {
    echo "Found at position " . $result;
}
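To make the offset parameter, strrpos(), and the 0-versus-false pitfall concrete, here is a small sketch on an invented string:

```php
<?php
$test = "Hello world";   // hypothetical input

$first   = strpos($test, "o");      // first "o"
$second  = strpos($test, "o", 5);   // search starts at position 5
$last    = strrpos($test, "o");     // last "o"
$atZero  = strpos($test, "H");      // found at position 0
$missing = strpos($test, "z");      // not found

// Position 0 is loosely equal to false, but not identical to it.
var_dump($atZero == false, $atZero === false);
```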
4.3 Replacing substrings
str_replace():
mixed str_replace(mixed needle, mixed new_needle, mixed haystack[, int &count]);
The first three parameters are, in order, the substring to search for, the new substring, and the source string. The optional fourth parameter, count, receives the number of replacements performed.
The substr_replace() function replaces the part of a string found at a given position.
string substr_replace(string string, string replacement, int start, int [length]);
This function replaces part of the string string with the string replacement. The start and length values select the part to replace, in much the same way as the start and length parameters of substr().
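A minimal sketch of both replacement functions, with invented input strings:

```php
<?php
// Hypothetical feedback text.
$text = "I hate waiting. I hate queues.";

// $count is filled in by str_replace() with the number of replacements.
$clean = str_replace("hate", "dislike", $text, $count);

// Replace the last four characters (no length given: replace to the end).
$masked = substr_replace("4111111111111111", "XXXX", -4);

// Replace 5 characters starting at position 6.
$swapped = substr_replace("Hello world", "PHP", 6, 5);

echo $clean, "\n", $count, "\n", $masked, "\n", $swapped, "\n";
```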
II. Regular expressions
5. Introduction to regular expressions
5.1 Basics
A regular expression is a way of describing a pattern in a piece of text.
With a regular expression, a pattern can match anywhere within a string. In addition to characters that match exactly, you can use special characters to give parts of the expression a special meaning.
Next, we introduce POSIX-style regular expressions.
5.2 Character sets and classes
Character sets can be used to match any character of a particular type. In effect, they are a kind of wildcard.
You can use the . character as a wildcard for any single character other than a newline (\n):
.at
To restrict it to a character between a and z:
[a-z]at
Anything enclosed in square brackets ([]) is a character class, that is, a set of characters to which the matched character must belong. Note that the expression in square brackets matches only a single character.
You can list a set:
[aeiou]
You can also describe a range:
[a-zA-Z]
You can also use a set to specify that the character must not belong to it:
[^a-z]
When a caret (^) is placed at the start inside the square brackets, it means "not".
Many predefined character classes can also be used in regular expressions.
[[:alnum:]] alphanumeric characters
[[:alpha:]] alphabetic characters
[[:lower:]] lowercase letters
[[:upper:]] uppercase letters
[[:digit:]] decimal digits
[[:xdigit:]] hexadecimal digits
[[:punct:]] punctuation
[[:blank:]] tabs and spaces
[[:space:]] whitespace characters
[[:cntrl:]] control characters
[[:print:]] all printable characters
[[:graph:]] all printable characters except the space
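To try out sets and classes on a current PHP installation, the sketch below uses preg_match() rather than the POSIX functions, on the assumption that ereg() is unavailable (it was removed in PHP 7). PCRE accepts the same bracket classes; the only visible difference is the / delimiters around each pattern. All test strings are invented.

```php
<?php
$wildcard  = preg_match('/^.at$/', 'cat');              // any character, then "at"
$lowerOnly = preg_match('/^[a-z]at$/', 'Cat');          // "C" is not in a-z
$negated   = preg_match('/^[^a-z]$/', 'Q');             // one character outside a-z
$alpha     = preg_match('/^[[:alpha:]]+$/', 'Hello');   // letters only
$alphaFail = preg_match('/^[[:alpha:]]+$/', 'Hello1');  // the digit breaks the match
$hex       = preg_match('/^[[:xdigit:]]+$/', 'ff09A');  // hexadecimal digits only
$space     = preg_match('/[[:space:]]/', "a\tb");       // a tab counts as whitespace

var_dump($wildcard, $lowerOnly, $negated, $alpha, $alphaFail, $hex, $space);
```

preg_match() returns 1 for a match and 0 for no match.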
5.3 Repetition
The * symbol means the pattern can be repeated zero or more times. The + symbol means the pattern can be repeated one or more times:
[[:alnum:]]+
This means "at least one alphanumeric character".
5.4 Subexpressions
Parentheses group part of an expression into a subexpression, so that you can express something like "at least one of these strings followed by exactly one of those":
(very )*large
This matches "large", "very large", "very very large", and so on.
5.5 Counted subexpressions
You can put a numeric expression in curly braces to specify how many times the content can be repeated. {3} means exactly 3 repetitions, {2,4} means 2 to 4 repetitions, and {2,} means at least 2 repetitions:
(very ){1,3}
This matches "very ", "very very ", and "very very very ".
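The repetition operators can be checked with a short sketch. It assumes preg_match() as a stand-in for ereg(), anchors each pattern with ^ and $ so the whole string must match, and keeps a space after "very" inside the group:

```php
<?php
$star  = '/^(very )*large$/';       // zero or more repetitions of "very "
$count = '/^(very ){1,3}large$/';   // between one and three repetitions

$zero      = preg_match($star, 'large');                       // zero repetitions allowed
$two       = preg_match($star, 'very very large');
$needsOne  = preg_match($count, 'large');                      // at least one required
$three     = preg_match($count, 'very very very large');
$tooMany   = preg_match($count, 'very very very very large');  // four is over the limit
$emptyPlus = preg_match('/^[[:alnum:]]+$/', '');               // + needs one character

var_dump($zero, $two, $needsOne, $three, $tooMany, $emptyPlus);
```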
5.6 Anchoring to the start or end of a string
A caret (^) at the start of a regular expression means that the pattern must appear at the start of the searched string. A dollar sign ($) at the end of a regular expression means that the pattern must appear at the end of the string.
The following pattern matches strings that consist of exactly one character from a to z:
^[a-z]$
5.7 Branching
You can use a vertical bar (|) in a regular expression to represent a choice:
com|edu|net
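Anchors and branches are often combined. The sketch below again assumes preg_match() in place of ereg(); the host names are invented, and the pattern that ties the branch to the end of the string is an illustration rather than something from the text above:

```php
<?php
$single   = preg_match('/^[a-z]$/', 'q');     // exactly one lowercase letter
$double   = preg_match('/^[a-z]$/', 'qq');    // two letters: no match

// Without anchors, a branch matches anywhere in the string.
$anywhere = preg_match('/com|edu|net/', 'community');

// Grouping the branch and anchoring it restricts it to the ending.
$tld      = '/\.(com|edu|net)$/';
$isEdu    = preg_match($tld, 'www.example.edu');
$isOrg    = preg_match($tld, 'www.example.org');

var_dump($single, $double, $anywhere, $isEdu, $isOrg);
```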
5.8 Matching literal special characters
To match one of the special characters mentioned above, such as the dot, a curly brace, or the dollar sign, put a backslash in front of it. To match a backslash itself, use two backslashes.
In PHP, regular expression patterns should be enclosed in single-quoted strings. Double-quoted strings add unnecessary complexity, because PHP interprets more backslash sequences and the $ sign inside them.
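A small sketch of escaping in practice, using single-quoted patterns. It assumes the PCRE functions (preg_match() and its helper preg_quote(), which has no POSIX counterpart in the text above); the price and path strings are invented:

```php
<?php
// Escaped: the dollar sign and the dot are matched literally.
$escaped   = preg_match('/\$5\.00/', 'Price: $5.00');

// Unescaped: $ means "end of string", so nothing can follow it.
$unescaped = preg_match('/$5.00/', 'Price: $5.00');

// Matching one literal backslash takes two in the regex, and each of
// those is written as \\ inside a single-quoted PHP string.
$backslash = preg_match('/\\\\/', 'C:\\temp');

// preg_quote() adds the backslashes for you.
$quoted = preg_quote('$5.00');

var_dump($escaped, $unescaped, $backslash, $quoted);
```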
5.9 Special characters
Special characters in POSIX regular expressions when used outside square brackets (these must be escaped to be matched literally outside square brackets):
\ escape character
^ match at the start of the string
$ match at the end of the string
. match any character except a newline (\n)
| start of an alternative branch
( start of a subpattern
) end of a subpattern
* repeat 0 or more times
+ repeat 1 or more times
{ start of a minimum/maximum quantifier
} end of a minimum/maximum quantifier
? mark a subpattern as optional
Special characters in POSIX regular expressions when used inside square brackets (these must be escaped to be matched literally inside square brackets):
\ escape character
^ NOT, only when used at the initial position
- used to specify a character range
5.10 Application in the smart form
The first use is to detect particular terms in customer feedback. With a regular expression, you can match several terms at once:
shop|customer service|retail
The second use is to validate the user's email address in the program:
^[a-zA-Z0-9_\-.]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+$
For the reason some characters are preceded by a backslash, see section 5.9.
6. Searching for substrings with regular expressions
Searching for substrings is the main use of regular expressions. In PHP, the two functions for matching POSIX regular expressions are ereg() and eregi().
int ereg(string pattern, string search, array [matches]);
This function searches the string search for matches of the regular expression in pattern. If matches are found for subexpressions of pattern, they are stored in the array matches, one subexpression per array element.
The eregi() function is the same as ereg() except that it is case-insensitive.
Validating an email address:
if (!eregi('^[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$', $email)) {
    echo "<p>Email format error.</p>";
    exit;
}
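On PHP 7 and later, where the ereg family no longer exists, the same check can be sketched with preg_match(). The helper name looks_like_email() is hypothetical, the pattern is the one from section 5.10 wrapped in / delimiters, and the second pattern is an invented illustration of how subexpression matches are returned:

```php
<?php
// Hypothetical helper wrapping the email pattern from section 5.10.
function looks_like_email(string $email): bool
{
    $pattern = '/^[a-zA-Z0-9_\-.]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+$/';
    return preg_match($pattern, $email) === 1;
}

$good = looks_like_email('username@example.com');
$bad  = looks_like_email('user@localhost');   // no dot after the @ part

// Subexpression matches land in $matches: element 0 is the whole
// match, elements 1 and 2 are the two parenthesized groups.
preg_match('/^([^@]+)@(.+)$/', 'username@example.com', $matches);

var_dump($good, $bad);
print_r($matches);
```

A pattern like this only checks the general shape of an address, not whether the address exists.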
7. Replacing substrings with regular expressions
ereg_replace():
string ereg_replace(string pattern, string replacement, string search);
This function searches the string search for the regular expression pattern and replaces each match with the string replacement.
The eregi_replace() function is the same, but case-insensitive.
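As a sketch of regex-based replacement that runs on current PHP, the example below uses preg_replace(), the PCRE counterpart of ereg_replace(). The i modifier after the closing delimiter plays the role of eregi_replace(). The input strings are invented:

```php
<?php
// Collapse every run of whitespace into a single space.
$collapsed = preg_replace('/[[:space:]]+/', ' ', "too   many \t spaces");

// Case-insensitive replacement via the i modifier.
$renamed = preg_replace('/shop/i', 'store', 'Shop and SHOP');

echo $collapsed, "\n", $renamed, "\n";
```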
8. Splitting strings with regular expressions
split():
array split(string pattern, string search[, int max]);
This function splits the string search into substrings at each match of the regular expression pattern and returns the substrings in an array.
$address = "username@example.com";
$arr = split("\.|@", $address);
while (list($key, $value) = each($arr)) {
    echo "<br />" . $value;
}
Running this code reports an error.
The problem is caused by the PHP version:
As of PHP 5.3.0, the POSIX regex functions are deprecated in favor of the PCRE functions (presumably to unify on one regex flavor and avoid having too many).
The following list therefore maps each deprecated POSIX function to its recommended PCRE replacement. For details, see "Differences from POSIX regex" in the PHP manual.
* POSIX → PCRE
* ereg_replace() → preg_replace()
* ereg() → preg_match()
* eregi_replace() → preg_replace()
* eregi() → preg_match()
* split() → preg_split()
* spliti() → preg_split()
* sql_regcase() → no equivalent
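After switching to PCRE:
$address = "username@example.com";
// Inside a character class, "." is literal and "|" is not needed.
$arr = preg_split("/[.@]+/", $address);
// each() was removed in PHP 8.0, so iterate with foreach.
foreach ($arr as $value) {
    echo "<br />" . $value;
}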
This is a little unfortunate, since the second half of this chapter is all about POSIX regex. For more information about PCRE, see the online manual: http://www.php.net/pcre.
9. Summary
In general, regular expression functions run less efficiently than string functions that do the same job. If the task is simple enough, use string functions.
However, for a task that a single regular expression can handle, using several string functions instead is not recommended.
Compiled from PHP and MySQL Web Development, 4th edition.