Truncating strings in PHP (mixed Chinese and English text). This article walks through string truncation in PHP, starting with the built-in substr function and ending with functions that handle Chinese characters and mixed Chinese/English strings. Refer to it if you need it.
substr: returns part of a string.
Syntax: string substr(string string, int start, int [length]);
Return value: string
Function type: data processing
Description
This function returns the part of string that begins at position start and is at most length bytes long. If start is negative, the position is counted from the end of the string. The length parameter can be omitted, in which case everything up to the end of the string is returned; if length is negative, the result stops that many bytes before the end of the string.
Example
The code is as follows:
<?php
echo substr("abcdef", 1, 3);  // returns "bcd"
echo substr("abcdef", -2);    // returns "ef"
echo substr("abcdef", -3, 1); // returns "d"
echo substr("abcdef", 1, -1); // returns "bcde"
?>
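substr counts bytes, so it is only safe for single-byte (English) text. A Chinese character occupies several bytes, and substr can cut it in half and produce garbled output.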
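Truncating GB2312-encoded Chinese strings.
The code is as follows:
<?php
// Truncate a GB2312 string without splitting a double-byte Chinese character.
// $start and $len are byte counts.
function mysubstr($str, $start, $len)
{
    $tmpstr = "";
    // Never read past the end of the string.
    $end = min($start + $len, strlen($str));
    // Walk from byte 0 so that lead and trail bytes are paired correctly,
    // but only collect characters that begin at or after $start.
    for ($i = 0; $i < $end; $i++) {
        if (ord($str[$i]) > 0xa0) {
            // Lead byte of a double-byte character: take both bytes together.
            if ($i >= $start) {
                $tmpstr .= substr($str, $i, 2);
            }
            $i++;
        } elseif ($i >= $start) {
            $tmpstr .= $str[$i];
        }
    }
    return $tmpstr;
}
?>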
Truncating UTF-8 encoded multi-byte strings.
The code is as follows:
<?php
// Truncate a UTF-8 string: skip up to $from characters, then keep up to $len characters.
function utf8Substr($str, $from, $len)
{
    // One "character" is either an ASCII byte or a lead byte followed by continuation bytes.
    return preg_replace(
        '#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $from . '}' .
        '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $len . '}).*#s',
        '$1',
        $str
    );
}
?>
/*
* Function: Works like substr, but never returns a partial multi-byte character (no garbled output).
* Parameters:
* Return value:
*/
The code is as follows:
<?php
function utf8_substr($str, $start, $length = null)
{
    $strlen = strlen($str);
    // Take a plain byte-wise slice first.
    $res = (string) ($length === null ? substr($str, $start) : substr($str, $start, $length));
    if ($res === '') {
        return $res;
    }
    // Convert a negative start into an absolute byte offset.
    if ($start < 0) {
        $start = max($strlen + $start, 0);
    }
    /* Check up to 6 bytes on each side of the slice for an incomplete character. */
    // The bytes that follow the slice. Measuring $res keeps the offset correct
    // when $length is negative or omitted.
    $next_start = $start + strlen($res);
    $next_segm = (string) substr($str, $next_start, 6);
    // The bytes that precede the slice.
    $prev_start = $start - 6 > 0 ? $start - 6 : 0;
    $prev_segm = (string) substr($str, $prev_start, $start - $prev_start);
    // If the slice ends in the middle of a character, append the remaining continuation bytes.
    if (preg_match('@^([\x80-\xBF]{0,5})[\xC0-\xFD]?@', $next_segm, $bytes)) {
        if (!empty($bytes[1])) {
            $res .= $bytes[1];
        }
    }
    // If the slice starts with a continuation byte, prepend the beginning of that character.
    $ord0 = ord($res[0]);
    if (128 <= $ord0 && 191 >= $ord0) {
        if (preg_match('@[\xC0-\xFD][\x80-\xBF]{0,5}$@', $prev_segm, $bytes)) {
            if (!empty($bytes[0])) {
                $res = $bytes[0] . $res;
            }
        }
    }
    return $res;
}
Test data:
The code is as follows:
$str = 'dfjdjf test 13f test 65 & 2 data fdj (1 for mfe &...... ';
var_dump(utf8_substr($str, 22, 12)); echo ' ';
var_dump(utf8_substr($str, 22, -6)); echo ' ';
var_dump(utf8_substr($str, 9, 12)); echo ' ';
var_dump(utf8_substr($str, 19, 12)); echo ' ';
var_dump(utf8_substr($str, 28, -6)); echo ' ';
Result (no character is cut in half; you are welcome to test it and report any bugs):
string(12) "fdj"
string(26) "fdj (1 is mfe &... "
string(13) "13f trial 65 & 2"
string(12) "Data fd"
string(20) "dj (1 is mfe &... "
Finally, here is a commonly used truncation function for Chinese text.
The code is as follows:
function MooCutstr($string, $length, $dot = '...')
{
    global $charset;
    // Short enough already: return unchanged.
    if (strlen($string) <= $length) {
        return $string;
    }
    // Decode common HTML entities so that each one counts as a single character.
    $string = str_replace(array('&amp;', '&quot;', '&lt;', '&gt;'), array('&', '"', '<', '>'), $string);
    $strcut = '';
    if (strtolower($charset) == 'utf-8') {
        // $n: byte position, $tn: byte length of the last character, $noc: display width so far.
        $n = $tn = $noc = 0;
        while ($n < strlen($string)) {
            $t = ord($string[$n]);
            if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                $tn = 1; $n++; $noc++;
            } elseif (194 <= $t && $t <= 223) {
                $tn = 2; $n += 2; $noc += 2;
            } elseif (224 <= $t && $t <= 239) {
                // 239 (0xEF) is included: it is a valid 3-byte lead byte.
                $tn = 3; $n += 3; $noc += 2;
            } elseif (240 <= $t && $t <= 247) {
                $tn = 4; $n += 4; $noc += 2;
            } elseif (248 <= $t && $t <= 251) {
                $tn = 5; $n += 5; $noc += 2;
            } elseif ($t == 252 || $t == 253) {
                $tn = 6; $n += 6; $noc += 2;
            } else {
                $n++;
            }
            if ($noc >= $length) {
                break;
            }
        }
        // The last character overshot the limit: drop it.
        if ($noc > $length) {
            $n -= $tn;
        }
        $strcut = substr($string, 0, $n);
    } else {
        // Double-byte encodings such as GB2312: keep both bytes of a character together.
        for ($i = 0; $i < $length; $i++) {
            $strcut .= ord($string[$i]) > 127 ? $string[$i] . $string[++$i] : $string[$i];
        }
    }
    // $strcut = str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $strcut);
    return $strcut . $dot;
}