Truncating strings in PHP (mixed Chinese and English strings) _ PHP Tutorial

Source: Internet
Author: User
This article covers string truncation in PHP, starting from the built-in substr function and ending with functions that handle Chinese characters, so that strings mixing Chinese and English can be cut without producing garbled text. Refer to it if you need it.

substr: take part of a string.

Syntax: string substr(string string, int start [, int length]);

Return value: string

Function type: data processing

Description

This function returns the part of the string that begins at position start (positions are counted from 0). If start is negative, the position is counted from the end of the string. The length parameter can be omitted, in which case everything up to the end of the string is returned; if length is negative, that many bytes are left off the end of the string.

Example

The code is as follows:

<?php
echo substr("abcdef", 1, 3);  // returns "bcd"
echo substr("abcdef", -2);    // returns "ef"
echo substr("abcdef", -3, 1); // returns "d"
echo substr("abcdef", 1, -1); // returns "bcde"
?>

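substr counts bytes, not characters. It is safe for single-byte English text, but it can cut a multi-byte Chinese character in half and produce garbled output.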


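Truncating GB2312 Chinese strings.

The code is as follows:

<?php
// Truncate a GB2312-encoded Chinese string.
// $start and $len are byte offsets; a double-byte character is never split.
function mysubstr($str, $start, $len) {
    $tmpstr = "";
    $end = min($start + $len, strlen($str));
    // Scan from the beginning so that character boundaries stay aligned.
    for ($i = 0; $i < $end; $i++) {
        $pos = $i;
        if (ord($str[$i]) > 0xa0) {
            // A byte above 0xA0 starts a double-byte character.
            $char = substr($str, $i, 2);
            $i++;
        } else {
            $char = $str[$i];
        }
        // Keep only characters at or after $start
        // (the original version ignored $start and always returned from byte 0).
        if ($pos >= $start) {
            $tmpstr .= $char;
        }
    }
    return $tmpstr;
}
?>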

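Truncating UTF-8 encoded multi-byte strings.

The code is as follows:

<?php
// Truncate a UTF-8 string. $from and $len are counted in characters, not bytes.
function utf8Substr($str, $from, $len)
{
    // Skip up to $from characters, capture up to $len characters, drop the rest.
    // A character is one ASCII byte, or a lead byte followed by continuation bytes.
    return preg_replace('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $from . '}' .
        '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $len . '}).*#s',
        '$1', $str);
}
?>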

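/*
 * Function: works like substr, but never returns half of a character (no garbled output).
 * Parameters: $str, $start, $length (byte offsets, as in substr)
 * Return value: the substring, extended to whole UTF-8 characters
 */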

The code is as follows:

function utf8_substr($str, $start, $length = null) {

    // First cut the string the ordinary way, by bytes.
    $res = ($length === null) ? substr($str, $start) : substr($str, $start, $length);
    $strlen = strlen($str);
    if ($res === false || $res === '') {
        return '';
    }

    /* Then check whether the characters at the head and tail of the cut are complete. */

    // Convert a negative start into an offset from the beginning of the string.
    if ($start < 0) {
        $start = max($strlen + $start, 0);
    }

    // Up to 6 bytes following the cut. Measuring from strlen($res)
    // also covers an omitted or negative $length.
    $next_segm = (string) substr($str, $start + strlen($res), 6);

    // Up to 6 bytes preceding the cut.
    $prev_start = max($start - 6, 0);
    $prev_segm = (string) substr($str, $prev_start, $start - $prev_start);

    // If the cut ended inside a character, append its remaining continuation bytes.
    if (preg_match('@^([\x80-\xBF]{0,5})[\xC0-\xFD]?@', $next_segm, $bytes)) {
        if (!empty($bytes[1])) {
            $res .= $bytes[1];
        }
    }

    // If the cut began inside a character, prepend that character's leading bytes.
    $ord0 = ord($res[0]);
    if (128 <= $ord0 && 191 >= $ord0) {
        if (preg_match('@[\xC0-\xFD][\x80-\xBF]{0,5}$@', $prev_segm, $bytes)) {
            if (!empty($bytes[0])) {
                $res = $bytes[0] . $res;
            }
        }
    }

    return $res;
}

Test data:

The code is as follows:
$str = 'dfjdjf test 13f test 65 & 2 data fdj (1 for mfe &...... ';
var_dump(utf8_substr($str, 22, 12)); echo "\n";
var_dump(utf8_substr($str, 22, -6)); echo "\n";
var_dump(utf8_substr($str, 9, 12)); echo "\n";
var_dump(utf8_substr($str, 19, 12)); echo "\n";
var_dump(utf8_substr($str, 28, -6)); echo "\n";

Result (no character is cut in half; the byte counts apparently refer to the original test string, whose Chinese characters appear here in translation. You are welcome to test and report bugs):
string(12) "fdj"
string(26) "fdj (1 is mfe &... "
string(13) "13f trial 65 & 2"
string(12) "Data fd"
string(20) "dj (1 is mfe &... "

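A commonly used function

Finally, here is a Chinese truncation function that sees frequent use. It cuts by display width, counting an ASCII character as 1 and a multi-byte character as 2, and appends a trailing marker.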

The code is as follows:

function MooCutstr($string, $length, $dot = '...') {
    global $charset;

    if (strlen($string) <= $length) {
        return $string;
    }
    // Decode HTML entities so that each one is counted as a single character.
    $string = str_replace(array('&amp;', '&quot;', '&lt;', '&gt;'), array('&', '"', '<', '>'), $string);
    $strcut = '';
    if (strtolower($charset) == 'utf-8') {
        // $n: byte position, $tn: byte length of the last character, $noc: width so far
        $n = $tn = $noc = 0;
        while ($n < strlen($string)) {
            $t = ord($string[$n]);
            if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                $tn = 1; $n++; $noc++;
            } elseif (194 <= $t && $t <= 223) {
                $tn = 2; $n += 2; $noc += 2;
            } elseif (224 <= $t && $t <= 239) {
                // The original tested $t < 239, which skipped lead byte 0xEF.
                $tn = 3; $n += 3; $noc += 2;
            } elseif (240 <= $t && $t <= 247) {
                $tn = 4; $n += 4; $noc += 2;
            } elseif (248 <= $t && $t <= 251) {
                $tn = 5; $n += 5; $noc += 2;
            } elseif ($t == 252 || $t == 253) {
                $tn = 6; $n += 6; $noc += 2;
            } else {
                $n++;
            }
            if ($noc >= $length) {
                break;
            }
        }
        // If the last character overshot the limit, drop it.
        if ($noc > $length) {
            $n -= $tn;
        }
        $strcut = substr($string, 0, $n);
    } else {
        // Double-byte encodings such as GB2312: keep both bytes of a character together.
        for ($i = 0; $i < $length; $i++) {
            $strcut .= ord($string[$i]) > 127 ? $string[$i] . $string[++$i] : $string[$i];
        }
    }
    // $strcut = str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $strcut);

    return $strcut . $dot;
}
