PHP truncation string length (mixed Chinese and English strings)


Last Update:2017-05-13 Source: Internet

Author: User


This article covers string truncation in PHP. It starts with the built-in substr function and works up to functions that support Chinese characters, so that mixed Chinese and English strings can be cut without producing garbled text. Refer to it if you need it.

substr: extract part of a string.

Syntax: string substr (string string, int start, int [length]);

Return value: string

Function type: data processing

Description

This function returns the part of string that begins at position start, counting from 0. If start is negative, the position is counted from the end of the string. The length parameter may be omitted, in which case everything up to the end of the string is returned. If length is negative, that many characters are left off the end of the string.

Example

The code is as follows:
<?php
echo substr("abcdef", 1, 3);  // returns "bcd"
echo substr("abcdef", -2);    // returns "ef"
echo substr("abcdef", -3, 1); // returns "d"
echo substr("abcdef", 1, -1); // returns "bcde"
?>

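substr counts bytes, not characters. It is safe for single-byte English text, but it can cut a multi-byte Chinese character in the middle and produce garbled output.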

Truncating GB2312 Chinese strings.

The code is as follows:
<?php
// Truncate a GB2312 string without splitting a double-byte character.
// $start and $len are byte counts.
function mysubstr($str, $start, $len) {
    $tmpstr = "";
    $strlen = min($start + $len, strlen($str));
    for ($i = 0; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 0xa0) {
            // Lead byte of a double-byte character: take both bytes together.
            if ($i >= $start) {
                $tmpstr .= substr($str, $i, 2);
            }
            $i++;
        } elseif ($i >= $start) {
            $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
?>

Truncating UTF-8 encoded multi-byte strings.

The code is as follows:
<?php
// Truncate a UTF-8 string; $from and $len are counted in characters.
function utf8Substr($str, $from, $len) {
    return preg_replace(
        '#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $from . '}' .
        '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $len . '}).*#s',
        '$1',
        $str
    );
}
?>

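/*
 * Function: works like substr, but does not produce garbled characters.
 * Parameters: $str (the UTF-8 string), $start (byte offset; a negative value counts from the end), $length (number of bytes, optional)
 * Return value: the extracted string, widened where necessary so that it contains only whole characters
 */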

The code is as follows:

function utf8_substr($str, $start, $length = null) {
    $strlen = strlen($str);

    // First cut at plain byte offsets, exactly as substr() would.
    $res = $length === null ? substr($str, $start) : substr($str, $start, $length);
    if ($res === false || $res === '') {
        return '';
    }

    // Turn a negative start into an absolute byte offset.
    if ($start < 0) {
        $start = max($strlen + $start, 0);
    }

    /* Check up to 6 bytes on each side of the cut for a split character. */

    // Up to 6 bytes that follow the end of the cut. Using strlen($res)
    // keeps this correct when $length is omitted or negative.
    $next_start = $start + strlen($res);
    $next_segm = (string) substr($str, $next_start, 6);

    // Up to 6 bytes that precede the start of the cut.
    $prev_start = $start - 6 > 0 ? $start - 6 : 0;
    $prev_segm = (string) substr($str, $prev_start, $start - $prev_start);

    // If the cut ended inside a character, append its remaining
    // continuation bytes (0x80-0xBF).
    if (preg_match('@^([\x80-\xBF]{0,5})[\xC0-\xFD]?@', $next_segm, $bytes)) {
        if (!empty($bytes[1])) {
            $res .= $bytes[1];
        }
    }

    // If the cut began on a continuation byte, prepend the lead byte and
    // any continuation bytes that come before it.
    $ord0 = ord($res[0]);
    if (128 <= $ord0 && 191 >= $ord0) {
        if (preg_match('@[\xC0-\xFD][\x80-\xBF]{0,5}$@', $prev_segm, $bytes)) {
            if (!empty($bytes[0])) {
                $res = $bytes[0] . $res;
            }
        }
    }

    return $res;
}

Test data:

The code is as follows:
$str = 'dfjdjf test 13f test 65 & 2 data fdj (1 for mfe &...... ';
var_dump(utf8_substr($str, 22, 12)); echo ' ';
var_dump(utf8_substr($str, 22, -6)); echo ' ';
var_dump(utf8_substr($str, 9, 12)); echo ' ';
var_dump(utf8_substr($str, 19, 12)); echo ' ';
var_dump(utf8_substr($str, 28, -6)); echo ' ';

Result (no garbled characters are produced; you are welcome to test it and report bugs). The Chinese characters of the original test string appear to have been lost in translation, so the byte counts below refer to the original string rather than to the text shown here:
string(12) "fdj"
string(26) "fdj (1 is mfe &... "
string(13) "13f trial 65 & 2"
string(12) "Data fd"
string(20) "dj (1 is mfe &... "

Finally, here is a commonly used truncation function for Chinese text.

The code is as follows:

function MooCutstr($string, $length, $dot = '...') {
    global $charset;

    // Short enough already: return it unchanged.
    if (strlen($string) <= $length) {
        return $string;
    }
    // Decode common HTML entities so that each one counts as a single character.
    $string = str_replace(array('&amp;', '&quot;', '&lt;', '&gt;'), array('&', '"', '<', '>'), $string);
    $strcut = '';
    if (strtolower($charset) == 'utf-8') {
        // $n: byte position, $tn: byte length of the last character,
        // $noc: length counted so far (ASCII counts 1, multi-byte counts 2).
        $n = $tn = $noc = 0;
        while ($n < strlen($string)) {
            $t = ord($string[$n]);
            if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                $tn = 1; $n++; $noc++;
            } elseif (194 <= $t && $t <= 223) {
                $tn = 2; $n += 2; $noc += 2;
            } elseif (224 <= $t && $t <= 239) {
                $tn = 3; $n += 3; $noc += 2;
            } elseif (240 <= $t && $t <= 247) {
                $tn = 4; $n += 4; $noc += 2;
            } elseif (248 <= $t && $t <= 251) {
                $tn = 5; $n += 5; $noc += 2;
            } elseif ($t == 252 || $t == 253) {
                $tn = 6; $n += 6; $noc += 2;
            } else {
                $n++;
            }
            if ($noc >= $length) {
                break;
            }
        }
        // The last character went over the limit: drop it.
        if ($noc > $length) {
            $n -= $tn;
        }
        $strcut = substr($string, 0, $n);
    } else {
        // Double-byte encodings such as GB2312: keep both bytes of a character together.
        for ($i = 0; $i < $length; $i++) {
            $strcut .= ord($string[$i]) > 127 ? $string[$i] . $string[++$i] : $string[$i];
        }
    }
    // $strcut = str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $strcut);

    return $strcut . $dot;
}


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
