PHP: extracting 100 characters from a Chinese string, starting at the first occurrence of a given substring

Source: Internet
Author: User
In PHP, I want to cut a Chinese string at the first occurrence of a given substring and take 100 characters from that point.
As the title says. I tried the following two approaches and found that neither gives the correct result; please point out what is wrong.
Here $word is the string to be cut and $key_word is the given substring.
Method One:
PHP Code
  
   mb_substr($word, strpos($word, $key_word) / 3, 100, 'utf-8');


Method Two:
PHP Code
  
   $start_key = mb_strpos($word, $key_word);
   $start_key = $start_key > 0 ? $start_key : 0;
   mb_substr($word, $start_key, 100, 'utf-8');


------Solution--------------------
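I found a very useful function, mb_strimwidth(), which truncates a string to a given display width. It takes the string, a start offset (0 here), the width, a trim marker (an empty string here) and the encoding ('UTF-8').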
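A minimal sketch of how mb_strimwidth() behaves, since it counts display width rather than characters. The sample string and the widths 6 and 9 are assumptions chosen for illustration; they are not from the original post.

```php
<?php
// Hypothetical sample: seven full-width characters, each of display width 2.
$str = '中文字符串截取';

// Width 6 with an empty trim marker keeps three full-width characters.
$plain = mb_strimwidth($str, 0, 6, '', 'UTF-8');

// With a trim marker, the marker's own width counts toward the limit,
// so the result never exceeds the requested width of 9.
$marked = mb_strimwidth($str, 0, 9, '...', 'UTF-8');

echo $plain, "\n";   // 中文字
echo $marked, "\n";
```

Note that this is a width limit, not a character count: a limit of 100 would keep only about 50 Chinese characters, which is not what the question asks for.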
------Solution--------------------
Honestly, this makes me sweat. Code written by people who do not understand character encodings is painful to read. Let me spell it out.

Remember that strstr/strpos work on byte strings: they compare one byte at a time and know nothing about encodings. With GBK or UTF-8 they can still work under certain circumstances, because the lead byte of every non-ASCII character has its high bit set (in UTF-8, every byte of such a character does). GBK is prone to problems, however: the trailing byte of one 2-byte character together with the leading byte of the next can look like a different character and produce a false match.

The mb_* functions are encoding-aware, so the offsets and lengths you pass to them, and the positions they return, are counted in characters, not in bytes.

Your first snippet uses strpos. With UTF-8 the search itself is fine; with other encodings there is no guarantee. But even with UTF-8 you assume that every character is 3 bytes long, and that is a mistake: ASCII characters take 1 byte, and other characters take 2 or 4.

The second snippet is more reliable, but unfortunately you did not tell mb_strpos which encoding to use, so it is still incomplete.




------Solution--------------------
That is not how the mbstring functions are meant to be used. Set the internal encoding once, then call them without an encoding argument:

mb_internal_encoding("UTF-8");
mb_substr($word, mb_strpos($word, $key_word), 100);
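A short sketch of the same idea with sample data, assuming the script is free to change the internal encoding. The sample string is hypothetical, and saving and restoring the previous setting is an addition that is not part of the answer above.

```php
<?php
$previous = mb_internal_encoding();   // remember the current setting
mb_internal_encoding('UTF-8');

// Hypothetical sample data.
$word = '中文字符串截取';
$key_word = '截取';

// No encoding argument: both calls use the internal encoding set above.
$result = mb_substr($word, mb_strpos($word, $key_word), 100);
echo $result, "\n"; // 截取

mb_internal_encoding($previous);      // restore the caller's setting
```

This keeps each call short, at the cost of relying on global state; passing the encoding explicitly to every call avoids that dependency.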
------Solution--------------------
PHP Code
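   // String truncation: every character counts as length 1 (the encoding is fixed to UTF-8 here).
   function cut($str, $len, $dot = '...') {
       // Short enough already: return the string unchanged.
       if (mb_strlen($str, "utf-8") <= ($len + 1)) {
           return $str;
       }
       // Otherwise keep the first $len characters and append the marker.
       return mb_substr($str, 0, $len, "utf-8") . $dot;
   }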