Question marks when truncating Chinese character strings in PHP

Source: Internet
Author: User
When you truncate Chinese text with PHP's built-in string functions, the result may contain question marks or garbled characters. Below are some examples that truncate Chinese characters accurately.

Working with strings in PHP comes down to two problems:

1. Determine whether the string is encoded as GBK or Unicode (typically UTF-8).

2. Extract the substring according to that encoding.
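The first step can be sketched as follows. This is only an illustration: guessEncoding is a hypothetical helper name, and it assumes that anything which is not valid UTF-8 should be treated as GBK. It relies on the fact that preg_match() with the u modifier validates the subject as UTF-8 and returns false when the bytes are invalid.

```php
<?php
// Hypothetical helper (not from the original article): guess whether a byte
// string is UTF-8 or a legacy double-byte encoding such as GBK.
function guessEncoding(string $bytes): string
{
    // With the /u modifier PCRE checks the subject is valid UTF-8 first;
    // preg_match() returns false (not 0) when it is not.
    if (preg_match('//u', $bytes) === 1) {
        return 'utf-8';
    }
    // Assumption: everything that is not valid UTF-8 is treated as GBK.
    return 'gbk';
}

$utf8 = "\xE4\xB8\xAD\xE6\x96\x87"; // "中文" encoded as UTF-8
$gbk  = "\xD6\xD0\xCE\xC4";         // "中文" encoded as GBK

echo guessEncoding($utf8), "\n";    // utf-8
echo guessEncoding($gbk), "\n";     // gbk
```

Pure ASCII text is reported as utf-8 here, which is harmless because ASCII bytes are the same in both encodings. A short GBK string can occasionally happen to be valid UTF-8 as well, so treat the result as a guess.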

In general, garbled characters appear when substr is used to truncate Chinese text, because substr counts bytes and Chinese characters are double-byte characters in GBK (and take three bytes in UTF-8). When the cut falls in the middle of a character, that character can no longer be displayed and shows up as garbage.
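A small demonstration of the problem, written with byte escapes so it does not depend on the encoding of the source file. The sample text is an assumption chosen for illustration.

```php
<?php
$text = "\xD6\xD0\xCE\xC4";   // "中文" encoded as GBK: two characters, four bytes

echo strlen($text), "\n";     // 4, because strlen() counts bytes, not characters

$cut = substr($text, 0, 3);   // the cut lands inside the second character
echo bin2hex($cut), "\n";     // d6d0ce

// The last byte, 0xCE, is the first half of the second character; its second
// half (0xC4) was cut off, so it cannot be displayed on its own.
```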

In fact, the solution is very simple. See the following truncation function. The code is as follows:

  // Truncate an overly long string
  function curtStr($str, $len = 30) {
      if (strlen($str) > $len) {
          $str = substr($str, 0, $len);
          // Append a NUL byte so a split lead byte has something to pair with
          $str .= chr(0) . "...";
      }
      return $str;
  }

Note that chr(0) above is not null.

null means "no value", whereas chr(0) is the character with code 0: its hexadecimal value is 0x00 and its binary value is 00000000.

Although chr(0) does not display anything, it is still a character.
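The difference is easy to check. The variable names below are only for illustration.

```php
<?php
$nul = chr(0);

var_dump($nul === "\0");    // true: chr(0) is the one-byte string "\0"
var_dump(strlen($nul));     // 1: it occupies one byte
var_dump(bin2hex($nul));    // "00"
var_dump($nul === null);    // false: it is a string, not null

// The NUL byte stays in the data even though nothing is displayed for it.
$s = substr("abcdef", 0, 3) . chr(0) . "...";
var_dump(strlen($s));       // 7: three letters, one NUL byte, three dots
```

Because the byte remains in the string, it will also be stored or transmitted with it, which is worth keeping in mind before writing such a string to a database or a file.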

When a Chinese character is cut in half, the encoding rules make the remaining lead byte take the byte that follows it as its second half, and this is the cause of the garbled characters. The combination of a byte from 0x81 to 0xff followed by 0x00 is always displayed as "nothing". Based on this feature, appending chr(0) to the result of substr prevents garbled characters.

The following are some functions that can be used to truncate Chinese strings precisely. The first one truncates a UTF-8 encoded multi-byte string. The code is as follows:

  <?php
  // Truncate a UTF-8 string: skip $from characters, then keep up to $len characters
  function utf8Substr($str, $from, $len)
  {
      return preg_replace(
          '#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $from . '}' .
          '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $len . '}).*#s',
          '$1',
          $str
      );
  }
  ?>

A truncation function that supports Chinese text in both UTF-8 and GB2312. The code is as follows:

  <?php
  /*
  Chinese character truncation function supporting UTF-8 and GB2312
  cut_str(string, truncation length, start offset, encoding);
  The default encoding is UTF-8.
  The default start offset is 0.
  */

  function cut_str($string, $sublen, $start = 0, $code = 'utf-8')
  {
      if ($code == 'utf-8')
      {
          // One alternative per UTF-8 sequence length (1 to 4 bytes)
          $pa = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|\xe0[\xa0-\xbf][\x80-\xbf]|[\xe1-\xef][\x80-\xbf][\x80-\xbf]|\xf0[\x90-\xbf][\x80-\xbf][\x80-\xbf]|[\xf1-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]/';
          preg_match_all($pa, $string, $t_string);

          if (count($t_string[0]) - $start > $sublen) return join('', array_slice($t_string[0], $start, $sublen)) . "...";
          return join('', array_slice($t_string[0], $start, $sublen));
      }
      else
      {
          // GB2312: offsets are doubled because a Chinese character takes two bytes
          $start = $start * 2;
          $sublen = $sublen * 2;
          $strlen = strlen($string);
          $tmpstr = '';

          for ($i = 0; $i < $strlen; $i++)
          {
              if ($i >= $start && $i < ($start + $sublen))
              {
                  if (ord(substr($string, $i, 1)) > 129)
                  {
                      $tmpstr .= substr($string, $i, 2);
                  }
                  else
                  {
                      $tmpstr .= substr($string, $i, 1);
                  }
              }
              // Skip the second byte of a double-byte character
              if (ord(substr($string, $i, 1)) > 129) $i++;
          }
          if (strlen($tmpstr) < $strlen) $tmpstr .= "...";
          return $tmpstr;
      }
  }

  // Sample input; in practice this would be GB2312-encoded Chinese text
  $str = "the string to be intercepted by abcd";
  echo cut_str($str, 8, 0, 'gb2312');
  ?>
