How to convert Unicode to UTF-8 in PHP

Source: Internet
Author: User
This article shares a way to convert Unicode escape sequences to UTF-8 in PHP. I found it useful, so I am passing it on here as a reference. Let's take a look.

Examples are as follows:

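function unescape($str) {
  $str = rawurldecode($str);
  // Split the input into %uXXXX, &#xXXXX; and &#DDDD; escapes plus plain bytes.
  // The U modifier makes .+ ungreedy so it does not swallow later escapes.
  preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U", $str, $r);
  $ar = $r[0];
  print_r($ar); // debug output of the matched pieces
  foreach ($ar as $k => $v) {
    if (substr($v, 0, 2) == "%u") {
      // %uXXXX: the last 4 characters are the hex code unit
      $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, -4)));
    } elseif (substr($v, 0, 3) == "&#x") {
      // &#xXXXX; : hex digits between "&#x" and ";"
      $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, 3, -1)));
    } elseif (substr($v, 0, 2) == "&#") {
      // &#DDDD; : decimal value packed as a big-endian 16-bit integer
      $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("n", (int) substr($v, 2, -1)));
    }
  }
  return join("", $ar);
}
// Sample input: the source page shows only the decoded, translated text
// ("Violet Star Blue"); the escaped form of 紫星蓝 is assumed here.
echo unescape("&#x7D2B;&#x661F;&#x84DD;");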
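
As a more compact variation on the function above, the same three escape forms can be handled in a single preg_replace_callback pass. This is only a sketch under stated assumptions: the name unescapeToUtf8 is hypothetical, it packs each code point as UTF-32BE (instead of UCS-2BE) so that code points above U+FFFF also work, and unlike the original it does not call rawurldecode first.

```php
<?php
// Hypothetical helper, not part of the original article.
function unescapeToUtf8(string $str): string
{
    $result = preg_replace_callback(
        // %uXXXX, &#xXXXX; (hex) or &#DDDD; (decimal)
        '/%u([0-9A-Fa-f]{4})|&#x([0-9A-Fa-f]{1,6});|&#(\d{1,7});/',
        function (array $m): string {
            if ($m[1] !== '') {
                $codePoint = hexdec($m[1]);
            } elseif ($m[2] !== '') {
                $codePoint = hexdec($m[2]);
            } else {
                $codePoint = (int) $m[3];
            }
            // Pack the code point as 4 big-endian bytes and let iconv emit UTF-8.
            $char = @iconv('UTF-32BE', 'UTF-8', pack('N', $codePoint));
            // If iconv rejects the value, leave the escape untouched.
            return ($char === false || $char === '') ? $m[0] : $char;
        },
        $str
    );
    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $str;
}

echo unescapeToUtf8('%u4E2D&#x6587;'), "\n";
```

The call at the end should print the two characters U+4E2D and U+6587. Text that contains no escapes passes through unchanged.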

Today a user reported that Chinese data submitted through the form system came out garbled. Testing showed that the problem was in the iconv conversion (here 'Chinese' stands for the submitted Chinese text):

iconv('UCS-2', 'GBK', 'Chinese')

A Google search showed the cause: the default byte order of UCS-2 on the Linux server is different from the one on Windows.

So I changed it to

iconv('UCS-2BE', 'GBK', 'Chinese')

After this change, the Chinese text displayed correctly.
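
To see why naming the byte order matters, the sketch below decodes the same four bytes once as UCS-2BE and once as UCS-2LE. The byte string is an assumed sample (the big-endian UCS-2 form of U+4E2D U+6587, the characters 中文), and the target is UTF-8 rather than GBK only so that the result is easy to inspect.

```php
<?php
// Assumed sample data: U+4E2D U+6587 stored as big-endian UCS-2.
$ucs2 = "\x4E\x2D\x65\x87";

// Naming the byte order explicitly does not depend on the platform default.
$fromBe = iconv('UCS-2BE', 'UTF-8', $ucs2);

// Reading the same bytes as little-endian swaps each pair: U+2D4E U+8765.
$fromLe = iconv('UCS-2LE', 'UTF-8', $ucs2);

echo bin2hex($fromBe), "\n"; // e4b8ade69687 (UTF-8 bytes of U+4E2D U+6587)
echo bin2hex($fromLe), "\n"; // different bytes: the text is garbled
```

Both calls succeed without any error, which is why a wrong byte order shows up as garbled text rather than as a failed conversion.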

Below are some unwritten rules about UCS-2 encoding on the two platforms:

1. UCS-2 is not the same as UTF-16. UCS-2 always uses exactly two bytes per character and can only represent code points up to U+FFFF. UTF-16 encodes those code points the same way, but it represents code points above U+FFFF with a surrogate pair, which takes four bytes. In both encodings, individual bytes can fall outside the ASCII range.

2. On Windows, UCS-2 defaults to UCS-2LE. Unicode text generated with MultiByteToWideChar (or the A2W macro) is UCS-2LE. Windows Notepad can also save text as UCS-2BE, which amounts to an extra conversion step.

3. On Linux, UCS-2 defaults to UCS-2BE: calling iconv with plain UCS-2 produces UCS-2BE data. To convert UCS-2 data that came from a Windows platform, specify UCS-2LE. Naming the byte order explicitly avoids relying on either default.

4. Because Windows and Linux (and other platforms) interpret UCS-2 differently (UCS-2LE versus UCS-2BE), Microsoft recommends that Unicode text start with a byte order mark (the bytes FF FE for UCS-2LE, FE FF for UCS-2BE). The mark indicates that the following characters are Unicode and whether they are big-endian or little-endian. So if data coming from a Windows platform carries this prefix, don't panic.

5. On Linux, encoded output, whether written to a file or printed with printf, has to match the encoding of whatever displays it (a mismatch usually traces back to the encoding the program was compiled with), and for console input you need to check the current system encoding. For example, if the console's current encoding is UTF-8, UTF-8 text displays correctly but GBK text does not; likewise, if the current encoding is GBK, GBK text displays correctly. Future systems may handle more of these conversions automatically. When connecting through PuTTY or a similar terminal, however, you still need to set the terminal's character encoding correctly to avoid garbled text.
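
The byte order mark described in rule 4 can be used to choose the iconv source encoding. The following is a minimal sketch rather than a complete solution: the function name ucs2ToUtf8 and its $default fallback parameter are hypothetical, and it assumes the input really is UCS-2.

```php
<?php
// Hypothetical helper: convert UCS-2 bytes to UTF-8, honoring a leading BOM.
// Returns the UTF-8 string, or false if iconv fails.
function ucs2ToUtf8(string $bytes, string $default = 'UCS-2BE')
{
    $bom = substr($bytes, 0, 2);
    if ($bom === "\xFF\xFE") {
        // FF FE: little-endian, typical of data produced on Windows.
        $encoding = 'UCS-2LE';
        $bytes = (string) substr($bytes, 2);
    } elseif ($bom === "\xFE\xFF") {
        // FE FF: big-endian.
        $encoding = 'UCS-2BE';
        $bytes = (string) substr($bytes, 2);
    } else {
        // No BOM: fall back to the caller's assumption.
        $encoding = $default;
    }
    return iconv($encoding, 'UTF-8', $bytes);
}

$withBom    = "\xFF\xFE\x2D\x4E\x87\x65"; // BOM + U+4E2D U+6587, little-endian
$withoutBom = "\x4E\x2D\x65\x87";         // same text, big-endian, no BOM
var_dump(ucs2ToUtf8($withBom) === ucs2ToUtf8($withoutBom)); // bool(true)
```

Stripping the mark before conversion keeps it out of the UTF-8 result, and the fallback parameter makes the "no BOM" assumption visible at the call site instead of hiding it in a platform default.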

That is everything I have to share about converting Unicode to UTF-8 in PHP. I hope it serves as a useful reference, and I hope you will continue to support PHP Chinese network.
