PHP and UTF-8

Source: Internet
Author: User

There is no one-line solution. It takes care, attention to detail, and consistency.

UTF-8 support in PHP is poor. Pardon the bluntness.

Currently, PHP does not support Unicode at a low level. There are ways to make sure UTF-8 strings are handled correctly, but it is not easy, and it requires digging into every layer of a web application, from HTML to SQL to PHP. We aim to provide a concise, practical overview.

PHP-Level UTF-8

Basic string operations, such as concatenating two strings or assigning a string to a variable, need nothing special for UTF-8. However, most string functions, such as strpos() and strlen(), do need special consideration. These functions often have an mb_* counterpart: for example, mb_strpos() and mb_strlen(). Together, these counterparts are called the multibyte string functions, and they are designed specifically to operate on Unicode strings.
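As a small illustration of the difference, the sketch below compares the byte-oriented functions with their multibyte counterparts. It assumes the mbstring extension is loaded and that the file is saved as UTF-8; the sample word and variable names are only for illustration.

```php
<?php
// Assumption: mbstring is loaded and this file is saved as UTF-8.
$word = 'žodis'; // "ž" occupies 2 bytes in UTF-8, the other letters 1 byte each

// Byte-oriented functions count bytes.
$byteLength = strlen($word);      // 6
$bytePos    = strpos($word, 'd'); // 3

// Multibyte functions count characters.
$charLength = mb_strlen($word, 'UTF-8');         // 5
$charPos    = mb_strpos($word, 'd', 0, 'UTF-8'); // 2
```

The two families agree only as long as the string contains nothing but single-byte (ASCII) characters.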

When you operate on Unicode strings, you must use the mb_* functions. For example, if you use substr() on a UTF-8 string, the result is likely to contain garbled half-characters, because substr() counts bytes rather than characters. The correct function is the multibyte counterpart, mb_substr().
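A minimal sketch of that failure, assuming mbstring is loaded and the file is saved as UTF-8. The sample string and variable names are illustrative; mb_check_encoding() is used here only to show whether each result is still valid UTF-8.

```php
<?php
// Assumption: mbstring is loaded and this file is saved as UTF-8.
$string = 'Aš galiu valgyti stiklą';

// substr() works on bytes: it keeps "A" plus only the first byte of the
// two-byte character "š", leaving a broken sequence at the end.
$broken = substr($string, 0, 2);

// mb_substr() works on characters and keeps "š" whole.
$intact = mb_substr($string, 0, 2, 'UTF-8'); // "Aš"

$brokenIsValid = mb_check_encoding($broken, 'UTF-8'); // false
$intactIsValid = mb_check_encoding($intact, 'UTF-8'); // true
```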

The hard part is remembering to use the mb_* functions every time. If you forget even once, your Unicode string may be garbled during later processing.

Not all string functions have an mb_* counterpart. If there is none for what you want to do, you are out of luck.
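In some of those cases a workaround can be built by hand. The sketch below is one possible approach, not a standard API: utf8_strrev() is a hypothetical helper name, used here on the assumption that strrev() has no multibyte counterpart. It relies on the PCRE "u" modifier to split the string between code points instead of bytes.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): reverse a UTF-8 string by code point.
 */
function utf8_strrev(string $text): string
{
    // With the "u" modifier the empty pattern matches between code points,
    // so each array element is one whole character, not one byte.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // preg_split() fails when the input is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    return implode('', array_reverse($chars));
}

$reversed = utf8_strrev('stiklą'); // "ąlkits"
```

Note that this reverses code points, so text containing combining marks would still need more careful handling.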

In addition, call mb_internal_encoding() at the top of every PHP script you write (or at the top of your global include script), and call mb_http_output() right after it if your script outputs to a browser. Explicitly defining the encoding of your strings in every script will save you a lot of headaches later.
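A minimal sketch of such a preamble, assuming mbstring is loaded. The idea of a shared bootstrap file is an assumption for illustration; the two calls themselves are the ones named above.

```php
<?php
// Hypothetical bootstrap file, included first by every script.
// Assumption: mbstring is loaded and this file is saved as UTF-8.

// Treat all strings as UTF-8 until the end of the script.
mb_internal_encoding('UTF-8');

// Declare UTF-8 as the encoding for output sent to the browser.
mb_http_output('UTF-8');

// With the internal encoding set, mb_* functions use UTF-8 when no
// encoding argument is passed.
$length = mb_strlen('ąčę'); // 3
```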

Finally, many PHP functions that operate on strings have an optional parameter that lets you specify the character encoding. Always state UTF-8 explicitly when given the option. For example, htmlentities() has a character encoding parameter, and you should always specify UTF-8 when dealing with such strings.
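For instance, the encoding is the third argument of htmlentities() and htmlspecialchars(). The sketch below passes it explicitly; the sample strings and variable names are illustrative, and the file is assumed to be saved as UTF-8.

```php
<?php
// Assumption: this file is saved as UTF-8.

// Pass the encoding explicitly instead of relying on configuration defaults.
$entities = htmlentities('café <b>', ENT_QUOTES, 'UTF-8');
// "caf&eacute; &lt;b&gt;"

// htmlspecialchars() takes the same third argument; it escapes only the
// HTML-special characters and leaves "š" and "ą" untouched.
$escaped = htmlspecialchars('Aš & "stiklą"', ENT_QUOTES, 'UTF-8');
// "Aš &amp; &quot;stiklą&quot;"
```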

MySQL-Level UTF-8

If your PHP script accesses MySQL, your strings may still be stored as non-UTF-8 strings in the database, even if you follow all of the precautions above.

Make sure that the strings going from PHP to MySQL are UTF-8 encoded, that your database and tables are all set to the utf8mb4 character set and collation, and that you execute the MySQL query 'SET NAMES utf8mb4' before running any other queries on the connection. This is critically important. For an example, see the section on connecting to and querying a MySQL database.
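The first and last of those steps can be sketched as follows. Both function names are hypothetical helpers, and the connection details passed to connect_utf8mb4() would be your own; the helper is only defined here, not called, since it needs a reachable MySQL server and the pdo_mysql driver.

```php
<?php
/**
 * Hypothetical guard: reject values that are not valid UTF-8 before they
 * are sent to the database. Assumes mbstring is loaded.
 */
function assert_utf8(string $value): string
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        throw new InvalidArgumentException('Value is not valid UTF-8.');
    }
    return $value;
}

/**
 * Hypothetical helper: open a PDO connection and issue SET NAMES utf8mb4
 * before any other query runs on it.
 */
function connect_utf8mb4(string $dsn, string $user, string $password): PDO
{
    $link = new PDO($dsn, $user, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    $link->exec('SET NAMES utf8mb4');
    return $link;
}

$checked = assert_utf8('Aš galiu valgyti stiklą'); // returned unchanged
```

The middle step, setting the database and tables to utf8mb4, happens in the schema itself and is not shown here.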

Note that you must use the 'utf8mb4' character set for complete UTF-8 support, not the 'utf8' character set! For the reasons, see Further reading.
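In short, MySQL's 'utf8' character set stores at most three bytes per character, while UTF-8 needs four bytes for code points above U+FFFF, such as most emoji. The sketch below shows how to spot such characters; needs_utf8mb4() is a hypothetical helper name, not a PHP or MySQL function.

```php
<?php
/**
 * Hypothetical check: does the text contain code points at or above
 * U+10000? These take 4 bytes in UTF-8 and do not fit in a 3-byte charset.
 */
function needs_utf8mb4(string $text): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

$emoji = "\u{1F600}";        // a code point outside the Basic Multilingual Plane
$emojiBytes = strlen($emoji); // 4

$letter = "\u{0161}";          // "š", inside the Basic Multilingual Plane
$letterBytes = strlen($letter); // 2

$emojiNeedsMb4  = needs_utf8mb4($emoji);  // true
$letterNeedsMb4 = needs_utf8mb4($letter); // false
```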

Browser-Level UTF-8

Use the mb_http_output() function to ensure that your PHP script outputs UTF-8 strings to the browser, and include the charset <meta> tag in the <head> of your HTML page.
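A minimal sketch of the output side, assuming mbstring is loaded. render_page() is a hypothetical helper, and the Content-Type header is an extra precaution added in this sketch rather than something the text above requires.

```php
<?php
/**
 * Hypothetical helper: wrap a text fragment in a page that declares UTF-8
 * in its <head>.
 */
function render_page(string $body): string
{
    // Escape the body with the encoding stated explicitly.
    $safeBody = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');

    return <<<HTML
<!doctype html>
<html>
    <head>
        <meta charset="UTF-8">
        <title>UTF-8 test page</title>
    </head>
    <body>{$safeBody}</body>
</html>

HTML;
}

// Declare UTF-8 as the output encoding.
mb_http_output('UTF-8');

// Assumption of this sketch: also announce the charset in the HTTP header,
// guarded so it is skipped if output has already started.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$html = render_page('Aš galiu valgyti stiklą');
echo $html;
```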

Example
<?php
// Tell PHP that we're using UTF-8 strings until the end of the script
mb_internal_encoding('UTF-8');

// Tell PHP that we'll be outputting UTF-8 to the browser
mb_http_output('UTF-8');

// Our UTF-8 test string
$string = 'Aš galiu valgyti stiklą ir jis manęs nežeidžia';

// Transform the string in some way with a multibyte function
$string = mb_substr($string, 0, 10);

// Connect to a database to store the transformed string
// See the PDO example in this document for more information
// Note the 'SET NAMES utf8mb4' command!
$link = new \PDO(
    'mysql:host=your-hostname;dbname=your-db',
    'your-username',
    'your-password',
    array(
        \PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION,
        \PDO::ATTR_PERSISTENT => false,
        \PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8mb4'
    )
);

// Store our transformed string as UTF-8 in our database
// This assumes our DB and tables use the utf8mb4 character set and collation
$handle = $link->prepare('insert into sentences (Id, Body) values (?, ?)');
$handle->bindValue(1, 1, \PDO::PARAM_INT);
$handle->bindValue(2, $string);
$handle->execute();

// Retrieve the string we just stored to prove it was stored correctly
$handle = $link->prepare('select * from sentences where Id = ?');
$handle->bindValue(1, 1, \PDO::PARAM_INT);
$handle->execute();

// Store the result into an object that we'll output later in our HTML
$result = $handle->fetchAll(\PDO::FETCH_OBJ);
?><!doctype html>

Further reading
    • PHP Manual: Multibyte String Functions
    • PHP UTF-8 Cheat Sheet
    • Stack Overflow: What factors make PHP Unicode-incompatible?
    • Stack Overflow: Best practices in PHP and MySQL with international strings
    • How to support full Unicode in MySQL databases
