Fixing garbled Chinese text in Smarty templates

Source: Internet
Author: User
Tags mysql client mysql code mysql in readfile dreamweaver smarty template ultraedit

Today I excitedly uploaded all my pages and scripts to the server to see the result of two weeks of work. When I opened the page I was dumbfounded: nothing but question marks. The text was garbled.

The debugging went through two stages. The first question was whether the server was at fault.

The server runs CentOS Linux and is set up much like a shared virtual host, with many sites stored in separate folders. The other sites' files there are encoded in GB2312, while my pages are UTF-8, so at first I suspected the server. I uploaded all the files to another host of mine, and everything displayed correctly. I immediately felt "enlightened": it must be a "server problem" ...

But the problem still had to be solved, because the site ultimately had to run on that server.

So I studied the garbled page carefully and found a pattern in the garbling: some Chinese text displayed correctly and some did not. Looking closer, everything that was garbled had passed through Smarty. My first instinct was that the problem lay with Smarty.

I had in fact run into garbled text before, and whenever it comes up the first thing to suspect is the encoding.

Inconsistent encoding is one of the main causes of garbled characters.

For example, take a web page file, open it in Notepad, choose Save As, and select "UTF-8" as the encoding. The meta tag in the page's head must declare the same encoding:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

The database must use a consistent encoding as well. Where the encodings differ, use SET NAMES utf8, or functions such as iconv, to reconcile them.

First, I added this line to the PHP file:

header("Content-Type: text/html; charset=gb2312");

But where should it go? That was a real question, because the output is rendered through Smarty template parsing. I added it in four different places, and none of them had any effect.

I re-saved all of Smarty's PHP files as UTF-8 (isn't that maddening?). No effect.

I saved index.php, index.tpl (the corresponding template file) and config.php (my configuration file) as UTF-8, as ANSI and as Unicode in turn and tested each. No effect.

I even tried the poor idea of changing the meta tag's charset to gb2312. The whole design was UTF-8 from the start, but perhaps I had been wrong. No effect.

Out of options, I opened config.php, saved it as ANSI for a test (no effect), then saved it as UTF-8 again, and the garbling was gone.

What a tragedy ...

My guess is that Linux and Windows differ somewhat in how they recognize a text file's encoding, and that this caused the garbling. My config file was UTF-8; saving it as ANSI and then as UTF-8 again fixed it. That suggests the first save had not produced clean UTF-8.

Appendix 1: I nearly went and downloaded UltraEdit32 just to save the files as UTF-8 without a BOM ...

Smarty templates use UTF-8 by default. A UTF-8 file can be saved with or without a BOM, and many editors add a BOM by default when saving, which is what garbles the Smarty template output. Firefox filters the BOM out automatically, but IE does not.

For now the best approach is to make sure files are always saved as UTF-8 without a BOM. A capable editor such as UltraEdit lets you set the save format to "UTF-8 no BOM".

The following explanation of the BOM is reprinted from the web:

The BOM (byte order mark) is a standard marker used by the UTF encoding schemes to identify the encoding. In UTF-16 it is FF FE (little-endian) or FE FF (big-endian); in UTF-8 it becomes EF BB BF. In UTF-8 the marker is optional, because UTF-8 has no byte-order issue, but it can serve as a signature for detecting whether a byte stream is UTF-8. Microsoft software performs this check; some other software does not and treats the bytes as ordinary characters.

Microsoft prepends the three bytes EF BB BF to the UTF-8 text files it writes. Notepad and other Windows programs use these three bytes to decide whether a text file is ANSI or UTF-8. This is a Microsoft convention, though; UTF-8 text files on other platforms usually carry no such marker.

In other words, a UTF-8 file may or may not have a BOM. How can you tell? There are three ways:

1. Open the file in UltraEdit-32, switch to hex edit mode, and check whether the file starts with EF BB BF.

2. Open it in Dreamweaver, view the page properties, and see whether "Include Unicode Signature (BOM)" is ticked.

3. Open it in Windows Notepad, choose "Save As", and look at the default encoding offered: UTF-8 or ANSI. If it is ANSI, the file has no BOM.
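The same check can be done in a script instead of an editor. The following is a minimal sketch, assuming PHP 7 or later; the helper names has_utf8_bom and file_has_utf8_bom are made up for illustration and are not part of PHP or Smarty.

```php
<?php
// Hypothetical helper: true when the data begins with the UTF-8 BOM (EF BB BF).
function has_utf8_bom(string $data): bool
{
    return strncmp($data, "\xEF\xBB\xBF", 3) === 0;
}

// Hypothetical helper: check a file by reading only its first three bytes.
function file_has_utf8_bom(string $path): bool
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable file: report "no BOM" rather than fail
    }
    $head = fread($handle, 3);
    fclose($handle);
    return $head !== false && has_utf8_bom($head);
}

var_dump(has_utf8_bom("\xEF\xBB\xBFhello")); // bool(true)
var_dump(has_utf8_bom("hello"));             // bool(false)
```

Running file_has_utf8_bom() over index.php, the template file and the config file would show which of them still carries the marker.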

Note that when converting a GB2312 file to UTF-8 with ConvertZ, the default setting is no BOM. Without a BOM, the garbling described above may appear in programs that rely on the marker to detect UTF-8. With a BOM, be careful with PHP include files: the bytes EF BB BF sit in front of the PHP byte stream and are sent to the browser ahead of everything else, which can cause program errors. One workaround is to save every included file as ANSI and keep only the main file as UTF-8. To remove a BOM from a file, open it in UltraEdit, switch to hex edit mode, replace the first three bytes (the dreaded EF BB BF) with 20, and save (turn off automatic backup when saving). Then switch back to normal edit mode and delete the three leading spaces.
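Removing the BOM does not have to go through a hex editor. As a rough sketch, assuming PHP 7 or later, a small function can drop the three bytes when they are present; strip_utf8_bom is a hypothetical name chosen for this example.

```php
<?php
// Hypothetical helper: drop a leading UTF-8 BOM and leave everything else untouched.
function strip_utf8_bom(string $data): string
{
    if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
        // The cast guards against older PHP versions returning false for an empty result.
        return (string) substr($data, 3);
    }
    return $data;
}

$withBom = "\xEF\xBB\xBFhello";
echo strlen($withBom), "\n";                 // 8
echo strlen(strip_utf8_bom($withBom)), "\n"; // 5
```

To clean a file in place, the result of strip_utf8_bom(file_get_contents($path)) can be written back with file_put_contents($path, ...), ideally after making a backup.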

Some further encoding background: a file saved as "Unicode" is actually UTF-16, whose code units match the Unicode code points for most characters. Conceptually, though, Unicode and UTF are different things: Unicode is a character set that assigns each character a number, while a UTF is a scheme for storing and transmitting those numbers. UTF-16 comes in two byte orders, little-endian (LE, low byte first) and big-endian (BE, high byte first). The official UTF encodings also include UTF-32, likewise in LE and BE variants. UTF-7, which is not an official Unicode encoding, is used mainly for mail transfer. The single-byte part of UTF-8 is compatible with ASCII. UTF-8 exists largely because older systems and library functions could not handle UTF-16 correctly; it also saves file space for English text, at the cost of extra space for non-English characters. ASCII characters take one byte in UTF-8 just as they do in ISO-8859-1, while other characters take two or more bytes (three for most Chinese characters).
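The byte counts and byte orders described above can be checked directly. This is a small sketch that assumes the mbstring extension is installed; the Chinese character is written as explicit bytes so the result does not depend on how the script file itself is saved.

```php
<?php
$ascii = "A";
$han   = "\xE4\xB8\xAD"; // U+4E2D as UTF-8 bytes

echo strlen($ascii), "\n"; // 1 (one byte, same as ASCII)
echo strlen($han), "\n";   // 3 (three bytes in UTF-8)

// The same character in UTF-16: identical two bytes, opposite order.
$le = mb_convert_encoding($han, 'UTF-16LE', 'UTF-8');
$be = mb_convert_encoding($han, 'UTF-16BE', 'UTF-8');
echo bin2hex($le), "\n"; // 2d4e (low byte first)
echo bin2hex($be), "\n"; // 4e2d (high byte first)
```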

Appendix 2: A more comprehensive roundup of common solutions

Garbled Chinese text is one of the most common problems in PHP development. Sometimes it originates in the web page itself, sometimes in the interaction with MySQL, and sometimes it is related to the operating system. Here is a summary.

One. Encoding of the PHP page

1. The encoding of the PHP file itself must match the encoding declared for the web page

A. To use GB2312, have PHP send the header: header("Content-Type: text/html; charset=gb2312"); and add the corresponding meta tag to static pages. Save all files as ANSI: open each in Notepad, choose Save As, select ANSI as the encoding, and overwrite the source file.

B. To use UTF-8, have PHP send the header: header("Content-Type: text/html; charset=utf-8"); and add the corresponding meta tag to static pages. Save all files as UTF-8. Saving as UTF-8 can be slightly troublesome: editors generally put a BOM at the start of a UTF-8 file, which causes problems when sessions are used. EditPlus can save without it: under Tools -> Preferences -> Files -> UTF-8 signature, choose "Always remove", and the BOM is stripped on save.

2. PHP strings are not Unicode-aware, so functions such as substr must be replaced with mb_substr (which requires the mbstring extension), or the text must be transcoded with iconv.
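The difference between substr and mb_substr is easiest to see on a short mixed string. The sketch below assumes the mbstring extension is available and that the text is UTF-8; the sample string is written as explicit bytes so the file's own encoding does not matter.

```php
<?php
// Two Chinese characters followed by "abc", as UTF-8 bytes.
$text = "\xE4\xB8\xAD\xE6\x96\x87abc";

echo strlen($text), "\n";             // 9 (bytes)
echo mb_strlen($text, 'UTF-8'), "\n"; // 5 (characters)

// substr counts bytes and cuts the first character in half.
$broken = substr($text, 0, 2);
// mb_substr counts characters and returns both Chinese characters intact.
$whole = mb_substr($text, 0, 2, 'UTF-8');

echo bin2hex($broken), "\n"; // e4b8
echo bin2hex($whole), "\n";  // e4b8ade69687
```

Passing the encoding explicitly avoids depending on the mbstring default, which may differ between servers.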

Two. Data exchange between PHP and MySQL

PHP and the database must use the same encoding

1. Edit the MySQL configuration file my.ini or my.cnf. UTF8 is the best choice for MySQL

[mysql]

default-character-set=utf8

[mysqld]

default-character-set=utf8

default-storage-engine=MyISAM

Also add under [mysqld]:

default-collation=utf8_bin

init_connect='SET NAMES utf8'

2. Before any database operation in the PHP program, call mysql_query("SET NAMES 'encoding'"), where the encoding matches the one used by the PHP pages: gb2312 if the pages are GB2312, utf8 if they are UTF-8. Data will then not be garbled when inserted or retrieved.

Three. PHP and the operating system

Windows and Linux do not encode file names the same way. On Windows, PHP functions such as move_uploaded_file(), filesize() and readfile() fail when their arguments are UTF-8 encoded. These functions are used heavily when handling uploads and downloads, and calls may produce errors like the following:

Warning: move_uploaded_file() [function.move-uploaded-file]: failed to open stream: Invalid argument.

Warning: move_uploaded_file() [function.move-uploaded-file]: Unable to move '...' to '...' in ...

Warning: filesize() [function.filesize]: stat failed for ... in ...

Warning: readfile() [function.readfile]: failed to open stream: Invalid argument in ...

On Linux with GB2312 encoding these errors do not appear, but the saved file name is garbled and the file cannot be read back. In either case, convert the arguments to an encoding the operating system recognizes, using mb_convert_encoding(string, new encoding, original encoding) or iconv(original encoding, new encoding, string). After this conversion the saved file name is no longer garbled and the file can be read normally, so files with Chinese names can be uploaded and downloaded.
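The conversion described above might look like the following sketch. It assumes the mbstring extension, a UTF-8 application, and a file system that expects GBK names; to_filesystem_encoding is a hypothetical helper name, and the target encoding would have to be chosen per server.

```php
<?php
// Hypothetical helper: convert a UTF-8 file name to the encoding the file system expects.
function to_filesystem_encoding(string $utf8Name, string $fsEncoding): string
{
    // Argument order: string, new encoding, original encoding.
    return mb_convert_encoding($utf8Name, $fsEncoding, 'UTF-8');
}

// Two Chinese characters plus ".txt", written as UTF-8 bytes.
$utf8Name = "\xE4\xB8\xAD\xE6\x96\x87.txt";
$gbkName  = to_filesystem_encoding($utf8Name, 'GBK');

echo bin2hex($utf8Name), "\n"; // e4b8ade696872e747874
echo bin2hex($gbkName), "\n";  // d6d0cec42e747874
```

The converted name is what would be handed to move_uploaded_file() or readfile(); iconv('UTF-8', 'GBK', $utf8Name) is the equivalent call with the argument order reversed.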

There is in fact a better solution that is completely independent of the system, so you never need to care what encoding it uses. Generate a sequence of letters and digits as the stored file name and keep the original Chinese name in the database. Calling move_uploaded_file() then causes no problems, and on download you simply rename the file back to the original Chinese name. The code that implements the download is as follows:

header("Pragma: public");

header("Expires: 0");

header("Cache-Control: must-revalidate, post-check=0, pre-check=0");

header("Content-Type: $file_type");

header("Content-Length: $file_size");

header("Content-Disposition: attachment; filename=\"$file_name\"");

header("Content-Transfer-Encoding: binary");

readfile($file_path);

$file_type is the file's type, $file_size is its size in bytes, $file_name is the original name, and $file_path is the path of the file saved on the server.
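The upload side of this scheme, generating a stored name made only of letters and digits, is not shown in the original. One possible sketch, assuming PHP 7 or later for random_bytes, is given here; make_stored_name is a hypothetical helper, and storing the original name in the database is left out.

```php
<?php
// Hypothetical helper: build an ASCII-only name for storage on disk.
// The original (possibly Chinese) name is meant to be kept in the database instead.
function make_stored_name(string $originalName): string
{
    // Keep the extension only if it is short and plain ASCII.
    $dot = strrpos($originalName, '.');
    $ext = $dot === false ? '' : strtolower((string) substr($originalName, $dot + 1));
    $ext = preg_match('/^[a-z0-9]{1,10}$/', $ext) === 1 ? '.' . $ext : '';

    // 16 random bytes give 32 hexadecimal characters.
    return bin2hex(random_bytes(16)) . $ext;
}

$stored = make_stored_name("\xE4\xB8\xAD\xE6\x96\x87.TXT");
echo $stored, "\n"; // 32 hex characters followed by ".txt"; the value differs on each run
```

Because the stored name never contains non-ASCII bytes, the file system's encoding no longer matters for move_uploaded_file() or readfile().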

Four. A recap of why text gets garbled

Generally there are two causes. The first is a wrong encoding (charset) setting, which makes the browser parse the page with the wrong encoding and fills the screen with unreadable gibberish. The second is a file being opened with the wrong encoding and then saved, for example a text file that was originally GB2312 being opened and saved as UTF-8. To fix these problems you first need to know which stages of development involve an encoding:

1, File encoding: the encoding in which the page file itself (.html, .php and so on) is saved. Notepad and Dreamweaver detect the file encoding automatically when opening a page, so problems are rare there. Zend Studio does not detect it; it always opens files with the one encoding set in its preferences. If you are not careful and open a file with the wrong encoding, then make a change and save, the file is garbled (I speak from painful experience).

2, Page declaration encoding: in the head of the HTML, a meta tag can tell the browser which encoding the page uses. Chinese websites currently use mainly GB2312 and UTF-8.

3, Database connection encoding: the encoding used to exchange data with the database during database operations. Do not confuse this with the database's own internal encoding. MySQL's internal default, for example, is latin1: MySQL stores data as latin1, and data sent to it in other encodings is converted to latin1.

Once you know where encodings come into web development, you also know where the garbling comes from: the three encodings above are set inconsistently. Since most encodings are compatible with ASCII, English characters are unaffected, and it is the Chinese text that suffers.
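The point that English survives a mismatch while Chinese does not can be reproduced in a few lines. This sketch assumes the mbstring extension; it takes UTF-8 bytes, deliberately reads them as GBK, and compares the result with the original, which is roughly what a browser does when the declared charset is wrong.

```php
<?php
$ascii   = "PHP";
$chinese = "\xE4\xB8\xAD\xE6\x96\x87"; // two Chinese characters as UTF-8 bytes

// Misread the UTF-8 bytes as GBK, then convert to UTF-8 for display.
$asciiSeen   = mb_convert_encoding($ascii, 'UTF-8', 'GBK');
$chineseSeen = mb_convert_encoding($chinese, 'UTF-8', 'GBK');

var_dump($asciiSeen === $ascii);     // bool(true): ASCII bytes mean the same in both
var_dump($chineseSeen === $chinese); // bool(false): the Chinese text is garbled
```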

Five. Common error cases and their solutions:

1, The database uses UTF8 and the page declares GB2312. This is the most common cause of garbling. Data selected directly in the PHP script comes out garbled, so before querying you need to call mysql_query("SET NAMES GBK"); to set the MySQL connection encoding, making sure the page's declared encoding matches the connection encoding set here (GBK is an extension of GB2312). If the page is UTF-8, use: mysql_query("SET NAMES utf8");

Note that it is utf8, not the usual UTF-8. If the page's declared encoding already matches the database's internal encoding, the connection encoding need not be set.
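Since the page charset label and the MySQL name differ in spelling, a small lookup can keep the two in step. The sketch below is only an illustration: set_names_charset is a hypothetical helper, the table covers just the encodings discussed here, and GB2312 pages are mapped to gbk as the text suggests. It builds the statement but does not connect to a database.

```php
<?php
// Hypothetical helper: map a page charset label to the name used in SET NAMES.
function set_names_charset(string $pageCharset): string
{
    $map = [
        'utf-8'  => 'utf8', // MySQL spells it without the hyphen
        'utf8'   => 'utf8',
        'gb2312' => 'gbk',  // GBK is an extension of GB2312
        'gbk'    => 'gbk',
    ];
    $key = strtolower(trim($pageCharset));
    if (!isset($map[$key])) {
        throw new InvalidArgumentException("Unsupported charset: $pageCharset");
    }
    return $map[$key];
}

echo 'SET NAMES ' . set_names_charset('UTF-8'), "\n";  // SET NAMES utf8
echo 'SET NAMES ' . set_names_charset('GB2312'), "\n"; // SET NAMES gbk
```

The resulting string is what would be passed to mysql_query() in the examples above. Note that the mysql_* functions were removed in PHP 7; with the mysqli extension, the set_charset() method serves the same purpose.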

Note: MySQL's data input and output is actually more complex than described above. The MySQL configuration file my.ini defines two default encodings, default-character-set under [client] and default-character-set under [mysqld], which set the default client connection encoding and the encoding used inside the database respectively. The encoding specified above is in effect the character_set_client parameter of the client's connection to the MySQL server: it tells the server which encoding the data it receives from the client is in, instead of the default.

2, The page's declared encoding does not match the encoding of the file itself. This rarely goes unnoticed during development, because the mismatch shows up as garbled text in the browser as soon as the page is built. More often it happens after release, when someone opens the page with the wrong encoding to fix a small bug and then saves it. It can also happen when files are edited directly online with FTP software such as CuteFTP whose encoding settings are wrong, so the file is converted to the wrong encoding.

3, Some people on rented virtual hosts have all three encodings above set correctly and still get garbled text. For example, the page is GB2312, yet browsers such as IE always detect it as UTF-8 even though the page head declares GB2312; the page displays correctly once the browser's encoding is manually switched to GB2312. The cause is that Apache sets a server-wide default encoding: httpd.conf contains AddDefaultCharset UTF-8. The server then sends that charset in the HTTP headers, which take priority over the declaration in the page, so the browser naturally picks the wrong encoding. There are two solutions: ask the administrator to add AddDefaultCharset GB2312 to your own virtual host's section of the configuration file to override the global setting, or configure it in the .htaccess file of your own directory.

Summary: the best way to avoid garbled Chinese in PHP is to keep the page's declared encoding and the database's internal encoding the same. If they differ, set the connection encoding with mysql_query("SET NAMES XXX"); where XXX is the connection encoding. That resolves the garbling.
