Using PHP's mbstring function library to handle Chinese characters on Windows
Yesterday I wanted to batch-process a pile of files I had downloaded earlier: pull out the key content of each file with a regular expression and handle it all in one place. The first obstacle when manipulating files on Windows is character encoding.
As we know, on Windows (the Chinese version, of course) file names and file contents are typically encoded in GBK, while the IDE we develop in saves code as UTF-8. (I will not discuss why here, only how to bring the two to the same encoding.)
As a result, the Chinese characters in the UTF-8 encoded regular expression pattern I wrote could not match anything in the GBK encoded files.
At first I had no better idea than to save the PHP script file itself as GBK. That works, but it felt like a crude workaround, so I went looking for a PHP function that would meet my needs.
At this point I thought of iconv(), which I had used before to handle file names on Windows. Its prototype is:
string iconv(string $in_charset, string $out_charset, string $str)
We often use it like this:
$out_charset = 'UTF-8';
$fileName = iconv('GBK', $out_charset, $fileName);
This converts the encoding of the file name from GBK to UTF-8; the text itself is unchanged.
Notes from the manual:
If you append //TRANSLIT to $out_charset (that is, $out_charset = 'UTF-8//TRANSLIT'), a character that cannot be represented in the target charset is replaced by one or more similar-looking characters.
If you append //IGNORE to $out_charset (that is, $out_charset = 'UTF-8//IGNORE'), a character that cannot be represented in the target charset is silently skipped.
If you append nothing, conversion stops at the first character that cannot be converted and PHP raises an E_NOTICE.
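The call and the suffixes can be sketched as follows. This is only a minimal sketch: it assumes the iconv extension is loaded and that the script file itself is saved as UTF-8. The helper name gbkToUtf8 and the sample strings are made up for illustration, and the exact effect of //TRANSLIT and //IGNORE depends on the iconv library PHP was built against, so no output is shown for them.

```php
<?php
// Hypothetical helper: convert a GBK-encoded string (for example a Windows
// file name) to UTF-8.
function gbkToUtf8(string $gbk): string
{
    // Argument order: source charset, target charset, string to convert.
    // "@" hides the E_NOTICE so the failure is handled in one place below.
    $utf8 = @iconv('GBK', 'UTF-8', $gbk);
    if ($utf8 === false) {
        throw new RuntimeException('Input could not be converted from GBK');
    }
    return $utf8;
}

// Build GBK sample bytes from a UTF-8 literal; on Windows these bytes would
// come from scandir() or similar instead.
$gbkName = iconv('UTF-8', 'GBK', '中文文件.txt');
echo gbkToUtf8($gbkName), PHP_EOL;

// The suffixes are appended to the target charset.
$approx  = @iconv('UTF-8', 'ASCII//TRANSLIT', 'café'); // approximate unmappable characters
$dropped = @iconv('UTF-8', 'ASCII//IGNORE', 'café');   // drop unmappable characters
```

Checking the return value for false matters here, because a failed conversion otherwise looks like a missing or shortened string.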
But when I tried this function on file contents, this is what I found:
iconv() converted only about the first 64 characters. That is enough for a typical file name, but my file contents are clearly longer than 64 characters. (iconv() has no documented length limit; output that stops early usually means it reached a character it could not convert.)
Having no fix for this at hand, I had to start looking for another function.
Eventually I found the mbstring function library. It is bundled with most PHP installations, and you can check whether it is enabled in the output of phpinfo().
mbstring provides the mb_convert_encoding() function, which changes the encoding of a string. Its prototype is:
string mb_convert_encoding(string $str, string $to_encoding [, mixed $from_encoding])
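The prototype is similar to iconv(), with three differences: the parameter order is reversed (string first, then target encoding, then source encoding), the target encoding takes no //TRANSLIT or //IGNORE suffix, and I ran into no limit on the length of the string.
Note also that $from_encoding is optional. If it is omitted, mbstring's internal encoding is assumed as the source. It can also be a comma-separated list of candidate encodings, or "auto", in which case mbstring tries to detect the source encoding. Passing the source encoding explicitly is the safest choice.
I could not find a character that fails to convert, so I did not verify how such characters are handled. (The replacement for them is controlled by mb_substitute_character().)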
Running the whole file content through mb_convert_encoding() solved the problem smoothly.
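Here is a minimal sketch of that approach, assuming the mbstring extension is enabled and the script file is saved as UTF-8. The helper name readGbkAsUtf8, the sample text and the pattern are all invented for illustration; in real use the GBK bytes would come from file_get_contents().

```php
<?php
// Hypothetical helper: convert GBK file content to UTF-8 so that patterns
// written in a UTF-8 script can match it.
function readGbkAsUtf8(string $gbkContent): string
{
    // Name the source encoding explicitly instead of relying on detection.
    return mb_convert_encoding($gbkContent, 'UTF-8', 'GBK');
}

// Simulate the content of a GBK-encoded file.
$gbkContent = mb_convert_encoding("标题:编码问题\n正文", 'GBK', 'UTF-8');

$utf8Content = readGbkAsUtf8($gbkContent);

// The /u modifier tells PCRE to treat pattern and subject as UTF-8.
$title = null;
if (preg_match('/标题:(.+)/u', $utf8Content, $m)) {
    $title = $m[1];
    echo $title, PHP_EOL; // prints 编码问题
}
```

Converting once right after reading the file keeps every later string operation in a single encoding, which is simpler than converting each pattern to GBK.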
One last note: mbstring stands for "multibyte string". Many of its functions mirror PHP's built-in string functions, with "mb_" added in front of the original function name. Besides doing what the original function does, each accepts an optional $encoding parameter at the end of its parameter list, which specifies the encoding the function should assume when handling the string.
For example, strpos() finds the position of one string inside another.
Take a string $str in which four Chinese characters precede the character being searched for, $needle. strpos($str, $needle, 0) returns 12: the script is UTF-8 encoded, strpos() counts bytes, and each Chinese character occupies 3 bytes in UTF-8.
mb_strpos($str, $needle, 0, 'UTF-8') returns 4, because it decodes the string as UTF-8 and counts characters.
mb_strpos($str, $needle, 0, 'GBK') returns 6, because the same 12 bytes are then read as six 2-byte GBK characters, which is the wrong answer for UTF-8 data.
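The byte-versus-character difference can be reproduced with a short script. The sample string below is my own choice for illustration (four Chinese characters before the one being searched for), and the sketch assumes the script file is saved as UTF-8 and mbstring is enabled.

```php
<?php
// Illustrative sample: four characters precede "问".
$str    = '中文编码问题';
$needle = '问';

$bytePos = strpos($str, $needle);                // offset counted in bytes
$charPos = mb_strpos($str, $needle, 0, 'UTF-8'); // offset counted in characters

echo $bytePos, PHP_EOL; // 12 (4 characters x 3 bytes each)
echo $charPos, PHP_EOL; // 4
```

The offsets are not interchangeable: a value from mb_strpos() belongs with mb_substr(), and a value from strpos() belongs with substr().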
Of course, mbstring has many more features than these.
Enabling the PHP mbstring extension on Windows
A few days ago I ran a PHP program that needed to convert character encodings, but a probe of the server reported that the mbstring extension was not supported. I checked, and the PHP extension directory did contain the file php_mbstring.dll.
Here is how to enable it.
1. Make sure php_mbstring.dll exists under windows/system32. If it does not, copy it there from the extensions directory of your PHP installation.
2. Open php.ini in the Windows directory for editing, search for mbstring.dll, and find the line
;extension=php_mbstring.dll
Then remove the leading semicolon to enable the extension.
3. Restart the PHP service (if that does not work, restart the computer).
4. Done.
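After the restart, a quick check like the one below can confirm that the extension was actually loaded. This is a small sketch; the function name hasMbstring is made up, while extension_loaded(), function_exists() and php_ini_loaded_file() are standard PHP functions.

```php
<?php
// Hypothetical helper: true when mbstring is available in the running PHP.
function hasMbstring(): bool
{
    return extension_loaded('mbstring') && function_exists('mb_convert_encoding');
}

echo hasMbstring() ? "mbstring is enabled\n" : "mbstring is NOT enabled\n";

// Shows which php.ini was actually read (false if none was loaded), which
// helps when the edited file is not the one PHP uses.
$ini = php_ini_loaded_file();
echo 'Loaded php.ini: ', $ini === false ? '(none)' : $ini, PHP_EOL;
```

If the extension is still reported as missing, the loaded php.ini path printed above is the first thing to compare against the file that was edited.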