// File saved in GBK encoding
$s = '中文测试';
echo mb_strlen($s, 'utf-8');
echo strlen(iconv('gbk', 'utf-8', $s));
1. Why are the two values not equal?
2. Does strlen count the number of bytes in a string or the number of characters?
3. What does mb_strlen count?
Reply content:
$s contains four wide characters. Because your file is saved as GBK, that is 8 bytes. Converted to UTF-8, the same text takes 12 bytes.
strlen() returns the number of bytes the string occupies.
mb_strlen() returns the actual number of characters: each wide character counts as a length of 1.
So the following code:
// File saved in GBK encoding
$s = '中文测试';
$s_u8 = iconv('gbk', 'utf-8', $s);
var_dump(strlen($s), strlen($s_u8), mb_strlen($s, 'gbk'), mb_strlen($s_u8, 'utf-8'));
prints 8, 12, 4, 4.
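The same result can be reproduced without depending on how the PHP file itself is saved. The following is a minimal sketch, assuming the iconv and mbstring extensions are loaded; the GBK bytes of '中文测试' are written as hex escapes, and the `$lengths` array is only an illustrative name.

```php
<?php
// "中文测试" as raw GBK bytes (2 bytes per character), written as hex
// escapes so the editor's file encoding cannot change them.
$s = "\xD6\xD0\xCE\xC4\xB2\xE2\xCA\xD4";

// The same text converted to UTF-8 (3 bytes per character).
$s_u8 = iconv('GBK', 'UTF-8', $s);

$lengths = [
    strlen($s),                // bytes in the GBK string
    strlen($s_u8),             // bytes in the UTF-8 string
    mb_strlen($s, 'GBK'),      // characters, decoded as GBK
    mb_strlen($s_u8, 'UTF-8'), // characters, decoded as UTF-8
];

var_dump($lengths); // 8, 12, 4, 4
```

The byte counts differ because the two encodings store each character in a different number of bytes; the character counts agree as long as each string is paired with its real encoding.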
The second statement in your code looks wrong, yet the wrong call happens to give the same result. I don't know why.
1. The original poster's use of mb_strlen() has a small problem. mb_strlen counts characters according to the encoding you tell it the string is in.
Did the original poster mean to use the following code:
The length is 4 in both cases, so first make sure the encoding passed to mb_strlen is correct.
2. strlen() only counts bytes.
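One way to make sure the encoding passed to mb_strlen matches the bytes is to validate the string first. This is only a sketch, assuming the mbstring extension is available; `countCharsChecked` is a hypothetical helper, not a built-in function.

```php
<?php
// Hypothetical helper: count characters only when the bytes are valid
// in the claimed encoding; return null when they are not.
function countCharsChecked(string $bytes, string $encoding): ?int
{
    if (!mb_check_encoding($bytes, $encoding)) {
        return null; // the bytes do not match the claimed encoding
    }
    return mb_strlen($bytes, $encoding);
}

// "中文测试" as GBK bytes and as UTF-8 bytes, written as hex escapes.
$gbk  = "\xD6\xD0\xCE\xC4\xB2\xE2\xCA\xD4";
$utf8 = "\xE4\xB8\xAD\xE6\x96\x87\xE6\xB5\x8B\xE8\xAF\x95";

var_dump(countCharsChecked($gbk, 'GBK'));    // int(4)
var_dump(countCharsChecked($utf8, 'UTF-8')); // int(4)
var_dump(countCharsChecked($gbk, 'UTF-8'));  // NULL: GBK bytes are not valid UTF-8
```

Returning null instead of a number makes a mismatched encoding visible rather than letting it produce a count that only looks plausible.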
1. If you print $s, you will see that it is garbled after iconv runs on the third line of the program, because iconv hit an error: "detected an illegal character in input". Changing the encoding of the PHP file to ANSI avoids this error.
2. When strlen counts, each Chinese character in UTF-8 has a length of 3.
3. When mb_strlen counts with UTF-8 selected as the internal encoding, each Chinese character has a length of 1.
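Since iconv can fail when the input bytes are not valid in the source encoding, it may help to check its return value before measuring the result. The sketch below assumes the iconv extension is loaded; `gbkToUtf8OrNull` is a hypothetical helper, and the truncated input is a made-up example of bad bytes.

```php
<?php
// Hypothetical helper: convert GBK bytes to UTF-8, or return null when
// iconv() signals failure by returning false.
function gbkToUtf8OrNull(string $bytes): ?string
{
    // @ silences the notice; the failure is handled via the return value.
    $converted = @iconv('GBK', 'UTF-8', $bytes);
    return $converted === false ? null : $converted;
}

$valid  = "\xD6\xD0\xCE\xC4\xB2\xE2\xCA\xD4"; // "中文测试" as GBK bytes
$broken = "\xD6\xD0\xCE";                     // cut off in the middle of a character

$ok  = gbkToUtf8OrNull($valid);
$bad = gbkToUtf8OrNull($broken);

var_dump($ok === null ? null : strlen($ok)); // byte length of the UTF-8 result
var_dump($bad);                              // result for the truncated input
```

Calling strlen or mb_strlen directly on a failed conversion would measure an empty or garbled value, so the check belongs before any length calculation.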