I used to think that using UTF-8 uniformly from top to bottom would spare me any encoding worries, but today I still ran into a character exception while capturing Sina Weibo data.
An exception is thrown when the data captured from Sina Weibo is stored in the database:
Incorrect string value: '\xF0\x90\x8D\x83\xF0\x90...'
It turned out that the characters causing the exception were not traditional Chinese characters but what looked like some kind of Buddhist scripture script... But in principle UTF-8 should be able to handle them. Isn't it supposed to cover everything?
The problem turned out to lie in MySQL. When the character set is utf8, MySQL supports only UTF-8 sequences of up to 3 bytes, yet 4-byte UTF-8 characters do exist. So if the table was created with the utf8 character set, this failure is to be expected.
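To make the 3-byte limit concrete, here is a minimal Python sketch that decodes the first four bytes from the error message and checks how many bytes the character needs. The helper name four_byte_chars and the sample string are made up for illustration; the check relies on the fact that any code point above U+FFFF takes 4 bytes in UTF-8.

```python
def four_byte_chars(text):
    """Return the characters in `text` that need 4 bytes in UTF-8.

    Code points above U+FFFF are exactly the ones encoded as 4 bytes,
    which is what a MySQL utf8 (3-byte) column cannot store.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


# The first four bytes from the error message, decoded as UTF-8.
sample = b"\xf0\x90\x8d\x83".decode("utf-8")

print(hex(ord(sample)))             # 0x10343
print(len(sample.encode("utf-8")))  # 4

# Hypothetical mixed input: only the 4-byte character is reported.
print(four_byte_chars("abc" + sample + "中文") == [sample])  # True
```

A check like this could be run over captured text before inserting it, to see which rows would be rejected by a utf8 column.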
The solution is simple: change the character set of the affected column or table to utf8mb4.
However, utf8mb4 is supported only in MySQL 5.5.3 and later.
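The character set change described above could look roughly like the statements built in this sketch. The table name weibo_status, the column name content, the column type, and the utf8mb4_unicode_ci collation are all assumptions for illustration, not details from the original setup. The helpers only build the SQL text; running it against a real server is left out.

```python
def convert_table_sql(table):
    """Build a statement that converts all text columns of a table to utf8mb4."""
    return (
        f"ALTER TABLE `{table}` "
        "CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci"
    )


def convert_column_sql(table, column, column_type):
    """Build a statement that changes a single column to utf8mb4."""
    return (
        f"ALTER TABLE `{table}` MODIFY `{column}` {column_type} "
        "CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci"
    )


# Hypothetical table and column names.
print(convert_table_sql("weibo_status"))
print(convert_column_sql("weibo_status", "content", "TEXT"))
```

Note that the client connection also has to use utf8mb4, otherwise 4-byte characters can still be rejected or mangled on their way to the server.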