Issues with transcoding

Source: Internet
Author: User
mb_convert_encoding(addslashes($u[+]), 'UTF-8', 'iso-8859-15,shift-jis,eucjp-win,sjis-win,iso-8859-1,utf-8')


With this call, only the first encoding in the list takes effect. If the first one is iso-8859-15, the Western European text is converted correctly, but the Japanese text is not.
If the first ones are eucjp-win and sjis-win, the Japanese text is converted correctly, but the Western European text is not.

How should I deal with this situation?


Reply to discussion (solution)

The byte ranges of the character sets in your list overlap, so they cannot uniquely identify the encoding.
For this reason, the mbstring developers provide the mb_check_encoding function so that you can test each encoding individually.

Do not assume the mbstring module is that intelligent.
Without semantic analysis, I don't think a computer can tell whether a string is ISO-8859-1 or Shift-JIS.

I see why now. Every byte is valid in both the Japanese and the Western European encodings, so the first encoding in the list is always accepted.
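A minimal sketch of that overlap, assuming the mbstring extension is loaded and the script itself is saved as UTF-8. The sample word and variable names are illustrative and not taken from the thread. ISO-8859-1 assigns a character to every byte value, so a validity check against it passes even for Shift-JIS input, and converting from the wrong encoding produces garbled text rather than an error.

```php
<?php
// Shift-JIS bytes for a short katakana word (illustrative sample).
$sjisBytes = mb_convert_encoding('ネジ', 'SJIS', 'UTF-8');

// ISO-8859-1 maps every byte value to a character, so this check
// passes even though the bytes are really Shift-JIS.
$acceptedAsLatin1 = mb_check_encoding($sjisBytes, 'ISO-8859-1');

// Converting from the wrong source encoding does not fail; it silently
// yields garbled text.
$mojibake = mb_convert_encoding($sjisBytes, 'UTF-8', 'ISO-8859-1');

// Converting from the real source encoding restores the original text.
$correct = mb_convert_encoding($sjisBytes, 'UTF-8', 'SJIS');

var_dump($acceptedAsLatin1, $mojibake === 'ネジ', $correct === 'ネジ');
```

This is why a single-byte encoding placed early in the list can shadow every encoding that follows it.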

How would you handle a situation like mine?

The following is the Western European sample:
Urbanizaci? Camino de Vi?les Calle Rio Aragon N9 Pinseque

The following is the Japanese sample:
"???????? Iphone5/4s/4??? ?? ?????????? ?????? ??????? "

The main problem is that Western European text containing garbled characters gets classified as Japanese. Checking it character by character would take far too long, wouldn't it?

Suppose I told you that
Urbanizaci? Camino de Vi?les Calle Rio Aragon N9 Pinseque
is GBK-encoded. Would you accept that claim?

$s = "???????? Iphone5/4s/4??? ?? ?????????? ?????? ??????? ";
// Print the detected encoding, then the string converted from SJIS to UTF-8.
echo mb_detect_encoding($s, "ascii,jis,utf-8,euc-jp,sjis"), mb_convert_encoding($s, 'utf-8', 'SJIS'), PHP_EOL;
// Test the same bytes against each candidate encoding individually.
$d = explode(',', 'SHIFT-JIS,EUCJP-WIN,SJIS-WIN,UTF-8,JIS,SJIS,EUC-JP,GBK');
foreach ($d as $t) var_dump($t, mb_check_encoding($s, $t));
SJISセカンドショップIPHONE5/4S/4 with repair decomposition tool star type ドライバ? スクレイパ? ネジ Pedestal Cup
string(9) "SHIFT-JIS"
bool(true)
string(9) "EUCJP-WIN"
bool(false)
string(8) "SJIS-WIN"
bool(true)
string(5) "UTF-8"
bool(false)
string(3) "JIS"
bool(false)
string(4) "SJIS"
bool(true)
string(6) "EUC-JP"
bool(false)
string(3) "GBK"
bool(true)

Let me get this over with, haha.

 
  

ISO-8859-1Urbanización Camino de Viñales Calle Rio Aragon N9 Pinseque
string(9) "SHIFT-JIS"
bool(false)
string(9) "EUCJP-WIN"
bool(false)
string(8) "SJIS-WIN"
bool(true)
string(5) "UTF-8"
bool(false)
string(3) "JIS"
bool(false)
string(4) "SJIS"
bool(false)
string(6) "EUC-JP"
bool(false)
string(3) "GBK"
bool(true)
string(6) "EUC-KR"
bool(true)
string(10) "ISO-8859-1"
bool(true)

I don't mean to discourage the original poster; I just want to make a point. When you write a program, be thorough: do what can be done, and make it as error-free as possible. After all, there are people who rely on you.

You can't handle this carelessly. Without a BOM, it is extremely difficult to identify the character set of a string.
Unique identification is possible only when the string contains byte sequences that distinguish one charset from the others, and the string must be long enough.
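One way to make that ambiguity visible is to collect every candidate encoding in which the bytes are valid, instead of stopping at the first match. The sketch below assumes the mbstring extension and PHP 7.4 or later; `plausibleEncodings` is a hypothetical helper name, and the candidate list and sample word are chosen only for illustration.

```php
<?php
/**
 * Return every candidate encoding in which $bytes is a valid byte sequence.
 * Hypothetical helper, not part of mbstring.
 */
function plausibleEncodings(string $bytes, array $candidates): array
{
    return array_values(array_filter(
        $candidates,
        fn(string $enc): bool => mb_check_encoding($bytes, $enc)
    ));
}

// Illustrative candidate list; adjust it to the encodings you actually expect.
$candidates = ['UTF-8', 'SJIS-win', 'EUCJP-win', 'ISO-8859-1'];

// ISO-8859-1 bytes for a Spanish word; \xF1 is "ñ" in that encoding.
$latin1 = "Vi\xF1ales";

$matches = plausibleEncodings($latin1, $candidates);

// Only a single surviving candidate counts as an identification.
if (count($matches) === 1) {
    echo "unique: {$matches[0]}", PHP_EOL;
} else {
    echo 'ambiguous: ', implode(', ', $matches), PHP_EOL;
}
```

When more than one candidate survives, the bytes alone cannot settle the question, and some outside information about the data source is needed.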

Thank you to the two experts above.

I should explain that this is data we have to process: raw data that we download from websites in various countries.

Judging from your code, I think it's impossible to identify exactly which encoding our data uses with just one or two functions.

So I'm now using a workaround: we have a country field, and each string is converted according to its country.
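A rough sketch of such a country-based conversion, assuming the mbstring extension. The country codes, the encoding chosen for each, and the names `COUNTRY_ENCODINGS` and `toUtf8ByCountry` are all hypothetical; the real mapping has to come from inspecting each source.

```php
<?php
// Hypothetical country-to-encoding map; the real values depend on the sources.
const COUNTRY_ENCODINGS = [
    'JP' => 'SJIS-win',
    'ES' => 'ISO-8859-15',
];

/**
 * Convert $bytes to UTF-8 using the encoding configured for $country.
 */
function toUtf8ByCountry(string $bytes, string $country): string
{
    if (!array_key_exists($country, COUNTRY_ENCODINGS)) {
        throw new InvalidArgumentException("No encoding configured for country: $country");
    }
    $encoding = COUNTRY_ENCODINGS[$country];

    // Reject bytes that are not even valid in the expected encoding.
    if (!mb_check_encoding($bytes, $encoding)) {
        throw new UnexpectedValueException("Bytes are not valid $encoding");
    }
    return mb_convert_encoding($bytes, 'UTF-8', $encoding);
}

// \xF1 is "ñ" in ISO-8859-15.
echo toUtf8ByCountry("Vi\xF1ales", 'ES'), PHP_EOL;
```

The validity check catches only gross mismatches. As the thread shows, many byte strings are valid in several encodings, so the country field remains the deciding information.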
