Using PHP to convert between GB2312 and Unicode

Source: Internet
Author: User

Encoding conversion between GB2312 and Unicode
The following example converts GB2312 text into decimal numeric character references of the form &#NNNNN;.
Since PHP 4.3.1 the iconv function has made this easy: you only need to write a UTF-8 to Unicode conversion function yourself.
Using a lookup table (gb2312.txt) also works.

The code is as follows:

<?php
// Sample text; in practice this must be a GB2312-encoded string.
$text = "cloud-dwelling community";
// Split into characters: an optional high byte (0x80-0xff) followed by one more byte.
preg_match_all("/[\x80-\xff]?./", $text, $ar);
foreach ($ar[0] as $v)
    echo "&#" . utf8_unicode(iconv("GB2312", "UTF-8", $v)) . ";";
?>
<?php
// UTF-8 -> Unicode code point
function utf8_unicode($c) {
    switch (strlen($c)) {
        case 1:
            return ord($c);
        case 2:
            $n = (ord($c[0]) & 0x3f) << 6;
            $n += ord($c[1]) & 0x3f;
            return $n;
        case 3:
            $n = (ord($c[0]) & 0x1f) << 12;
            $n += (ord($c[1]) & 0x3f) << 6;
            $n += ord($c[2]) & 0x3f;
            return $n;
        case 4:
            $n = (ord($c[0]) & 0x0f) << 18;
            $n += (ord($c[1]) & 0x3f) << 12;
            $n += (ord($c[2]) & 0x3f) << 6;
            $n += ord($c[3]) & 0x3f;
            return $n;
    }
}
?>
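
To make the shift-and-mask arithmetic concrete, here is a minimal worked sketch for a single three-byte sequence. The helper name decode_utf8_3byte is hypothetical and not part of the article's code. It uses the conventional 0x0F mask for the lead byte; the article's wider 0x1f mask gives the same result for valid input, because the extra bit is always 0 in a three-byte lead byte (1110xxxx).

```php
<?php
// Hypothetical helper (not from the article): decode one 3-byte UTF-8
// sequence into its Unicode code point using shifts and masks.
function decode_utf8_3byte(string $bytes): int
{
    $n  = (ord($bytes[0]) & 0x0F) << 12; // 4 payload bits from the lead byte
    $n += (ord($bytes[1]) & 0x3F) << 6;  // 6 payload bits from the first continuation byte
    $n +=  ord($bytes[2]) & 0x3F;        // 6 payload bits from the second continuation byte
    return $n;
}

// "\xE5\x85\xA8" is the UTF-8 encoding of U+5168:
// (5 << 12) + (5 << 6) + 40 = 20480 + 320 + 40 = 20840
echo "&#" . decode_utf8_3byte("\xE5\x85\xA8") . ";"; // prints &#20840;
```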

The following example uses PHP to convert &#NNNNN; numeric character references back to GB2312.
The code is as follows:

<?php
// "TTL all-weather automatic focus", with the Chinese part written as decimal references.
$str = "TTL&#20840;&#22825;&#20505;&#33258;&#21160;&#32858;&#28966;";
// Use a callback rather than eval(): evaluating a string built from the
// input would run any PHP code embedded in it.
$str = preg_replace_callback("|&#([0-9]{1,5});|", function ($m) {
    return u2utf82gb((int) $m[1]);
}, $str);
echo $str;
// Unicode code point -> UTF-8 bytes -> GB2312
function u2utf82gb($c) {
    $str = "";
    if ($c < 0x80) {
        $str .= chr($c);
    } else if ($c < 0x800) {
        $str .= chr(0xC0 | ($c >> 6));
        $str .= chr(0x80 | ($c & 0x3F));
    } else if ($c < 0x10000) {
        $str .= chr(0xE0 | ($c >> 12));
        $str .= chr(0x80 | (($c >> 6) & 0x3F));
        $str .= chr(0x80 | ($c & 0x3F));
    } else if ($c < 0x200000) {
        $str .= chr(0xF0 | ($c >> 18));
        $str .= chr(0x80 | (($c >> 12) & 0x3F));
        $str .= chr(0x80 | (($c >> 6) & 0x3F));
        $str .= chr(0x80 | ($c & 0x3F));
    }
    return iconv('UTF-8', 'GB2312', $str);
}
?>
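
On current PHP versions the reference-decoding step can probably be left to a built-in. The sketch below is an alternative under that assumption, not the article's method: html_entity_decode turns numeric references into UTF-8, and iconv then converts to GB2312. The function name refs_to_gb2312 is hypothetical. Note that html_entity_decode also decodes named entities such as &amp;amp;, which the hand-written version above leaves alone.

```php
<?php
// Hypothetical helper (not from the article): decode character references
// with a built-in, then convert the resulting UTF-8 text to GB2312.
function refs_to_gb2312($s)
{
    $utf8 = html_entity_decode($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // iconv() returns false if a character has no GB2312 equivalent.
    return iconv('UTF-8', 'GB2312', $utf8);
}

// The intermediate UTF-8 bytes for &#20840; (U+5168):
echo bin2hex(html_entity_decode('&#20840;', ENT_QUOTES | ENT_HTML5, 'UTF-8')); // prints e585a8
```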

Alternatively:
The code is as follows:

function unescape($str) {
    $str = rawurldecode($str);
    // The U (ungreedy) modifier makes the final ".+" match one character at a time.
    preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U", $str, $r);
    $ar = $r[0];
    // print_r($ar); // debug: show the matched pieces
    foreach ($ar as $k => $v) {
        // pack() emits the high byte first, so name the byte order explicitly.
        if (substr($v, 0, 2) == "%u")
            $ar[$k] = iconv("UCS-2BE", "GB2312", pack("H4", substr($v, -4)));
        elseif (substr($v, 0, 3) == "&#x")
            $ar[$k] = iconv("UCS-2BE", "GB2312", pack("H4", substr($v, 3, -1)));
        elseif (substr($v, 0, 2) == "&#") {
            // echo substr($v, 2, -1) . "<br>"; // debug: show the decimal code
            $ar[$k] = iconv("UCS-2BE", "GB2312", pack("n", (int) substr($v, 2, -1)));
        }
    }
    return join("", $ar);
}
$str = "TTL&#20840;&#22825;&#20505;&#33258;&#21160;&#32858;&#28966;";
echo unescape($str); // outputs the decoded text in GB2312
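
The two pack() formats used above build the same two bytes. The short sketch below, using U+5168 as an arbitrary sample value, shows that both put the high byte first. This is why the byte order that iconv assumes for its UCS-2 input has to be big-endian; naming it explicitly (UCS-2BE) avoids depending on a platform default.

```php
<?php
// Both calls produce the two bytes 0x51 0x68 for U+5168.
$fromHex     = pack("H4", "5168"); // four hex digits, as found in %u5168 or &#x5168;
$fromDecimal = pack("n", 20840);   // unsigned 16-bit, big-endian, as in &#20840;

echo bin2hex($fromHex), " ", bin2hex($fromDecimal); // prints 5168 5168
```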

Converting with JavaScript
The code is as follows:

<style>
body {
font-size:9pt; padding-right:0px; padding-left:0px; padding-bottom:0px; padding-top:0px;
}
input {
font-size:9pt; height:13pt;
}
</style>
<script language="JavaScript1.2">
/*
This following code are designed and writen by Windy_sk <seasonx@163.net>
can use it freely, but u must held all the copyright items!
*/
// Encode every UTF-16 code unit of str as a decimal reference.
function Str2unicode(str) {
    var arr = new Array();
    for (var i = 0; i < str.length; i++) {
        arr[i] = "&#" + str.charCodeAt(i) + ";";
    }
    return arr.join("");
}
// Decode the decimal references found in str; other text is dropped.
function Unicode2ostr(str) {
    var re = /&#\d{1,5};/g;
    var arr = str.match(re);
    if (arr == null) return ("");
    for (var i = 0; i < arr.length; i++) {
        arr[i] = String.fromCharCode(arr[i].replace(/[&#;]/g, ""));
    }
    return arr.join("");
}
function Modi_str() {
    if (document.all.text.method.checked) {
        if (document.all.text.decode.value != "") {
            document.all.text.encode.value = Str2unicode(document.all.text.decode.value);
        } else {
            document.all.text.decode.value = Unicode2ostr(document.all.text.encode.value);
        }
    } else {
        if (document.all.text.encode.value != "") {
            document.all.text.decode.value = Unicode2ostr(document.all.text.encode.value);
        } else {
            document.all.text.encode.value = Str2unicode(document.all.text.decode.value);
        }
    }
}
</script>
<title>Unicode</title>
<form name=text>
Original text:<br>
<textarea name="decode"></textarea>
<br>
Converted code:<br>
<textarea name="encode"></textarea>
<br>
<input type="checkbox" name="method" checked> Forward conversion
<input type=button onclick="Modi_str()" value="OK">
<input type=reset value="Clear">
<input type=button onclick="document.all.text.method.checked ? document.all.text.encode.select() : document.all.text.decode.select()" value="Select all">
</form>

Here is an example page that displays the half-width characters, the full-width characters, and several other ranges by code point.
The code is as follows:

<style>
body {
font-size:9pt; padding-right:0px; padding-left:0px; padding-bottom:0px; padding-top:0px;
}
input {
font-size:9pt; height:13pt;
}
</style>
<script>
// Write every character from min to max into the "show" frame as a decimal reference.
function showuni(min, max) {
    show.document.open();
    show.document.writeln("<style>body{font-size:9pt;word-break:break-all;}</style>");
    show.document.writeln(min + "-" + max + "<br><br>");
    var i = 0;
    for (i = min; i <= max; i++) {
        show.document.write("&#" + i + ";");
    }
    show.document.close();
}
</script>
<input type=button value="Half-width" onclick="showuni(32,126)">
<input type=button value="Full-width" onclick="showuni(65281,65374)">
<input type=button value="Chinese 1" onclick="showuni(19968,40869)">
<input type=button value="Chinese 2" onclick="showuni(63744,64045)">
<input type=button value="Japanese hiragana" onclick="showuni(12353,12435)">
<input type=button value="Japanese katakana" onclick="showuni(12449,12534)">
<input type=button value="Korean" onclick="showuni(44032,55203)">
<br> Custom range: <input name=min>-<input name=max>
<input type=button value="View" onclick="showuni(parseInt(document.all.min.value), parseInt(document.all.max.value))">
<br>
<iframe src="about:blank" id=show name=show width=100% height=70% scroll=no></iframe>

Here is an example that converts GB2312 to UTF-8 using a lookup table (gb2312.txt). Now that the iconv function is available, it is of little practical use.
The code is as follows:

<?php
function gb2utf8($gb) {
    if (!trim($gb)) return $gb;
    $filename = "gb2312.txt";
    $tmp = file($filename);
    $codetable = array();
    // Each line holds a GB2312 code and the matching Unicode code, both in hex.
    foreach ($tmp as $value)
        $codetable[hexdec(substr($value, 0, 6))] = substr($value, 7, 6);
    $utf8 = "";
    while (strlen($gb) > 0) {
        if (ord(substr($gb, 0, 1)) > 127) {
            // Two-byte GB2312 character: clear the high bits to get the table key.
            $ch = substr($gb, 0, 2);
            $gb = substr($gb, 2, strlen($gb) - 2);
            $utf8 .= u2utf8(hexdec($codetable[hexdec(bin2hex($ch)) - 0x8080]));
        } else {
            // Single-byte ASCII character.
            $ch = substr($gb, 0, 1);
            $gb = substr($gb, 1, strlen($gb) - 1);
            $utf8 .= u2utf8(ord($ch));
        }
    }
    return $utf8;
}
// Unicode code point -> UTF-8 bytes
function u2utf8($c) {
    $str = "";
    if ($c < 0x80) {
        $str .= chr($c);
    } else if ($c < 0x800) {
        $str .= chr(0xC0 | ($c >> 6));
        $str .= chr(0x80 | ($c & 0x3F));
    } else if ($c < 0x10000) {
        $str .= chr(0xE0 | ($c >> 12));
        $str .= chr(0x80 | (($c >> 6) & 0x3F));
        $str .= chr(0x80 | ($c & 0x3F));
    } else if ($c < 0x200000) {
        $str .= chr(0xF0 | ($c >> 18));
        $str .= chr(0x80 | (($c >> 12) & 0x3F));
        $str .= chr(0x80 | (($c >> 6) & 0x3F));
        $str .= chr(0x80 | ($c & 0x3F));
    }
    return $str;
}
?>
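
The table lookup comes down to one subtraction. The sketch below walks through it for a single character. The sample table line is hypothetical; it only assumes the "0xGGGG, separator, 0xUUUU" layout that the substr() offsets in gb2utf8 imply, and the helper name gb_key is not part of the article's code.

```php
<?php
// Hypothetical helper: turn a two-byte GB2312 (EUC-CN) character into the
// table key by clearing the high bit of each byte (subtracting 0x8080).
function gb_key(string $twoBytes): int
{
    return hexdec(bin2hex($twoBytes)) - 0x8080;
}

// Hypothetical sample line in the assumed layout: GB2312 code, tab, Unicode code.
$line = "0x482B\t0x5168";
$codetable = array();
// Skip the "0x" prefixes so hexdec() only sees hex digits.
$codetable[hexdec(substr($line, 2, 4))] = substr($line, 9, 4);

$key = gb_key("\xC8\xAB"); // 0xC8AB - 0x8080 = 0x482B
echo dechex($key), " -> ", hexdec($codetable[$key]); // prints 482b -> 20840
```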
