Evaluating several methods for dynamically converting GB-encoded text to UTF-8 in PHP

Source: Internet
Author: User

The article "Evaluation of IP address -> geographic location conversion" noted that reading the IP database file directly with the IP2ADDR function is the most efficient approach, while storing the IP data in MySQL and looking it up with SQL queries is the least efficient. However, the IP database file QQWry.dat is encoded in GB2312, and I now need the geographic results in UTF-8. With the MySQL approach, the data can be converted to UTF-8 once, when it is loaded into the database. The QQWry.dat file cannot be modified, though, so the output of the IP2ADDR function has to be converted dynamically.

There are at least four ways to convert GB to UTF-8 dynamically:

Conversion with PHP's iconv extension

Conversion with PHP's mbstring extension

Conversion with a lookup table stored in a MySQL database

Conversion with a lookup table stored in a text file

The first two methods require server-side setup: the corresponding extensions must be compiled and installed. My virtual host has neither extension, so I have to consider the latter two methods. The first two methods are not evaluated in this article.
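For reference, on a server where the extensions are available, each of the first two methods comes down to a single library call. The following is a minimal sketch, not part of the evaluated program: gb2utf8_ext is a hypothetical helper name, and 'EUC-CN' is the name mbstring uses for the GB2312 encoding.

```php
<?php
// Hypothetical helper (not in the original article): convert a GB2312 string
// to UTF-8 with whichever extension is loaded. Returns false if neither is.
function gb2utf8_ext($strGB) {
    if (function_exists('iconv')) {
        // iconv() returns false (and raises a notice) on invalid input.
        $strOut = @iconv('GB2312', 'UTF-8', $strGB);
        if ($strOut !== false) {
            return $strOut;
        }
    }
    if (function_exists('mb_convert_encoding')) {
        // Argument order is (string, to-encoding, from-encoding).
        return mb_convert_encoding($strGB, 'UTF-8', 'EUC-CN');
    }
    return false;
}
```

The function_exists() checks make it possible to fall back to one of the table-based methods on hosts, like the one described here, where neither extension is installed.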

The evaluation program is as follows (for func_ip.php, see "Evaluation of IP address -> geographic location conversion"):

<?php

require_once("func_ip.php");

// Encode a Unicode code point as a UTF-8 byte string.
function U2utf8($c) {
    $str = "";
    if ($c < 0x80) {
        $str .= chr($c);
    } elseif ($c < 0x800) {
        $str .= chr(0xC0 | ($c >> 6));
        $str .= chr(0x80 | ($c & 0x3F));
    } elseif ($c < 0x10000) {
        $str .= chr(0xE0 | ($c >> 12));
        $str .= chr(0x80 | (($c >> 6) & 0x3F));
        $str .= chr(0x80 | ($c & 0x3F));
    } elseif ($c < 0x200000) {
        $str .= chr(0xF0 | ($c >> 18));
        $str .= chr(0x80 | (($c >> 12) & 0x3F));
        $str .= chr(0x80 | (($c >> 6) & 0x3F));
        $str .= chr(0x80 | ($c & 0x3F));
    }
    return $str;
}

// Convert a GB2312 string to UTF-8, looking up each double-byte character in MySQL.
// Uses the legacy mysql_* API, as in the original benchmark.
function Gb2utf8_sql($strGB) {
    if (!trim($strGB)) return $strGB;
    $strRet = "";
    $intLen = strlen($strGB);
    for ($i = 0; $i < $intLen; $i++) {
        if (ord($strGB[$i]) > 127) {
            // Double-byte character: clear the high bit of both bytes to get the table key.
            $strCurr = substr($strGB, $i, 2);
            $intGB = hexdec(bin2hex($strCurr)) - 0x8080;
            $strSql = "SELECT Code_unicode FROM Nnstats_gb_unicode WHERE CODE_GB = " . $intGB . " LIMIT 1";
            $resResult = mysql_query($strSql);
            if ($arrCode = mysql_fetch_array($resResult)) $strRet .= U2utf8((int) $arrCode["Code_unicode"]);
            else $strRet .= "???";
            $i++; // skip the second byte of the pair
        } else {
            $strRet .= $strGB[$i];
        }
    }
    return $strRet;
}

// Convert a GB2312 string to UTF-8, using a lookup table read from a text file.
function Gb2utf8_file($strGB) {
    if (!trim($strGB)) return $strGB;
    // The table file is read and parsed on every call.
    $arrCodeTable = array();
    $arrLines = file("Gb_unicode.txt");
    foreach ($arrLines as $strLine) {
        $arrCodeTable[hexdec(substr($strLine, 0, 6))] = hexdec(substr($strLine, 7, 6));
    }
    $strRet = "";
    $intLen = strlen($strGB);
    for ($i = 0; $i < $intLen; $i++) {
        if (ord($strGB[$i]) > 127) {
            // Double-byte character: clear the high bit of both bytes to get the table key.
            $strCurr = substr($strGB, $i, 2);
            $intGB = hexdec(bin2hex($strCurr)) - 0x8080;
            if (isset($arrCodeTable[$intGB])) $strRet .= U2utf8($arrCodeTable[$intGB]);
            else $strRet .= "???";
            $i++; // skip the second byte of the pair
        } else {
            $strRet .= $strGB[$i];
        }
    }
    return $strRet;
}

// Convert a dotted-quad IP string to its 32-bit integer value (0 if malformed).
function Encodeip($strDotquadIp) {
    $arrIpSep = explode('.', $strDotquadIp);
    if (count($arrIpSep) != 4) return 0;
    $intIp = 0;
    foreach ($arrIpSep as $k => $v) $intIp += (int) $v * pow(256, 3 - $k);
    return $intIp;
    // Unreachable in the original; an alternative hex-string form:
    // return sprintf('%02x%02x%02x%02x', $arrIpSep[0], $arrIpSep[1], $arrIpSep[2], $arrIpSep[3]);
}

// Return the current Unix time in seconds, with microsecond resolution.
function Getmicrotime() {
    list($msec, $sec) = explode(" ", microtime());
    return ((float) $msec + (float) $sec);
}

for ($i = 0; $i < 100; $i++) { // randomly generate 100 IP addresses
    $strIp = mt_rand(0, 255) . "." . mt_rand(0, 255) . "." . mt_rand(0, 255) . "." . mt_rand(0, 255);
    $arrAddr[$i] = Ip2addr(Encodeip($strIp));
}

$resConn = mysql_connect("localhost", "netnest", "netnest");
mysql_select_db("test");

// Benchmark the encoding conversion using MySQL queries
$dblTimeStart = Getmicrotime();
for ($i = 0; $i < 100; $i++) {
    $strUTF8Region = Gb2utf8_sql($arrAddr[$i]["region"]);
    $strUTF8Address = Gb2utf8_sql($arrAddr[$i]["address"]);
}
$dblTimeDuration = Getmicrotime() - $dblTimeStart;
// End of benchmark; output the result
echo $dblTimeDuration; echo "\r\n";

// Benchmark the encoding conversion using the text file lookup
$dblTimeStart = Getmicrotime();
for ($i = 0; $i < 100; $i++) {
    $strUTF8Region = Gb2utf8_file($arrAddr[$i]["region"]);
    $strUTF8Address = Gb2utf8_file($arrAddr[$i]["address"]);
}
$dblTimeDuration = Getmicrotime() - $dblTimeStart;
// End of benchmark; output the result
echo $dblTimeDuration; echo "\r\n";

?>

Results of two runs (in seconds, accurate to three decimal places):

MySQL query conversion: 0.112

Text file query conversion: 10.590

MySQL query conversion: 0.099

Text file query conversion: 10.623
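Judging from the listing, a likely reason for the gap is that Gb2utf8_file() re-reads and re-parses Gb_unicode.txt on every call, which happens 200 times in the benchmark loop. The following is a minimal sketch of one way to avoid the repeated parsing, and it has not been benchmarked here. It assumes each line of the table file has the form `0x2121 0x3000` (GB code, then Unicode code point). loadGbTable is a hypothetical helper, not part of the original program.

```php
<?php
// Hypothetical helper: parse the GB -> Unicode table once per request and
// reuse the parsed array on later calls.
// Assumed line format: "0xGGGG 0xUUUU" (GB code, space, Unicode code point).
function loadGbTable($strPath) {
    static $arrCache = array();
    if (!isset($arrCache[$strPath])) {
        $arrTable = array();
        foreach (file($strPath) as $strLine) {
            // Skip the "0x" prefixes and read four hex digits from each column.
            $intGB = hexdec(substr($strLine, 2, 4));
            $intUnicode = hexdec(substr($strLine, 9, 4));
            $arrTable[$intGB] = $intUnicode;
        }
        $arrCache[$strPath] = $arrTable;
    }
    return $arrCache[$strPath];
}
```

A conversion function could then call loadGbTable("Gb_unicode.txt") in place of the file() loop, so the file is read once however many strings are converted.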
