Broadband IP address lookup: applying binary search in PHP to IP address queries

Source: Internet
Author: User
Tags: fread, mysql, query, unpack
The database stores several hundred thousand IP records in the following form:
+----------+----------+------------+---------+---------+--------+--------+
| ip_begin | ip_end   | country_id | prov_id | city_id | isp_id | netbar |
+----------+----------+------------+---------+---------+--------+--------+
|        0 | 16777215 |          2 |       0 |       0 |      0 |      0 |
| 16777216 | 33554431 |          2 |       0 |       0 |      0 |      0 |
| 33554432 | 50331647 |          2 |       0 |       0 |      0 |      0 |
| 50331648 | 67108863 |          3 |       0 |       0 |      0 |      0 |
| 67108864 | 67829759 |          3 |       0 |       0 |      0 |      0 |
+----------+----------+------------+---------+---------+--------+--------+
A lookup against this table needs the following SQL, where $client_ip is the client address as an unsigned integer:
<?php
$sql = "SELECT * FROM i_m_ip WHERE ip_begin <= $client_ip AND ip_end >= $client_ip";
?>
A range query like this clearly cannot make good use of an index, and even when one is used, MySQL is unlikely to exceed 500 queries per second. After a lot of concurrency tuning I still averaged only about 200 queries per second, which was a real headache. Early on I considered borrowing the retrieval method used by the pure IP library, but I had always shied away from algorithms and assumed binary search would be hard, so I never tried it. Only when I had run out of options did I finally implement binary search for IP lookup.
As the table shows, the IP library covers one continuous range of integers from 0 to 4294967295. Expanding every value into its own row would take hundreds of gigabytes, so a plain index or hash lookup is not an option. In the end I used PHP to convert the data into a binary file and dropped the database from the lookup entirely. The start and end of each IP range are 4-byte long integers, and the country ID, province ID and the remaining fields each fit in a 2-byte short integer, so one row takes 18 bytes. With about 310,000 rows in total, the file comes to roughly 5 MB. The code that generates the IP library is as follows:
<?php
/*
IP file format (tab-separated):
3741319168 3758096383 182 0 0 0 0
3758096384 3774873599 3 0 0 0 0
3774873600 4026531839 182 0 0 0 0
4026531840 4278190079 182 0 0 0 0
4294967040 4294967295 312 0 0 0 0
*/
set_time_limit(0);
$handle = fopen('./ip.txt', 'rb');
$fp = fopen('./ip.dat', 'ab');
if ($handle) {
    while (!feof($handle)) {
        $buffer = fgets($handle);
        $buffer = trim($buffer);
        // Skip blank lines (for example a trailing newline at the end of the file)
        if ($buffer === '') {
            continue;
        }
        $buffer = explode("\t", $buffer);
        foreach ($buffer as $key => $value) {
            // float keeps values above 2^31 intact on 32-bit PHP
            $buffer[$key] = (float) trim($value);
        }
        // 'L' = unsigned 32-bit long, 'S' = unsigned 16-bit short
        $str = pack('L', $buffer[0]);  // ip_begin
        $str .= pack('L', $buffer[1]); // ip_end
        $str .= pack('S', $buffer[2]); // country_id
        $str .= pack('S', $buffer[3]); // prov_id
        $str .= pack('S', $buffer[4]); // city_id
        $str .= pack('S', $buffer[5]); // isp_id
        $str .= pack('S', $buffer[6]); // netbar
        fwrite($fp, $str);
    }
    fclose($handle);
}
fclose($fp);
?>
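As a quick check of the 18-byte layout described above, the following sketch packs one row from the sample file and unpacks it again. The helper names packRecord and unpackRecord are hypothetical and not part of the original code, and the sketch assumes a 64-bit PHP 7.1+ build, where unpack('L') returns non-negative integers.

```php
<?php
// Hypothetical helpers for illustration only; not part of the original script.
function packRecord(array $row): string
{
    // Two unsigned 32-bit longs followed by five unsigned 16-bit shorts,
    // all in machine byte order, exactly as the generator writes them.
    return pack('LLSSSSS', ...$row);
}

function unpackRecord(string $bytes): array
{
    // Named keys make the unpacked array self-describing.
    return unpack(
        'Lip_begin/Lip_end/Scountry_id/Sprov_id/Scity_id/Sisp_id/Snetbar',
        $bytes
    );
}

$record = packRecord([3758096384, 3774873599, 3, 0, 0, 0, 0]);
$fields = unpackRecord($record);

echo strlen($record), "\n";        // 18 bytes per record
echo $fields['ip_begin'], "\n";    // 3758096384 on 64-bit PHP
echo $fields['country_id'], "\n";  // 3
```

On a 32-bit build the unpacked IP values would come back negative, which is why the lookup code passes them through sprintf('%u').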
The records are now stored in IP order at a fixed 18 bytes each, so binary search can retrieve the IP information easily:
function getIP($ip, $fp) {
    fseek($fp, 0);
    $begin = 0;
    $end = filesize('./ip.dat');
    // First and last IPs in the file (read here, but not used by the search below)
    $begin_ip = implode('', unpack('L', fread($fp, 4)));
    fseek($fp, $end - 14);
    $end_ip = implode('', unpack('L', fread($fp, 4)));
    $begin_ip = sprintf('%u', $begin_ip);
    $end_ip = sprintf('%u', $end_ip);
    do {
        // One record left: skip the two 4-byte IPs and read the five 2-byte fields
        if ($end - $begin <= 18) {
            fseek($fp, $begin + 8);
            $info = array();
            $info[0] = implode('', unpack('S', fread($fp, 2))); // country_id
            $info[1] = implode('', unpack('S', fread($fp, 2))); // prov_id
            $info[2] = implode('', unpack('S', fread($fp, 2))); // city_id
            $info[3] = implode('', unpack('S', fread($fp, 2))); // isp_id
            $info[4] = implode('', unpack('S', fread($fp, 2))); // netbar
            return $info;
        }
        // Midpoint of the current range, aligned to an 18-byte record boundary
        $middle_seek = (int) (ceil(($end - $begin) / 18 / 2) * 18) + $begin;
        fseek($fp, $middle_seek);
        $middle_ip = implode('', unpack('L', fread($fp, 4)));
        // %u turns a signed 32-bit value into its unsigned form
        $middle_ip = sprintf('%u', $middle_ip);
        if ($ip >= $middle_ip) {
            $begin = $middle_seek;
        } else {
            $end = $middle_seek;
        }
    } while (true);
}
The $fp above is the open file handle for ip.dat. Because lookups run in a loop, the file is opened outside the function so that it is not reopened for every lookup. For roughly 300,000 rows, binary search needs only about 18 to 19 iterations (2^18 = 262,144 < 310,000 < 2^19 = 524,288) to find the exact record. I later tried loading ip.dat into memory to speed up retrieval, but found that the string-positioning functions were nowhere near as efficient as seeking with the file pointer, not even in the same order of magnitude, so I gave up on keeping the IP library in memory.
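To illustrate how a handle opened once can serve many lookups, here is a minimal self-contained sketch. It is a variation on the function above and not the author's exact code: the name lookupIp and the constant RECORD_SIZE are hypothetical, the search runs over record indices and checks both ends of each range so that an address falling in a gap returns null, and a php://memory stream holding three sample rows stands in for ip.dat. It assumes 64-bit PHP 7.1 or later.

```php
<?php
const RECORD_SIZE = 18; // 4 + 4 + 5 * 2 bytes per record

// Hypothetical helper: binary search over an open handle in the ip.dat layout.
function lookupIp(int $ip, $fp, int $fileSize): ?array
{
    $low = 0;
    $high = intdiv($fileSize, RECORD_SIZE) - 1;
    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        fseek($fp, $mid * RECORD_SIZE);
        $row = unpack(
            'Lip_begin/Lip_end/Scountry_id/Sprov_id/Scity_id/Sisp_id/Snetbar',
            fread($fp, RECORD_SIZE)
        );
        if ($ip < $row['ip_begin']) {
            $high = $mid - 1;      // target lies in an earlier record
        } elseif ($ip > $row['ip_end']) {
            $low = $mid + 1;       // target lies in a later record
        } else {
            return $row;           // ip_begin <= ip <= ip_end
        }
    }
    return null;                   // no range covers this address
}

// Stand-in for ip.dat: three rows from the sample table, with one range left out.
$fp = fopen('php://memory', 'w+b');
$rows = [[0, 16777215, 2], [16777216, 33554431, 2], [50331648, 67108863, 3]];
foreach ($rows as [$ipBegin, $ipEnd, $countryId]) {
    fwrite($fp, pack('LLSSSSS', $ipBegin, $ipEnd, $countryId, 0, 0, 0, 0));
}
$size = ftell($fp);

// ip2long() can return a negative int on 32-bit builds; %u normalises it.
$ip = (int) sprintf('%u', ip2long('3.0.0.1'));
$hit = lookupIp($ip, $fp, $size);
$miss = lookupIp((int) sprintf('%u', ip2long('2.0.0.0')), $fp, $size);

echo $hit['country_id'], "\n";          // 3
echo var_export($miss, true), "\n";     // NULL, because 2.0.0.0 falls in the gap
```

Passing the file size in as a parameter avoids calling filesize() on every lookup, which fits the same reasoning as keeping the handle outside the function.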
This implementation made IP lookups nearly a hundred times faster. It is only a simple application of binary search, but it completely cured me of the idea that algorithms are unimportant in web applications. While working on this I also turned to Golden Fox: at first I asked him to generate an IP library in the pure format for me, so that I could query it with the Discuz IP lookup function, but he refused, which is what finally pushed me to practise and learn it myself. Sometimes relying on yourself is better than asking others for help.
