Structure Analysis of the Pure IP Database (qqwry.dat) and a Query Class


 

My personal website records visitors' IP addresses and locations. At first, out of laziness, I used a web service to look up IP addresses. That approach turned out to have long response times, high resource consumption, and a heavy dependence on the web service: when the service is down or the network is unreliable, a query often returns only after a timeout. I therefore decided to query a local copy of the pure IP address database directly.

The data structure of the pure IP database is described in detail at http://lumaqq.linuxsir.org/article/qqwry_format_detail.html. In short, the data file is divided into three areas:

1. File header (8 bytes: the first 4 bytes point to the first entry in the index area, and the last 4 bytes point to the last entry in the index area)

2. Record area (a record contains an IP address, a country record, and a region record. The latter two may each be either a string or a redirect, and there are several redirect modes)

3. Index area (each index entry is 7 bytes long: the first 4 bytes are an IP address in little-endian order, and the last 3 bytes point to the corresponding record in the record area, where the position is a byte offset measured from the start of the file)
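
As an illustration of this layout, here is a minimal sketch that decodes the little-endian integers of a header and of one 7-byte index entry from in-memory bytes. The class name `LittleEndianDemo`, the helper `readLE`, and the sample bytes are made up for this example; they are not part of the query class or of a real data file.

```java
// Hypothetical helper for illustration only: decodes the little-endian
// integers used by the file header and the index area.
class LittleEndianDemo {
    // Read an unsigned little-endian integer of nBytes (1 to 4) bytes at offset.
    static long readLE(byte[] data, int offset, int nBytes) {
        long result = 0;
        for (int i = 0; i < nBytes; i++) {
            // Mask to 0..255 first, because Java bytes are signed.
            result |= (data[offset + i] & 0xFFL) << (8 * i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Synthetic 8-byte header: first index entry at 0x10, last one at 0x17.
        byte[] header = {0x10, 0, 0, 0, 0x17, 0, 0, 0};
        long indexBegin = readLE(header, 0, 4);
        long indexEnd = readLE(header, 4, 4);
        // Index entries are 7 bytes each, so the count is (end - begin) / 7 + 1.
        System.out.println("entries: " + ((indexEnd - indexBegin) / 7 + 1));

        // Synthetic index entry: IP 192.168.0.0, record offset 0x001234.
        byte[] entry = {0x00, 0x00, (byte) 0xA8, (byte) 0xC0, 0x34, 0x12, 0x00};
        System.out.println("ip: 0x" + Long.toHexString(readLE(entry, 0, 4)));
        System.out.println("record offset: 0x" + Long.toHexString(readLE(entry, 4, 3)));
    }
}
```

Note that the lowest-order byte comes first, so the bytes `00 00 A8 C0` decode to 0xC0A80000, which is 192.168.0.0.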

Although this structure works well and lookups are efficient, I find the design somewhat complicated. In addition, if record A in the record area redirects to record B and record B is deleted, queries that resolve to record A will break. (Admittedly, deleting record B is itself troublesome.) A file structure like the following would be easier to handle:

1. File header (same as in the original database)

2. String area

3. Index area (each entry holds a 4-byte IP address, a 4-byte offset of the country string, and a 4-byte offset of the region string)

All strings are placed in the string area for unified management. Each index entry holds the IP address, a "pointer" to the country string, and a "pointer" to the region string, where a "pointer" is simply the offset of a string in the string area.
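
To make the proposed layout concrete, the following sketch encodes and decodes one 12-byte index entry. It is only an assumption about how such an entry could be laid out (little-endian, like the existing format); the class `FlatIndexEntryDemo` and its methods are hypothetical and do not correspond to any existing file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch of the simplified 12-byte index entry:
// 4-byte IP, 4-byte country string offset, 4-byte region string offset.
class FlatIndexEntryDemo {
    static final int ENTRY_LEN = 12;

    // Pack the three fields as unsigned 32-bit little-endian integers.
    static byte[] encode(long ip, long countryOffset, long regionOffset) {
        ByteBuffer buf = ByteBuffer.allocate(ENTRY_LEN).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt((int) ip);
        buf.putInt((int) countryOffset);
        buf.putInt((int) regionOffset);
        return buf.array();
    }

    // Unpack the fields; the mask turns each signed int back into an unsigned value.
    static long[] decode(byte[] entry) {
        ByteBuffer buf = ByteBuffer.wrap(entry).order(ByteOrder.LITTLE_ENDIAN);
        long ip = buf.getInt() & 0xFFFFFFFFL;
        long countryOffset = buf.getInt() & 0xFFFFFFFFL;
        long regionOffset = buf.getInt() & 0xFFFFFFFFL;
        return new long[] {ip, countryOffset, regionOffset};
    }

    public static void main(String[] args) {
        // 192.168.0.0 with two made-up string offsets.
        long[] fields = decode(encode(0xC0A80000L, 8L, 24L));
        System.out.println(Long.toHexString(fields[0]) + " " + fields[1] + " " + fields[2]);
    }
}
```

Because every entry has the same size and contains no redirects, deleting or rewriting a string only requires updating the offsets that point to it.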

 

However, since the pure IP database is designed the way it is, I have to query it according to its actual structure.

Entries in the index area are sorted by IP address in ascending order, so they can be searched with binary search.

The IP addresses in the index are not consecutive. For example, the entry after 192.168.0.0 is not 192.168.0.1 but might be 192.169.0.0; that is, each entry stands for an IP range. A lookup therefore needs something similar to rounding. Fortunately, rounding down is all we need: a query for 192.168.1.1 should match 192.168.0.0 rather than 192.169.0.0.
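
The rounding-down lookup can be sketched on a plain sorted array, independent of the file format. The class `FloorSearchDemo`, the method `floorIndex`, and the sample range starts are hypothetical and only serve to illustrate the idea; the query class itself reaches the same result by stepping back one entry after its loop.

```java
// Hypothetical sketch: binary search that rounds down to the start of a range.
class FloorSearchDemo {
    // Return the index of the largest element <= ip, or -1 if ip is below all of them.
    static int floorIndex(long[] starts, long ip) {
        int lo = 0;
        int hi = starts.length - 1;
        int answer = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (starts[mid] <= ip) {
                answer = mid; // candidate range; a later one may still fit
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return answer;
    }

    public static void main(String[] args) {
        // Range starts: 0.0.0.0, 192.168.0.0, 192.169.0.0
        long[] starts = {0x00000000L, 0xC0A80000L, 0xC0A90000L};
        // 192.168.1.1 falls into the range that starts at 192.168.0.0 (index 1).
        System.out.println(floorIndex(starts, 0xC0A80101L)); // prints 1
    }
}
```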

 

import java.io.*;

public class IPSeeker
{
    protected RandomAccessFile ipDataFile;
    protected static final int RECORD_LEN = 7; // length of one index entry in bytes
    protected static final int MODE_1 = 0x01; // both the country and region records are redirected
    protected static final int MODE_2 = 0x02; // the country record is redirected; the region record follows
    protected static final int MODE_3 = 0x03; // default: no redirect
    protected long indexBegin; // offset of the first index entry
    protected long indexEnd;   // offset of the last index entry

    public IPSeeker() throws Exception
    {
        // Open the data file of the pure IP database
        ipDataFile = new RandomAccessFile("qqwry.dat", "r");
        indexBegin = readLong(4, 0);
        indexEnd = readLong(4, 4);
    }

    public static void main(String[] args) throws Exception
    {
        IPSeeker seeker = new IPSeeker(); // may throw an Exception
        String result = seeker.search("111.2.13.4"); // the IP address to look up
        System.out.println(result);
        seeker.close(); // close explicitly; if close() is not called, the file is closed in finalize()
        seeker = null;
    }

    @Override
    protected void finalize() throws Throwable
    {
        try
        {
            ipDataFile.close();
        }
        catch (IOException e)
        {
            // ignored: nothing useful can be done during finalization
        }
        super.finalize();
    }

    public void close()
    {
        try
        {
            ipDataFile.close();
        }
        catch (IOException e)
        {
            // ignored: errors on close are not reported
        }
    }

 

    public String search(String ipStr) throws Exception
    {
        // Binary search over the index area
        long recordCount = (indexEnd - indexBegin) / RECORD_LEN + 1;
        long itemStart = 0;
        long itemEnd = recordCount - 1;
        long ip = IPSeeker.stringIP2Long(ipStr);
        long middle = 0;
        long midIP = 0;
        while (itemStart <= itemEnd)
        {
            middle = (itemStart + itemEnd) / 2;
            midIP = readLong(4, indexBegin + middle * RECORD_LEN);
            // String temp = IPSeeker.long2StringIP(midIP);
            if (midIP == ip)
            {
                break;
            }
            else if (midIP < ip)
            {
                itemStart = middle + 1;
            }
            else // midIP > ip
            {
                itemEnd = middle - 1;
            }
        }
        // No exact match: step back to the preceding entry, whose range contains ip
        if (ip < midIP && middle > 0)
        {
            middle--;
        }

        // The last 3 bytes of the index entry hold the offset of the record
        long item = readLong(3, indexBegin + middle * RECORD_LEN + 4);
        String[] result = getInfo(item + 4); // skip the record's 4-byte IP address, then read the strings
        return long2StringIP(readLong(4, indexBegin + middle * RECORD_LEN)) + "," // matched IP address (start of the range)
            + result[0] + "," // country
            + result[1]; // region
    }

    // Convert a 32-bit integer IP address (as decoded from the file's little-endian bytes) to a dotted-decimal string
    public static String long2StringIP(long ip)
    {
        long ip4 = ip >> 0 & 0x000000FF;
        long ip3 = ip >> 8 & 0x000000FF;
        long ip2 = ip >> 16 & 0x000000FF;
        long ip1 = ip >> 24 & 0x000000FF;
        return String.valueOf(ip1) + "." + String.valueOf(ip2) + "." +
            String.valueOf(ip3) + "." + String.valueOf(ip4);
    }

    // Convert a dotted-decimal IP string to a 32-bit integer IP address (first octet in the highest byte)
    public static long stringIP2Long(String ipStr) throws Exception
    {
        String[] list = ipStr.split("\\.");
        if (list.length != 4)
        {
            throw new Exception("ip address format error");
        }
        long ip = Long.parseLong(list[0]) << 24 & 0xFF000000L;
        ip += Long.parseLong(list[1]) << 16 & 0x00FF0000L;
        ip += Long.parseLong(list[2]) << 8 & 0x0000FF00L;
        ip += Long.parseLong(list[3]) << 0 & 0x000000FFL;
        return ip;
    }

    // Read an unsigned little-endian integer of nByte (0-4) bytes starting at offset
    private long readLong(int nByte, long offset) throws Exception
    {
        if (nByte > 4 || nByte < 0)
            throw new Exception("nByte should be 0-4");
        ipDataFile.seek(offset);
        long result = 0;
        for (int i = 0; i < nByte; i++)
        {
            // readByte() is signed, so mask each byte into its own 8-bit slot
            result |= ((long) ipDataFile.readByte() << 8 * i) & (0xFFL << 8 * i);
        }
        return result;
    }

    private String[] getInfo(long itemStartPos) throws Exception
    {
        // result[0] holds the country, result[1] holds the region
        String[] result = new String[2];
        ipDataFile.seek(itemStartPos);
        int mode = (int) ipDataFile.readByte();
        switch (mode)
        {
            case MODE_1: // country and region are both stored at the redirect target
            {
                long offset = itemStartPos + 1;
                long redirPos = readLong(3, offset);
                result = getInfo(redirPos);
            }
            break;
            case MODE_2: // only the country is redirected; the region follows the 3-byte pointer
            {
                long offset = itemStartPos + 1;
                long redirPos = readLong(3, offset);
                result = getInfo(redirPos);
                result[1] = getArea(offset + 3);
            }
            break;
            default: // MODE_3: the country string is stored in place, followed by the region
            {
                long offset = itemStartPos;
                int countryLen = getStrLength(offset);
                result[0] = getString(offset, countryLen);
                offset = itemStartPos + countryLen + 1; // skip the string and its terminating 0 byte
                result[1] = getArea(offset);
            }
            break;
        }
        return result;
    }

    private String getArea(long offset) throws Exception
    {
        ipDataFile.seek(offset);
        int cityMode = (int) ipDataFile.readByte();
        if (cityMode == MODE_2 || cityMode == MODE_1)
        {
            // The region is redirected: follow the 3-byte pointer to the string
            offset = readLong(3, offset + 1);
        }
        int cityLen = getStrLength(offset);
        return getString(offset, cityLen);
    }

    // Length in bytes of the 0-terminated string starting at pos (terminator excluded)
    private int getStrLength(long pos) throws IOException
    {
        ipDataFile.seek(pos);
        long strEnd = pos - 1;
        while (ipDataFile.readByte() != (byte) 0)
        {
            strEnd++;
        }
        return (int) (strEnd - pos + 1);
    }

    // Read len bytes starting at pos and decode them as GBK text
    private String getString(long pos, int len) throws IOException
    {
        byte[] buf = new byte[len];
        ipDataFile.seek(pos);
        ipDataFile.readFully(buf);
        String s = new String(buf, "GBK");
        return s;
    }

 

}
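
The class above falls back on `finalize()` when `close()` is not called, and `finalize()` has been deprecated since Java 9. As an alternative, a reader could implement `AutoCloseable` and be used in a try-with-resources statement. The sketch below shows only that pattern; `ClosableReaderDemo` is a hypothetical stand-in, not a modified version of `IPSeeker`, and it opens an empty temporary file instead of `qqwry.dat`.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical stand-in for a file-backed reader that is closed deterministically.
class ClosableReaderDemo implements AutoCloseable {
    private final RandomAccessFile file;

    ClosableReaderDemo(String path) throws IOException {
        file = new RandomAccessFile(path, "r");
    }

    long length() throws IOException {
        return file.length();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws IOException {
        // An empty temporary file stands in for the real data file.
        File tmp = File.createTempFile("qqwry-demo", ".dat");
        tmp.deleteOnExit();
        // The reader is closed automatically when the block exits, even on exceptions.
        try (ClosableReaderDemo reader = new ClosableReaderDemo(tmp.getPath())) {
            System.out.println("file length: " + reader.length());
        }
    }
}
```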

 

 

 

This article is from the "Mu youcun technology blog" blog
