Extreme Optimization of Reading Level-1 Market Data DBF Files in Java (2)


We recently built a project for market data access and distribution. It needs extremely low latency, which is very important for a securities system. The market data source is configurable: Level-1, Level-2, or another third-party source. Although Level-1 market data is not as fast as Level-2, it is one of the sources the system supports, so we still need to optimize it so that end-to-end latency, from reading the file to the user receiving the market data over a socket, is as low as possible. This article mainly introduces the extreme optimization of reading Level-1 market data DBF files. I believe it is also a useful reference for reading other DBF files.

Level-1 market data is written by the market data station, which updates a DBF file every few seconds (show2003.dbf for Shanghai, sjshq.dbf for Shenzhen), replacing the old quotes with the new ones. Our goal is, once a file has been updated, to read it into memory in the shortest possible time, convert each row into an object, and convert each column to the corresponding data type.

We adopted six optimizations in total.

Part (1) of this series described two of them:

Optimization one: Use a RAM disk (RamDisk)

Optimization two: Use JNotify, with notification instead of polling

This article continues to introduce:

Optimization three: Using NIO to read files

There are many open source implementations for reading and writing DBF files; choosing one and improving it is an important strategy here.

Many DBF libraries are based on stream I/O, that is, InputStream and OutputStream. We should adopt the NIO approach instead, which is based on RandomAccessFile, FileChannel, and ByteBuffer. A stream processes the data while reading it from the file, whereas NIO can load the entire file into memory at once. Tests have shown (see the book Java Program Performance Optimization) that NIO is about 5 times faster than streams. I am providing a DBF reading library that uses NIO for everyone to download and study (the original source has not been tested; this code is my rewrite of it and also includes the optimization strategies I propose later). If your project already has a DBF library, I recommend improving it based on the optimization strategies in this article rather than replacing it outright with the one I provide.

DbfReader library

DbfReader.java contains the following code snippets.

The code that creates the FileChannel:

    this.dbf = new RandomAccessFile(file, "r");
    this.fc = dbf.getChannel();

The code that loads the specified file fragment into a ByteBuffer:

    private ByteBuffer loadData(int offset, int length) throws IOException {
        // return fc.map(MapMode.READ_ONLY, offset, length).load();
        ByteBuffer b = ByteBuffer.allocateDirect(length);
        fc.position(offset);
        fc.read(b);
        b.rewind();
        return b;
    }

Above, we use ByteBuffer.allocateDirect(length) to create the ByteBuffer. The allocateDirect method creates a direct buffer, which is allocated in native memory outside the Java heap, so the operating system can perform I/O on it without an extra copy. It is about twice as fast as an ordinary ByteBuffer, which helps our optimization. However, creating and destroying a direct buffer is more expensive, and we address that in the next optimization.
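One detail of the loadData snippet is worth illustrating: it calls fc.read(b) once, but FileChannel.read is allowed to return fewer bytes than the buffer can hold. The following is a minimal sketch, not the library's code, of filling a direct buffer in a loop. The class name DirectReadSketch and the method readFully are hypothetical names introduced only for this illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the DbfReader library.
final class DirectReadSketch {

    /** Reads exactly length bytes starting at offset into a new direct buffer. */
    static ByteBuffer readFully(Path file, long offset, int length) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer b = ByteBuffer.allocateDirect(length); // off-heap buffer
            long pos = offset;
            // A single read() may return fewer bytes than requested, so loop.
            while (b.hasRemaining()) {
                int n = fc.read(b, pos);
                if (n < 0) {
                    throw new IOException("Unexpected end of file at position " + pos);
                }
                pos += n;
            }
            b.flip(); // position = 0, limit = number of bytes read
            return b;
        }
    }
}
```

For example, DirectReadSketch.readFully(path, headerLength, count * recordLength) would return a buffer positioned at the first record, assuming those three values have already been taken from the DBF header.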

(I do not intend to go into detail about NIO (I might not explain it well anyway), nor to walk through the code of DbfReader.java. I will focus on the performance-related parts, which is what comes next.)

Optimization four: Reduce memory reallocation and GC when reading files

The DbfReader.java I provided above reads a file in these basic steps:

1. Read the entire file (except the file header) into a ByteBuffer (in fact, a direct buffer).

2. Read each row from the ByteBuffer into a byte[] array.

3. Wrap each byte[] array in a Record object (the Record class provides various methods for reading columns from the byte[]).
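The article does not show the Record class, so here is a minimal sketch of what step 3 could look like. Everything in it is an assumption made for illustration: the class name RecordSketch, the method names, and the use of US-ASCII (real market files may contain names in another encoding such as GBK). It relies only on the facts that DBF columns are fixed-width text and that the first byte of a row is the deletion flag.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the Record class; column offsets are supplied by the caller.
final class RecordSketch {
    private final byte[] row;

    RecordSketch(byte[] row) {
        this.row = row; // wraps the array without copying it
    }

    /** True when the row is flagged as deleted ('*' in the first byte). */
    boolean isDeleted() {
        return row[0] == '*';
    }

    /** Returns the column at [offset, offset + length) as a trimmed string. */
    String getString(int offset, int length) {
        return new String(row, offset, length, StandardCharsets.US_ASCII).trim();
    }

    /** Parses a numeric column stored as padded text; blank columns yield 0. */
    double getDouble(int offset, int length) {
        String s = getString(offset, length);
        return s.isEmpty() ? 0.0 : Double.parseDouble(s);
    }
}
```

Because the object only wraps the byte[] and decodes a column when asked, constructing it is cheap; the cost moves to the column accessors.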

See the following loadRecordsWithoutDel and loadData methods:

    private List<Record> loadRecordsWithoutDel() throws IOException {
        ByteBuffer bb = loadData(getDataIndex(), getCount() * getRecordLength());
        List<Record> rds = new ArrayList<Record>(getCount());
        for (int i = 0; i < getCount(); i++) {
            byte[] b = new byte[getRecordLength()]; // a new array for every row
            bb.get(b);
            if ((char) b[0] != '*') { // skip rows flagged as deleted
                Record r = new Record(b);
                rds.add(r);
            }
        }
        bb.clear();
        return rds;
    }

    private ByteBuffer loadData(int offset, int length) throws IOException {
        // return fc.map(MapMode.READ_ONLY, offset, length).load();
        ByteBuffer b = ByteBuffer.allocateDirect(length); // a new direct buffer for every read
        fc.position(offset);
        fc.read(b);
        b.rewind();
        return b;
    }
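The read loop above depends on three values: the record count, the offset of the first record, and the record length. The article does not show how DbfReader obtains them. In the standard DBF header they sit at fixed positions as little-endian integers: the record count at bytes 4 to 7, the header length at bytes 8 to 9, and the record length at bytes 10 to 11. The sketch below reads them under the assumption that the data offset equals the header length; the class name DbfHeaderSketch and its field names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical holder for the three header values the read loop relies on.
final class DbfHeaderSketch {
    final int count;        // number of records in the file
    final int dataIndex;    // byte offset of the first record (assumed equal to the header length)
    final int recordLength; // size of one record in bytes, including the deletion flag

    /** Expects a buffer whose first 12 bytes are the start of the DBF file. */
    DbfHeaderSketch(ByteBuffer header) {
        // duplicate() leaves the caller's position and byte order untouched
        ByteBuffer h = header.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        this.count = h.getInt(4);
        this.dataIndex = h.getShort(8) & 0xFFFF;     // unsigned 16-bit value
        this.recordLength = h.getShort(10) & 0xFFFF; // unsigned 16-bit value
    }
}
```

Since the format of the market file does not change between refreshes, dataIndex and recordLength stay constant, and only count may need to be read again.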

Consider how our system is actually used: the market DBF file is refreshed every few seconds, its size is basically the same after each refresh, the format is exactly the same, and every row has the same size.

Note that the code above creates the ByteBuffer (the ByteBuffer.allocateDirect(length) call) and the byte arrays (new byte[getRecordLength()]) over and over again. In our scenario we can cache and reuse them instead. A market file has more than 5,000 rows, so avoiding that many allocations and the resulting GC is certainly good for performance.

I added a CacheManager class to do the work:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    public class CacheManager {
        private ByteBuffer byteBuffer = null;
        private int bufSize = 0;

        private List<byte[]> byteArrayList = null;
        private int bytesSize = 0;

        public CacheManager() {
        }

        public ByteBuffer getByteBuffer(int size) {
            if (this.bufSize < size) {
                // allocate some extra to avoid reallocating next time
                byteBuffer = ByteBuffer.allocateDirect(size + 1024 * 8);
                this.bufSize = size + 1024 * 8;
            }
            byteBuffer.clear();
            return byteBuffer;
        }

        // rowNum is the number of rows, i.e. the number of byte[] required;
        // byteLength is the size of each byte array
        public List<byte[]> getByteArrayList(int rowNum, int byteLength) {
            if (this.bytesSize != byteLength) {
                byteArrayList = new ArrayList<byte[]>();
                this.bytesSize = byteLength;
            }

            if (byteArrayList.size() < rowNum) {
                int shouldAddRowCount = rowNum - byteArrayList.size() + 100; // allocate 100 extra rows
                for (int i = 0; i < shouldAddRowCount; i++) {
                    byteArrayList.add(new byte[bytesSize]);
                }
            }

            return byteArrayList;
        }
    }

CacheManager manages a reusable ByteBuffer and a reusable list of byte[] arrays.

The getByteBuffer method returns the cached ByteBuffer. The ByteBuffer is recreated only if the cached one is smaller than the requested size. (To avoid this, we always allocate a somewhat larger ByteBuffer than we actually need.)

The getByteArrayList method returns the cached list of byte[] arrays. If the cache holds fewer byte[] arrays than needed, more are created; if the length of the cached byte[] arrays differs from what is needed, all of them are recreated (this should not happen, because the size of each row does not change; the code is there just in case).

Rework loadRecordsWithoutDel into recordsWithoutDel_efficiently, which uses the caching mechanism:

    public List<byte[]> recordsWithoutDel_efficiently(CacheManager cacheManager) throws IOException {
        ByteBuffer bb = cacheManager.getByteBuffer(getCount() * getRecordLength());
        fc.position(getDataIndex());
        fc.read(bb);
        bb.rewind();
        List<byte[]> rds = new ArrayList<byte[]>(getCount());
        List<byte[]> byteArrayList = cacheManager.getByteArrayList(getCount(), getRecordLength());
        for (int i = 0; i < getCount(); i++) {
            byte[] b = byteArrayList.get(i); // reuse the cached array for row i
            bb.get(b);
            if ((char) b[0] != '*') { // skip rows flagged as deleted
                rds.add(b);
            }
        }
        bb.clear();
        return rds;
    }

In the new recordsWithoutDel_efficiently, we take the cached ByteBuffer and the cached byte[] arrays from the CacheManager instead of allocating them from the system, which reduces repeated memory allocation and GC. (Also, recordsWithoutDel_efficiently returns a list of byte[] arrays directly, not a list of Record objects.)
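One consequence of this design is that the returned byte[] arrays are shared with the cache: the next refresh overwrites them in place. A caller therefore has to finish converting the rows, or copy the ones it wants to keep, before the next read. The sketch below demonstrates this with a tiny stand-in pool; RowReuseSketch and its methods are hypothetical names for illustration and are not the CacheManager itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the byte[] cache, used only to show the aliasing effect.
final class RowReuseSketch {
    private final List<byte[]> pool = new ArrayList<>();

    /** Returns the pooled array for row i, creating arrays only on first use. */
    byte[] rowBuffer(int i, int recordLength) {
        while (pool.size() <= i) {
            pool.add(new byte[recordLength]);
        }
        return pool.get(i);
    }

    /** Copies a row out of the pool so that it survives the next refresh. */
    static byte[] snapshot(byte[] pooledRow) {
        return pooledRow.clone();
    }
}
```

Asking for rowBuffer(0, n) twice returns the same array instance, so data written during the second refresh is visible through a reference obtained during the first. In the scenario described here, where every row is converted into a market data object right after the read, this is acceptable; a copy made with snapshot is only needed for rows that must outlive the refresh.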

In my tests, optimization four, that is, using the cache, reduced the read time from about 5 ms to about 2 ms, an improvement of more than 2x.

At this point we have only finished reading the file into memory. Next, we create a market data object for each row, reading each column from the byte[]. I found that this takes much more time than reading the file: without optimization, converting more than 5,000 rows took over 70 ms. That is the subject of the optimizations introduced next.

To be continued...

Original article by Binhua Liu. When reprinting, please cite the original address: http://www.cnblogs.com/Binhua-Liu/p/5615299.html
