Currently the most popular version of the J2SDK is the 1.3 series. Developers who use this version have to rely on the RandomAccessFile class for random file access. Its I/O performance lags far behind that of comparable classes in other common development languages, which seriously hurts the running efficiency of programs.
Developers urgently need better efficiency. Below we analyze the source code of RandomAccessFile and other file classes, find the crux of the problem, and then improve and optimize it, creating a random file access class with a good price/performance ratio: BufferedRandomAccessFile.
Before improving anything, run a basic test: copy a 12 MB file byte by byte (this involves both reading and writing).
| Read | Write | Elapsed time (seconds) |
| --- | --- | --- |
| RandomAccessFile | RandomAccessFile | 95.848 |
| BufferedInputStream + DataInputStream | BufferedOutputStream + DataOutputStream | 2.935 |
The two differ by a factor of about 32: RandomAccessFile is far too slow. First, let's look at the key parts of the source code of both, compare and analyze them, and find the reason.
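The byte-by-byte copy test described above could be reproduced roughly as follows. This is only a sketch under assumptions: the class name `ByteCopyBenchmark`, the helper names, and the 1 MB test file are illustrative and are not the author's original test program, and timings will vary widely between machines and JDK versions.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;

// Hypothetical benchmark harness; names and sizes are assumptions for illustration.
class ByteCopyBenchmark {

    /** Copies src to dst one byte at a time with RandomAccessFile; returns elapsed ms. */
    static long copyWithRandomAccessFile(File src, File dst) {
        long start = System.currentTimeMillis();
        try (RandomAccessFile in = new RandomAccessFile(src, "r");
             RandomAccessFile out = new RandomAccessFile(dst, "rw")) {
            out.setLength(0);                 // start from an empty target
            int b;
            while ((b = in.read()) != -1) {   // one native read per byte
                out.write(b);                 // one native write per byte
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return System.currentTimeMillis() - start;
    }

    /** Same copy through buffered data streams; returns elapsed ms. */
    static long copyWithBufferedStreams(File src, File dst) {
        long start = System.currentTimeMillis();
        try (DataInputStream in = new DataInputStream(
                 new BufferedInputStream(new FileInputStream(src)));
             DataOutputStream out = new DataOutputStream(
                 new BufferedOutputStream(new FileOutputStream(dst)))) {
            int b;
            while ((b = in.read()) != -1) {   // served from the in-memory buffer
                out.write(b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return System.currentTimeMillis() - start;
    }

    /** Creates a temporary file holding `size` bytes of a repeating pattern. */
    static File makeTestFile(int size) {
        try {
            File f = File.createTempFile("bytecopy", ".bin");
            f.deleteOnExit();
            byte[] data = new byte[size];
            for (int i = 0; i < size; i++) {
                data[i] = (byte) (i * 31);
            }
            Files.write(f.toPath(), data);
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True when both files hold exactly the same bytes. */
    static boolean sameContent(File a, File b) {
        try {
            return Arrays.equals(Files.readAllBytes(a.toPath()),
                                 Files.readAllBytes(b.toPath()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File src = makeTestFile(1 << 20);     // 1 MB keeps the slow case bearable
        File dst = makeTestFile(0);
        System.out.println("RandomAccessFile: " + copyWithRandomAccessFile(src, dst) + " ms");
        System.out.println("Buffered streams: " + copyWithBufferedStreams(src, dst) + " ms");
        System.out.println("copies match: " + sameContent(src, dst));
    }
}
```

The sketch uses try-with-resources and `java.nio.file.Files`, so it needs Java 8 or later rather than the JDK 1.3 discussed in the article.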
1.1. [RandomAccessFile]

    public class RandomAccessFile implements DataOutput, DataInput {
        public final byte readByte() throws IOException {
            int ch = this.read();
            if (ch < 0)
                throw new EOFException();
            return (byte)(ch);
        }
        public native int read() throws IOException;
        public final void writeByte(int v) throws IOException {
            write(v);
        }
        public native void write(int b) throws IOException;
    }
As you can see, RandomAccessFile performs one disk I/O operation for every byte it reads or writes.
1.2. [BufferedInputStream]

    public class BufferedInputStream extends FilterInputStream {
        private static int defaultBufferSize = 2048;
        protected byte buf[]; // the read buffer
        public BufferedInputStream(InputStream in, int size) {
            super(in);
            if (size <= 0) {
                throw new IllegalArgumentException("Buffer size <= 0");
            }
            buf = new byte[size];
        }
        public synchronized int read() throws IOException {
            ensureOpen();
            if (pos >= count) {
                fill();
                if (pos >= count)
                    return -1;
            }
            return buf[pos++] & 0xff; // read directly from buf[]
        }
        private void fill() throws IOException {
            if (markpos < 0)
                pos = 0;                    /* no mark: throw away the buffer */
            else if (pos >= buf.length)     /* no room left in buffer */
                if (markpos > 0) {          /* can throw away early part of the buffer */
                    int sz = pos - markpos;
                    System.arraycopy(buf, markpos, buf, 0, sz);
                    pos = sz;
                    markpos = 0;
                } else if (buf.length >= marklimit) {
                    markpos = -1;           /* buffer got too big, invalidate mark */
                    pos = 0;                /* drop buffer contents */
                } else {                    /* grow buffer */
                    int nsz = pos * 2;
                    if (nsz > marklimit)
                        nsz = marklimit;
                    byte nbuf[] = new byte[nsz];
                    System.arraycopy(buf, 0, nbuf, 0, pos);
                    buf = nbuf;
                }
            count = pos;
            int n = in.read(buf, pos, buf.length - pos);
            if (n > 0)
                count = n + pos;
        }
    }
1.3. [BufferedOutputStream]

    public class BufferedOutputStream extends FilterOutputStream {
        protected byte buf[]; // the write buffer
        public BufferedOutputStream(OutputStream out, int size) {
            super(out);
            if (size <= 0) {
                throw new IllegalArgumentException("Buffer size <= 0");
            }
            buf = new byte[size];
        }
        public synchronized void write(int b) throws IOException {
            if (count >= buf.length) {
                flushBuffer();
            }
            buf[count++] = (byte)b; // write directly into buf[]
        }
        private void flushBuffer() throws IOException {
            if (count > 0) {
                out.write(buf, 0, count);
                count = 0;
            }
        }
    }
As you can see, for every byte read or written, the buffered input/output streams first check whether the data is already in buf[]. If it is, they read or write buf[] in memory directly. If it is not, they first fill buf[] from the corresponding location on disk (or flush it to disk) and then operate on buf[] in memory. The vast majority of read/write operations are therefore operations on the in-memory buf[].
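One way to see this effect is to count how often a BufferedInputStream actually goes back to its underlying stream. The sketch below is illustrative only: `CountingReads` and `CountingInputStream` are hypothetical names, and an in-memory ByteArrayInputStream stands in for the disk.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the JDK or of the article's code.
class CountingReads {

    /** Wraps a stream and counts how many read requests the layer above sends to it. */
    static final class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /**
     * Reads totalBytes one byte at a time through a BufferedInputStream and
     * returns how many requests reached the underlying (counted) stream.
     */
    static int underlyingCalls(int totalBytes, int bufferSize) {
        CountingInputStream source =
            new CountingInputStream(new ByteArrayInputStream(new byte[totalBytes]));
        try (InputStream in = new BufferedInputStream(source, bufferSize)) {
            int n = 0;
            while (in.read() != -1) {   // almost always answered from buf[]
                n++;
            }
            if (n != totalBytes) {
                throw new IllegalStateException("read " + n + " bytes");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        System.out.println("10000 single-byte reads caused "
            + underlyingCalls(10000, 2048) + " requests to the underlying stream");
    }
}
```

With a 2048-byte buffer, the 10000 single-byte reads should translate into only a handful of requests to the underlying stream, which is the saving the buffered classes rely on.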
1.4. Summary
Memory access time is on the order of nanoseconds (10^-9 s), while disk access time is on the order of milliseconds (10^-3 s), so for a single equivalent operation memory is roughly a million times faster than disk. In theory, even tens of thousands of in-memory operations should take far less time than one disk I/O. Clearly the buffered classes reduce disk I/O overhead by adding an in-memory buf, which improves access efficiency at the cost of some extra overhead for managing buf. In the test above, access efficiency improved about 32-fold.
Following the conclusion of 1.4, we now try to add a buffered read/write mechanism to the RandomAccessFile class.
The random access class differs from the sequential classes: the former is built by implementing the DataInput/DataOutput interfaces, while the latter are built by extending FilterInputStream/FilterOutputStream, so their code cannot simply be copied over.
2.1. Allocate a buffer buf[] (default: 1024 bytes), used as a shared buffer for both reading and writing.
2.2. Implement read buffering first.
Basic principle of the read buffering logic:
A. We want to read the byte at position pos of the file.
B. Check whether that position is in buf. If so, read it directly from buf and return the byte.
C. If not, remap buf to the block containing pos, fill the buffer with the bufsize bytes of file content around that position, and go back to B.
The following is a key part of the code and its description:
    public class BufferedRandomAccessFile extends RandomAccessFile {
        // byte read(long pos): reads the byte at position pos of the current file.
        // bufstartpos and bufendpos are the first/last file offsets that buf[] currently maps.
        // curpos is the offset of this class's current file pointer.
        public byte read(long pos) throws IOException {
            if (pos < this.bufstartpos || pos > this.bufendpos) {
                this.flushbuf();
                this.seek(pos);
                if ((pos < this.bufstartpos) || (pos > this.bufendpos))
                    throw new IOException();
            }
            this.curpos = pos;
            return this.buf[(int)(pos - this.bufstartpos)];
        }

        // void flushbuf(): if bufdirty is true, writes the data in buf[] that has
        // not yet reached the disk out to the disk.
        private void flushbuf() throws IOException {
            if (this.bufdirty == true) {
                if (super.getFilePointer() != this.bufstartpos) {
                    super.seek(this.bufstartpos);
                }
                super.write(this.buf, 0, this.bufusedsize);
                this.bufdirty = false;
            }
        }

        // void seek(long pos): moves the file pointer to pos and remaps buf[]
        // to the file block that contains pos.
        public void seek(long pos) throws IOException {
            if ((pos < this.bufstartpos) || (pos > this.bufendpos)) { // seek pos not in buf
                this.flushbuf();
                if ((pos >= 0) && (pos <= this.fileendpos) && (this.fileendpos != 0)) {
                    // seek pos in file (file length > 0):
                    // round pos down to the start of its bufsize-aligned block
                    this.bufstartpos = pos / this.bufsize * this.bufsize;
                    this.bufusedsize = this.fillbuf();
                } else if (((pos == 0) && (this.fileendpos == 0))
                        || (pos == this.fileendpos + 1)) {
                    // seek pos is append pos
                    this.bufstartpos = pos;
                    this.bufusedsize = 0;
                }
                this.bufendpos = this.bufstartpos + this.bufsize - 1;
            }
            this.curpos = pos;
        }

        // int fillbuf(): fills buf[] starting from bufstartpos.
        private int fillbuf() throws IOException {
            super.seek(this.bufstartpos);
            this.bufdirty = false;
            return super.read(this.buf);
        }
    }
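The listing above is an excerpt and omits the field declarations and constructor. A minimal self-contained sketch of the same read-buffering idea (steps A to C) might look like the following. The class name `BufferedReadSketch`, the method `readAt`, the `fills` counter, and the constructor are assumptions made for illustration; only the field names `buf`, `bufsize`, `bufstartpos`, and `bufusedsize` follow the article. It is read-only and simplified, not the author's full class.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical read-only sketch of block-buffered random access.
class BufferedReadSketch extends RandomAccessFile {
    private final int bufsize;
    private final byte[] buf;
    private long bufstartpos = 0;   // file offset mapped to buf[0]
    private int bufusedsize = 0;    // valid bytes in buf[]; 0 = nothing loaded yet
    int fills = 0;                  // how many times buf[] was loaded from disk

    BufferedReadSketch(File file, int bufsize) throws IOException {
        super(file, "r");
        this.bufsize = bufsize;
        this.buf = new byte[bufsize];
    }

    /** Returns the byte at pos as 0..255, or -1 if pos is at or past end of file. */
    int readAt(long pos) throws IOException {
        if (pos < 0) {
            throw new IllegalArgumentException("negative pos: " + pos);
        }
        if (pos < bufstartpos || pos >= bufstartpos + bufusedsize) { // step B: not in buf
            bufstartpos = pos / bufsize * bufsize;                   // step C: map pos's block
            bufusedsize = fillbuf();
            if (pos >= bufstartpos + bufusedsize) {
                return -1;                                           // pos is beyond the file
            }
        }
        return buf[(int) (pos - bufstartpos)] & 0xFF;                // served from memory
    }

    /** Loads up to bufsize bytes starting at bufstartpos; returns how many were read. */
    private int fillbuf() throws IOException {
        super.seek(bufstartpos);
        int filled = 0;
        while (filled < bufsize) {
            int n = super.read(buf, filled, bufsize - filled);
            if (n < 0) {
                break;                                               // end of file
            }
            filled += n;
        }
        fills++;
        return filled;
    }

    /** Reads a temp file of fileSize bytes sequentially; returns the number of disk fills. */
    static int countFills(int fileSize, int bufsize) {
        try {
            File f = File.createTempFile("readsketch", ".bin");
            f.deleteOnExit();
            byte[] data = new byte[fileSize];
            for (int i = 0; i < fileSize; i++) {
                data[i] = (byte) i;
            }
            Files.write(f.toPath(), data);
            try (BufferedReadSketch in = new BufferedReadSketch(f, bufsize)) {
                for (int pos = 0; pos < fileSize; pos++) {
                    if (in.readAt(pos) != (data[pos] & 0xFF)) {
                        throw new IllegalStateException("mismatch at " + pos);
                    }
                }
                return in.fills;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("disk fills for 5000 bytes: " + countFills(5000, 1024));
    }
}
```

Reading 5000 bytes sequentially through a 1024-byte buffer touches the disk once per block instead of once per byte, which is where the speed-up in the next table comes from.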
With read buffering basically implemented, copy a 12 MB file byte by byte (this involves both reading and writing; BufferedRandomAccessFile is used here to test the read speed):
| Read | Write | Elapsed time (seconds) |
| --- | --- | --- |
| RandomAccessFile | RandomAccessFile | 95.848 |
| BufferedRandomAccessFile | BufferedOutputStream + DataOutputStream | 2.813 |
| BufferedInputStream + DataInputStream | BufferedOutputStream + DataOutputStream | 2.935 |
As you can see, the speed improves significantly and is comparable to BufferedInputStream + DataInputStream.
2.3. Implement write buffering.
Basic principle of the write buffering logic:
A. We want to write a byte at position pos of the file.
B. Check whether that position is mapped in buf. If so, write directly into buf and return true.
C. If not, remap buf to the block containing pos, fill the buffer with the bufsize bytes of file content around that position, and go back to B.
Here is the key part of the code and its description:
    // boolean write(byte bw, long pos): writes byte bw at position pos of the current file.
    // Depending on pos and on where buf[] is mapped, the write may be a modification or
    // an append, and may fall inside or outside buf[]. Testing the most likely case
    // first speeds up the logic.
    // fileendpos: the end offset of the current file, tracked mainly to handle appends.
    public boolean write(byte bw, long pos) throws IOException {
        if ((pos >= this.bufstartpos) && (pos <= this.bufendpos)) { // write pos in buf
            this.buf[(int)(pos - this.bufstartpos)] = bw;
            this.bufdirty = true;
            if (pos == this.fileendpos + 1) { // write pos is append pos
                this.fileendpos++;
                this.bufusedsize++;
            }
        } else { // write pos not in buf
            this.seek(pos);
            if ((pos >= 0) && (pos <= this.fileendpos) && (this.fileendpos != 0)) {
                // write pos is modify file
                this.buf[(int)(pos - this.bufstartpos)] = bw;
            } else if (((pos == 0) && (this.fileendpos == 0))
                    || (pos == this.fileendpos + 1)) {
                // write pos is append pos
                this.buf[0] = bw;
                this.fileendpos++;
                this.bufusedsize = 1;
            } else {
                throw new IndexOutOfBoundsException();
            }
            this.bufdirty = true;
        }
        this.curpos = pos;
        return true;
    }
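To make the write path concrete, here is a small self-contained sketch of the same idea: bytes go into an in-memory block, and the block is written to disk only when another block is needed or the file is closed. It is a simplified illustration under assumptions, not the author's class: `BufferedWriteSketch`, `writeAt`, `filelen`, `diskWrites`, and the helper methods are hypothetical names, and the inherited RandomAccessFile methods are left unbuffered.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical write-buffering sketch; only writeAt() goes through buf[].
class BufferedWriteSketch extends RandomAccessFile {
    private final int bufsize;
    private final byte[] buf;
    private long bufstartpos = -1;  // file offset mapped to buf[0]; -1 = nothing mapped yet
    private int bufusedsize = 0;    // valid bytes in buf[]
    private boolean bufdirty = false;
    private long filelen;           // logical length, counting bytes still only in buf[]
    int diskWrites = 0;             // how many times buf[] was written to disk

    BufferedWriteSketch(File file, int bufsize) throws IOException {
        super(file, "rw");
        this.bufsize = bufsize;
        this.buf = new byte[bufsize];
        this.filelen = super.length();
    }

    /** Writes bw at pos; pos may modify an existing byte or append at the end. */
    void writeAt(byte bw, long pos) throws IOException {
        if (pos < 0 || pos > filelen) {
            throw new IndexOutOfBoundsException("pos " + pos);
        }
        if (bufstartpos < 0 || pos < bufstartpos || pos >= bufstartpos + bufsize) {
            flushbuf();                              // save the old block first
            bufstartpos = pos / bufsize * bufsize;   // remap buf[] to pos's block
            bufusedsize = fillbuf();
        }
        int index = (int) (pos - bufstartpos);
        buf[index] = bw;                             // in-memory write only
        bufdirty = true;
        if (pos == filelen) {                        // append: the file grows by one byte
            filelen++;
            bufusedsize = index + 1;
        }
    }

    /** Writes the buffered block to disk if it holds unsaved changes. */
    void flushbuf() throws IOException {
        if (bufdirty) {
            super.seek(bufstartpos);
            super.write(buf, 0, bufusedsize);
            diskWrites++;
            bufdirty = false;
        }
    }

    private int fillbuf() throws IOException {
        super.seek(bufstartpos);
        int filled = 0;
        while (filled < bufsize) {
            int n = super.read(buf, filled, bufsize - filled);
            if (n < 0) {
                break;                               // end of file
            }
            filled += n;
        }
        return filled;
    }

    @Override
    public void close() throws IOException {
        flushbuf();                                  // do not lose buffered bytes
        super.close();
    }

    static File newTempFile() {
        try {
            File f = File.createTempFile("writesketch", ".bin");
            f.deleteOnExit();
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static byte[] contents(File f) {
        try {
            return Files.readAllBytes(f.toPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends count bytes (value = low 8 bits of the index); returns the disk write count. */
    static int appendBytes(File file, int count, int bufsize) {
        try (BufferedWriteSketch out = new BufferedWriteSketch(file, bufsize)) {
            for (int i = 0; i < count; i++) {
                out.writeAt((byte) i, i);
            }
            out.flushbuf();
            return out.diskWrites;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File f = newTempFile();
        System.out.println("disk writes: " + appendBytes(f, 3000, 1024));
        System.out.println("file length: " + contents(f).length);
    }
}
```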
With write buffering basically implemented, copy a 12 MB file byte by byte (this involves both reading and writing; combined with read buffering, BufferedRandomAccessFile is used to test the read/write speed):
| Read | Write | Elapsed time (seconds) |
| --- | --- | --- |
| RandomAccessFile | RandomAccessFile | 95.848 |
| BufferedInputStream + DataInputStream | BufferedOutputStream + DataOutputStream | 2.935 |
| BufferedRandomAccessFile | BufferedOutputStream + DataOutputStream | 2.813 |
| BufferedRandomAccessFile | BufferedRandomAccessFile | 2.453 |
As you can see, the combined read/write speed already surpasses BufferedInputStream/BufferedOutputStream + DataInputStream/DataOutputStream.
Optimizing BufferedRandomAccessFile
Optimization principles:
- Frequently called statements need optimizing the most, and optimizing them has the most visible effect.
- When several conditions are nested, the most likely case should be tested in the outermost layer.
- Avoid unnecessary new (object allocation).
Here is a typical example:
    public void seek(long pos) throws IOException {
        ...
        // round pos down to the start of its bufsize-aligned block
        this.bufstartpos = pos / this.bufsize * this.bufsize;
        // bufbitlen is the bit length of buf[]; e.g. if bufsize = 1024, then bufbitlen = 10.
        ...
    }

The seek function is used by every other function and is called very frequently. The statement above uses pos and bufsize to determine which file location buf[] should map to, and using "*" and "/" for this is definitely not a good approach.
Optimization one: this.bufstartpos = (pos >> bufbitlen) << bufbitlen;
Optimization two: this.bufstartpos = pos & bufmask; // this.bufmask = ~((long)this.bufsize - 1);
Both are more efficient than the original, but the latter is clearly better: the former needs two shift operations, while the latter needs only one bitwise AND (bufmask can be computed in advance). Both rely on bufsize being a power of two.
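The three ways of rounding pos down to a block boundary can be checked against each other with a few lines of code. This is just a sketch: the class `BlockAlign` and its method names are made up for illustration, and the equivalence holds for non-negative pos with a power-of-two bufsize.

```java
// Hypothetical helper class comparing the three alignment expressions.
class BlockAlign {

    /** Original form: integer division followed by multiplication. */
    static long byDivision(long pos, int bufsize) {
        return pos / bufsize * bufsize;
    }

    /** Optimization one: shift right to drop the low bits, then shift back. */
    static long byShift(long pos, int bufbitlen) {
        return (pos >> bufbitlen) << bufbitlen;
    }

    /** Optimization two: clear the low bits with a precomputed mask. */
    static long byMask(long pos, long bufmask) {
        return pos & bufmask;
    }

    /** Mask with the low log2(bufsize) bits cleared; bufsize must be a power of two. */
    static long maskFor(int bufsize) {
        return ~((long) bufsize - 1);
    }

    public static void main(String[] args) {
        int bufsize = 1024;
        int bufbitlen = 10;                 // 2^10 = 1024
        long bufmask = maskFor(bufsize);
        for (long pos : new long[] {0, 1, 1023, 1024, 5000, 12_000_000}) {
            System.out.println(pos + " -> " + byDivision(pos, bufsize)
                + " / " + byShift(pos, bufbitlen)
                + " / " + byMask(pos, bufmask));
        }
    }
}
```

For example, with bufsize = 1024, position 5000 maps to block start 4096 under all three expressions.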
With the optimization basically implemented, copy a 12 MB file byte by byte (this involves both reading and writing; combined with read buffering, the optimized BufferedRandomAccessFile is used to test the read/write speed):
| Read | Write | Elapsed time (seconds) |
| --- | --- | --- |
| RandomAccessFile | RandomAccessFile | 95.848 |
| BufferedInputStream + DataInputStream | BufferedOutputStream + DataOutputStream | 2.935 |
| BufferedRandomAccessFile | BufferedOutputStream + DataOutputStream | 2.813 |
| BufferedRandomAccessFile | BufferedRandomAccessFile | 2.453 |
| BufferedRandomAccessFile (optimized) | BufferedRandomAccessFile (optimized) | 2.197 |
As you can see, the optimization is not dramatic, but it is still somewhat faster than before; the effect may be more noticeable on older machines.
The comparisons above all use sequential access. Even with random access, however, most operations touch more than one byte, so the buffering mechanism remains effective. By contrast, it is generally not easy to make the sequential access classes support random access.
Remaining improvements
Provide a file append function:

    public boolean append(byte bw) throws IOException {
        return this.write(bw, this.fileendpos + 1);
    }
Provide a function that modifies the byte at the current file position:

    public boolean write(byte bw) throws IOException {
        return this.write(bw, this.curpos);
    }
Return the file length (because reads and writes go through buf, this differs from the original RandomAccessFile class):

    public long length() throws IOException {
        return this.max(this.fileendpos + 1, this.initfilelen);
    }
Return the current file pointer (because reads and writes go through buf, this differs from the original RandomAccessFile class):

    public long getFilePointer() throws IOException {
        return this.curpos;
    }
Provide buffered writing of multiple bytes at the current position:

    public void write(byte b[], int off, int len) throws IOException {
        long writeendpos = this.curpos + len - 1;
        if (writeendpos <= this.bufendpos) { // b[] in cur buf
            System.arraycopy(b, off, this.buf,
                    (int)(this.curpos - this.bufstartpos), len);
            this.bufdirty = true;
            this.bufusedsize = (int)(writeendpos - this.bufstartpos + 1);
        } else { // b[] not in cur buf
            super.seek(this.curpos);
            super.write(b, off, len);
        }
        if (writeendpos > this.fileendpos)
            this.fileendpos = writeendpos;
        this.seek(writeendpos + 1);
    }

    public void write(byte b[]) throws IOException {
        this.write(b, 0, b.length);
    }
Provide buffered reading of multiple bytes at the current position, together with setLength and close:

    public int read(byte b[], int off, int len) throws IOException {
        long readendpos = this.curpos + len - 1;
        if (readendpos <= this.bufendpos && readendpos <= this.fileendpos) { // read in buf
            System.arraycopy(this.buf, (int)(this.curpos - this.bufstartpos),
                    b, off, len);
        } else { // read b[] size > buf[]
            if (readendpos > this.fileendpos) { // read b[] part in file
                len = (int)(this.length() - this.curpos + 1);
            }
            super.seek(this.curpos);
            len = super.read(b, off, len);
            readendpos = this.curpos + len - 1;
        }
        this.seek(readendpos + 1);
        return len;
    }

    public int read(byte b[]) throws IOException {
        return this.read(b, 0, b.length);
    }

    public void setLength(long newLength) throws IOException {
        if (newLength > 0) {
            this.fileendpos = newLength - 1;
        } else {
            this.fileendpos = 0;
        }
        super.setLength(newLength);
    }

    public void close() throws IOException {
        this.flushbuf();
        super.close();
    }
With these improvements done, try the new multi-byte read/write functions: copy a 12 MB file by reading and writing 1024 bytes at a time (this involves both reading and writing; the completed BufferedRandomAccessFile is used to test the read/write speed):
| Read | Write | Elapsed time (seconds) |
| --- | --- | --- |
| RandomAccessFile | RandomAccessFile | 95.848 |
| BufferedInputStream + DataInputStream | BufferedOutputStream + DataOutputStream | 2.935 |
| BufferedRandomAccessFile | BufferedOutputStream + DataOutputStream | 2.813 |
| BufferedRandomAccessFile | BufferedRandomAccessFile | 2.453 |
| BufferedRandomAccessFile (optimized) | BufferedRandomAccessFile (optimized) | 2.197 |
| BufferedRandomAccessFile (completed) | BufferedRandomAccessFile (completed) | 0.401 |
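The 1024-bytes-at-a-time copy used for the last row can be sketched as below. Since BufferedRandomAccessFile is not part of the JDK, this illustration uses plain RandomAccessFile with its multi-byte read and write methods; the class `ChunkedCopy` and its helpers are hypothetical names, and this is not the author's test program.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;

// Hypothetical sketch of a block-wise file copy.
class ChunkedCopy {

    /** Copies src to dst in blocks of `chunk` bytes; returns the number of bytes copied. */
    static long copy(File src, File dst, int chunk) {
        byte[] block = new byte[chunk];
        long total = 0;
        try (RandomAccessFile in = new RandomAccessFile(src, "r");
             RandomAccessFile out = new RandomAccessFile(dst, "rw")) {
            out.setLength(0);                       // start from an empty target
            int n;
            while ((n = in.read(block)) != -1) {    // n may be smaller for the last block
                out.write(block, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Creates a temporary file with `size` bytes of patterned data. */
    static File tempFileWith(int size) {
        try {
            File f = File.createTempFile("chunked", ".bin");
            f.deleteOnExit();
            byte[] data = new byte[size];
            for (int i = 0; i < size; i++) {
                data[i] = (byte) (i * 7);
            }
            Files.write(f.toPath(), data);
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True when both files hold exactly the same bytes. */
    static boolean sameContent(File a, File b) {
        try {
            return Arrays.equals(Files.readAllBytes(a.toPath()),
                                 Files.readAllBytes(b.toPath()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File src = tempFileWith(12 * 1024 * 1024);  // 12 MB, as in the article's test
        File dst = tempFileWith(0);
        long start = System.currentTimeMillis();
        long copied = copy(src, dst, 1024);
        System.out.println("copied " + copied + " bytes in "
            + (System.currentTimeMillis() - start) + " ms");
    }
}
```

Moving a whole block per call cuts the number of method calls and buffer checks by roughly the block size, which is consistent with the large drop in the last row of the table.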
Comparison with the new JDK 1.4 classes MappedByteBuffer + RandomAccessFile
JDK 1.4 provides the NIO classes. Among them, the MappedByteBuffer class is used for mapped buffers and can also map a file for random access; evidently the Java designers saw the RandomAccessFile problem too and improved on it. How do we copy a file with MappedByteBuffer + RandomAccessFile? Here is the main part of the test program:
    RandomAccessFile rafi = new RandomAccessFile(srcFile, "r");
    RandomAccessFile rafo = new RandomAccessFile(desFile, "rw");
    FileChannel fci = rafi.getChannel();
    FileChannel fco = rafo.getChannel();
    long size = fci.size();
    MappedByteBuffer mbbi = fci.map(FileChannel.MapMode.READ_ONLY, 0, size);
    MappedByteBuffer mbbo = fco.map(FileChannel.MapMode.READ_WRITE, 0, size);
    long start = System.currentTimeMillis();
    for (int i = 0; i < size; i++) {
        byte b = mbbi.get(i);
        mbbo.put(i, b);
    }
    fci.close();
    fco.close();
    rafi.close();
    rafo.close();
    // elapsed milliseconds converted to seconds
    System.out.println("Spend: " + (double)(System.currentTimeMillis() - start) / 1000 + "s");
Try the mapped buffer read/write of JDK 1.4 by copying a 12 MB file byte by byte (this involves both reading and writing):
| Read | Write | Elapsed time (seconds) |
| --- | --- | --- |
| RandomAccessFile | RandomAccessFile | 95.848 |
| BufferedInputStream + DataInputStream | BufferedOutputStream + DataOutputStream | 2.935 |
| BufferedRandomAccessFile | BufferedOutputStream + DataOutputStream | 2.813 |
| BufferedRandomAccessFile | BufferedRandomAccessFile | 2.453 |
| BufferedRandomAccessFile (optimized) | BufferedRandomAccessFile (optimized) | 2.197 |
| BufferedRandomAccessFile (completed) | BufferedRandomAccessFile (completed) | 0.401 |
| MappedByteBuffer + RandomAccessFile | MappedByteBuffer + RandomAccessFile | 1.209 |
That is quite good; JDK 1.4 appears to be a big step forward from 1.3. If you develop software with version 1.4 in the future and need random access to files, the MappedByteBuffer + RandomAccessFile approach is recommended. However, the great majority of programs in use today were developed with JDK 1.3 or earlier. If you have written a Java program that uses the RandomAccessFile class for random file access, and you worry that its poor performance will draw complaints from users, try the BufferedRandomAccessFile class provided in this article. There is no need to tear everything down and rewrite it: just import this class and change every RandomAccessFile to BufferedRandomAccessFile, and your program's performance should improve greatly. That is all you have to do.
Reprinted from: http://www.ibm.com/developerworks/cn/java/l-javaio/
Improving I/O performance with buffering by extending the RandomAccessFile class (reprint)