Reading large files line by line with Java NIO

Source: Internet
Author: User

While working on a project, I had to parse TXT files larger than 100 MB and load them into a database. The approach I had used before, FileInputStream with BufferedReader, was clearly not up to the job. Although readLine() reads a file directly line by line, reading a file of about 140 MB with roughly 680,000 rows was not only slow but also ran out of memory: the out-of-memory error appeared before all 680,000 rows had been read. So we have to use the classes and methods of the NIO packages instead.

The classes involved are: a byte buffer (java.nio.ByteBuffer); a channel for reading, writing, mapping, and manipulating files (java.nio.channels.FileChannel); a character set for decoding text (java.nio.charset.Charset); and a class that supports reading and writing random access files (java.io.RandomAccessFile).
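
As a rough illustration of how these four classes fit together, the sketch below opens a file through RandomAccessFile, reads one block through its FileChannel into a ByteBuffer, and decodes the bytes with a Charset. The class name NioReadDemo, the helper readFirstChunk, and the sample text are assumptions made for this example; they are not part of the original project.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class NioReadDemo {
    // Hypothetical helper: reads up to `capacity` bytes from the start of
    // the file and decodes them with the given charset.
    static String readFirstChunk(Path file, int capacity, Charset charset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r");
             FileChannel fc = raf.getChannel()) {
            ByteBuffer buf = ByteBuffer.allocate(capacity);
            if (fc.read(buf) == -1) {
                return "";                // empty file: nothing to decode
            }
            buf.flip();                   // limit = bytes read, position = 0
            return charset.decode(buf).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("nio-demo", ".txt");
        try {
            Files.write(tmp, "first line\r\nsecond line\r\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(readFirstChunk(tmp, 10, StandardCharsets.UTF_8));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Only the requested block is held in memory here, which is the property the rest of the article relies on: memory use depends on the buffer size, not on the file size.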

The idea is to set up two buffers, one large and one small. The large buffer holds the bytes fetched from the file on each read; the small buffer holds one row of data (make sure it is big enough for the longest line in the text). While reading, check whether each byte is the line-break character 13 (carriage return). If it is, return the row collected so far; if not, keep reading until the end of the file.

Implementation:

// raf is an open java.io.RandomAccessFile
FileChannel fc = raf.getChannel();

The block buffer sets how many bytes are read from the file at a time:
ByteBuffer fbb = ByteBuffer.allocate(1024 * 5);
fc.read(fbb);
fbb.flip();

The line buffer holds the bytes of one line; size it according to your actual data:

ByteBuffer bb = ByteBuffer.allocate(500);


Check whether another line is available:

// eof is a boolean field that readByte() sets at the end of the file
public boolean hasNext() throws IOException {
    if (eof) return false;
    while (true) {
        // Block buffer exhausted: refill it, and stop at the end of the file.
        if (fbb.position() == fbb.limit()) {
            if (readByte() == 0) break;
        }
        // Line buffer full: return what we have so the caller can merge the pieces.
        if (bb.position() == bb.limit()) return true;
        byte a = fbb.get();
        if (a == 13) return true;      // carriage return: the line is complete
        if (a != 10) bb.put(a);        // drop the line feed of a CRLF pair
    }
    return bb.position() > 0;          // last line without a trailing line break
}

private int readByte() throws IOException {
    // clear() makes the buffer ready for a new channel read: it sets the
    // position to zero and the limit to the capacity. A rewind() before it
    // would be redundant.
    fbb.clear();
    if (fc.read(fbb) == -1) {
        eof = true;
        return 0;
    } else {
        // flip() makes the buffer ready for relative get operations: it sets
        // the limit to the current position and then sets the position to zero.
        fbb.flip();
        // After flip() the position is always 0, so report the limit,
        // which is the number of bytes just read.
        return fbb.limit();
    }
}
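
Because clear(), flip(), and rewind() are easy to confuse, the following small sketch records a buffer's position and limit after each call. The class name BufferStateDemo and its methods are made up for this illustration; the ByteBuffer calls themselves are the standard java.nio ones.

```java
import java.nio.ByteBuffer;

class BufferStateDemo {
    static String state(ByteBuffer b) {
        return "position=" + b.position() + " limit=" + b.limit();
    }

    // Returns the buffer state after each step so the effect of every call is visible.
    static String[] trace() {
        ByteBuffer b = ByteBuffer.allocate(8);
        String[] out = new String[6];
        out[0] = state(b);            // new buffer:      position=0 limit=8
        b.put((byte) 'a').put((byte) 'b').put((byte) 'c');
        out[1] = state(b);            // after 3 puts:    position=3 limit=8
        b.flip();
        out[2] = state(b);            // ready to read:   position=0 limit=3
        b.get();
        b.get();
        out[3] = state(b);            // after 2 gets:    position=2 limit=3
        b.rewind();
        out[4] = state(b);            // reread same data: position=0 limit=3
        b.clear();
        out[5] = state(b);            // ready to refill: position=0 limit=8
        return out;
    }

    public static void main(String[] args) {
        String[] steps = {"allocate", "put x3", "flip", "get x2", "rewind", "clear"};
        String[] states = trace();
        for (int i = 0; i < steps.length; i++) {
            System.out.println(steps[i] + ": " + states[i]);
        }
    }
}
```

Note that the position is 0 immediately after flip(), so a refill routine that wants to report how many bytes arrived has to look at the limit (or remaining()), not the position.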

public byte[] next() {
    bb.flip();
    // Returning a byte array matters here: if a row was split, the caller can
    // merge the pieces before decoding. Otherwise, when the buffer limit is
    // reached, a two-byte Chinese character may be cut in half and the text
    // will display incorrectly.
    byte[] tm = Arrays.copyOfRange(bb.array(), bb.position(), bb.limit());
    bb.clear();
    return tm;
}
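
For reference, the fragments above can be assembled into a single class. This is a minimal sketch rather than the author's original class: the name NioLineReader, its constructor parameters, and the collect helper are assumptions made for illustration. It also assumes rows end in CR or CRLF (the LF byte 10 is dropped), and a row longer than the line buffer comes back in several pieces that the caller has to merge before decoding.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NioLineReader implements AutoCloseable {
    private final RandomAccessFile raf;
    private final FileChannel fc;
    private final ByteBuffer fbb;   // block buffer: bytes fetched per channel read
    private final ByteBuffer bb;    // line buffer: bytes of the current row
    private boolean eof = false;

    NioLineReader(Path file, int blockSize, int maxLineBytes) throws IOException {
        raf = new RandomAccessFile(file.toFile(), "r");
        fc = raf.getChannel();
        fbb = ByteBuffer.allocate(blockSize);
        fbb.flip();                 // start empty so the first hasNext() triggers a read
        bb = ByteBuffer.allocate(maxLineBytes);
    }

    // Refills the block buffer; returns the number of bytes read, or 0 at end of file.
    private int readByte() throws IOException {
        fbb.clear();
        if (fc.read(fbb) == -1) {
            eof = true;
            return 0;
        }
        fbb.flip();
        return fbb.limit();
    }

    boolean hasNext() throws IOException {
        if (eof) return false;
        while (true) {
            if (fbb.position() == fbb.limit() && readByte() == 0) break;
            if (bb.position() == bb.limit()) return true;   // line buffer full: partial row
            byte a = fbb.get();
            if (a == 13) return true;                       // CR ends the row
            if (a != 10) bb.put(a);                         // skip the LF of CRLF
        }
        return bb.position() > 0;                           // unterminated last row
    }

    byte[] next() {
        bb.flip();
        byte[] row = Arrays.copyOfRange(bb.array(), bb.position(), bb.limit());
        bb.clear();
        return row;
    }

    @Override
    public void close() throws IOException {
        raf.close();                // closing the file also closes its channel
    }

    // Hypothetical helper for small inputs: decodes every row into a list.
    static List<String> collect(Path file, int blockSize, int maxLineBytes, Charset cs)
            throws IOException {
        List<String> rows = new ArrayList<>();
        try (NioLineReader reader = new NioLineReader(file, blockSize, maxLineBytes)) {
            while (reader.hasNext()) {
                rows.add(new String(reader.next(), cs));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("rows", ".txt");
        try {
            Files.write(tmp, "alpha\r\nbeta\r\ngamma".getBytes(StandardCharsets.UTF_8));
            for (String row : collect(tmp, 1024 * 5, 500, StandardCharsets.UTF_8)) {
                System.out.println(row);
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The collect helper keeps every row in a list only so the result is easy to inspect. For a file with hundreds of thousands of rows, handle each byte[] inside the loop (for example, write it to the database) instead of accumulating the rows in memory.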


