Testing Java IO and NIO for reading and writing text files

Source: Internet
Author: User
Tags: garbage collection, readline


Our log analysis service has to read, write, and back up a large number of text files, and its performance requirements are fairly high, so I tested the file-handling classes of both IO and NIO.

IO here refers to Java's "old" stream-based IO (the java.io package).

Anyone who knows NIO knows that it obtains a FileChannel through the getChannel() method of a FileInputStream, FileOutputStream, or RandomAccessFile object. The FileChannel works together with a ByteBuffer to complete the operation. As the name suggests, a ByteBuffer is a buffer of bytes held in memory.
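As a rough illustration of the FileChannel and ByteBuffer pairing described above, the sketch below drains a file through a channel and counts its bytes. The class name ChannelReadSketch, the method countBytes, and the temporary sample file are made up for this example; only the java.io and java.nio calls are real API.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ChannelReadSketch {

    // Counts the bytes in a file by draining its FileChannel through a ByteBuffer.
    // bufferSize must be positive.
    static long countBytes(Path file, int bufferSize) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile());
             FileChannel channel = in.getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(bufferSize); // non-direct buffer
            long total = 0;
            while (channel.read(buffer) != -1) {
                buffer.flip();               // switch from filling to draining
                total += buffer.remaining(); // a real program would consume the bytes here
                buffer.clear();              // make the buffer ready for the next read
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("channel-read", ".log"); // hypothetical sample file
        Files.write(file, "line one\nline two\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(countBytes(file, 8)); // prints 18
        Files.delete(file);
    }
}
```

The flip() and clear() calls are the part that is easy to get wrong: flip() prepares the buffer for draining after the channel has filled it, and clear() resets it for the next read.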

Throughout the tests, NIO with a ByteBuffer was faster than the "old" IO for both reading and writing files. However, NIO operates on bytes, while the text we handle most often in China is encoded in GBK or UTF-8, and this creates a conflict: the heavy cost of converting between bytes and characters almost cancels out NIO's advantage over the "old" IO.

You may ask: doesn't NIO have CharBuffer, so can't NIO handle characters directly? Unfortunately, the read and write methods of the FileChannel class that NIO provides take ByteBuffer parameters only and do not accept a CharBuffer. This is arguably a shortcoming.
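Because FileChannel only deals in bytes, the conversion to characters has to be done separately. The sketch below is one possible way to do it with a CharsetDecoder, and it is not the code used in the tests above. The names ChannelDecodeSketch and readAll are invented for illustration, and the minimum buffer size of 16 is an assumption made so that a whole multi-byte character always fits.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChannelDecodeSketch {

    // Reads a whole file through a FileChannel and decodes it with the given charset.
    static String readAll(Path file, Charset charset, int bufferSize) throws IOException {
        int size = Math.max(bufferSize, 16); // assumption: room for at least one full character
        CharsetDecoder decoder = charset.newDecoder();
        ByteBuffer bytes = ByteBuffer.allocate(size);
        CharBuffer chars = CharBuffer.allocate(size);
        StringBuilder text = new StringBuilder();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            boolean eof = false;
            while (!eof) {
                eof = channel.read(bytes) == -1;
                bytes.flip();
                CoderResult result;
                do {
                    result = decoder.decode(bytes, chars, eof);
                    if (result.isError()) {
                        result.throwException(); // malformed or unmappable input
                    }
                    chars.flip();
                    text.append(chars);
                    chars.clear();
                } while (result.isOverflow());
                bytes.compact(); // keep the undecoded tail of a split multi-byte character
            }
            decoder.flush(chars); // emit anything the decoder still holds
            chars.flip();
            text.append(chars);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("decode", ".log"); // hypothetical sample file
        // Six CJK characters (3 bytes each in UTF-8), so a 16-byte buffer splits the last one.
        String sample = "\u65e5\u5fd7\u5206\u6790\u670d\u52a1 log\n";
        Files.write(file, sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(file, StandardCharsets.UTF_8, 16).equals(sample));
        Files.delete(file);
    }
}
```

The compact() call is what handles a character whose bytes straddle two reads: the leftover bytes stay at the front of the buffer and are completed by the next read. This extra bookkeeping is part of the conversion cost mentioned above.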

Because the log analysis service reads text files line by line with readLine(), I also wrapped byte streams and NIO to build a readLine() of my own. The result was no faster than BufferedReader's readLine().

There is also a common saying: when the read time goes down, memory use goes up; when memory use goes down, the read time goes up.

Summary:

1. If reading and writing are the bottleneck, read with an NIO non-direct byte buffer and write with an NIO direct byte buffer, but pay attention to the character encoding.

2. If you do not want to deal with too many details, use BufferedReader and BufferedWriter.

3. If the content is entirely English (so that every character is a single byte in GBK or UTF-8), choose NIO.
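For the first point, writing through a direct byte buffer might look roughly like the following. This is a minimal sketch under assumed names (DirectWriteSketch, writeLines, drain), with a fixed "\n" line ending chosen for simplicity, and it is not the code that was benchmarked.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

final class DirectWriteSketch {

    // Encodes each line and writes it to the file through a direct ByteBuffer.
    // bufferSize must be positive.
    static void writeLines(Path file, List<String> lines, Charset charset, int bufferSize)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(bufferSize);
        try (FileOutputStream out = new FileOutputStream(file.toFile());
             FileChannel channel = out.getChannel()) {
            for (String line : lines) {
                byte[] encoded = (line + "\n").getBytes(charset); // the character-to-byte cost
                int offset = 0;
                while (offset < encoded.length) {
                    int n = Math.min(buffer.remaining(), encoded.length - offset);
                    buffer.put(encoded, offset, n);
                    offset += n;
                    if (!buffer.hasRemaining()) {
                        drain(channel, buffer); // buffer is full, push it to the file
                    }
                }
            }
            drain(channel, buffer); // write whatever is left
        }
    }

    // Writes the buffered bytes to the channel and resets the buffer for reuse.
    private static void drain(FileChannel channel, ByteBuffer buffer) throws IOException {
        buffer.flip();
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
        buffer.clear();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("direct-write", ".log"); // hypothetical output file
        writeLines(file, Arrays.asList("first line", "second line"), StandardCharsets.UTF_8, 8192);
        System.out.println(Files.size(file)); // prints 23
        Files.delete(file);
    }
}
```

The direct buffer is allocated once and reused for every line, which matches the advice that direct buffers suit large, long-lived use.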

Tip: UTF-8 is a variable-length encoding; a single character may take 1 to 4 bytes.
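The variable length is easy to check by encoding a few characters and counting the bytes. The class and method names below are illustrative only, and the sample characters are written as Unicode escapes so the source does not depend on the file encoding.

```java
import java.nio.charset.StandardCharsets;

final class Utf8LengthSketch {

    // Returns the number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // 1: ASCII letter
        System.out.println(utf8Length("\u00e9"));       // 2: Latin small letter e with acute
        System.out.println(utf8Length("\u65e5"));       // 3: CJK character U+65E5
        System.out.println(utf8Length("\ud83d\ude00")); // 4: U+1F600, a surrogate pair in Java
    }
}
```

This is why a byte buffer cannot be cut at an arbitrary position and decoded piece by piece: the cut may land in the middle of a character.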

API notes:

BufferedReader: reads text from a character input stream, buffering characters so as to provide efficient reading of characters, arrays, and lines. The buffer size may be specified, or the default size may be used. The default is large enough for most purposes.

In general, each read request made of a Reader causes a corresponding read request to be made of the underlying character or byte stream. It is therefore advisable to wrap a BufferedReader around any Reader whose read() operations may be costly, such as FileReader and InputStreamReader. For example:

BufferedReader in = new BufferedReader(new FileReader("foo.in"));

This buffers the input from the specified file. Without buffering, each call to read() or readLine() could cause bytes to be read from the file, converted into characters, and then returned, which can be very inefficient.

Programs that use DataInputStream for textual input can be localized by replacing each DataInputStream with an appropriate BufferedReader.
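Since the encoding matters for GBK or UTF-8 logs, it may be safer to name the charset explicitly instead of relying on FileReader's default. The following is a minimal sketch of line-by-line reading with BufferedReader; the class BufferedReadSketch and the method readLines are hypothetical names for this example.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class BufferedReadSketch {

    // Reads every line of the file, decoding bytes with the given charset.
    static List<String> readLines(Path file, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(file.toFile()), charset))) {
            String line;
            while ((line = in.readLine()) != null) { // null signals end of stream
                lines.add(line);                     // line terminator is already stripped
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffered-read", ".log"); // hypothetical sample file
        Files.write(file, "first\r\nsecond\nthird".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLines(file, StandardCharsets.UTF_8)); // prints [first, second, third]
        Files.delete(file);
    }
}
```

readLine() accepts "\n", "\r", and "\r\n" as line terminators, so files written on different platforms are handled the same way.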

-----------------------------------------------------

BufferedWriter: writes text to a character output stream, buffering characters so as to provide efficient writing of single characters, arrays, and strings. The buffer size may be specified, or the default size may be accepted. The default is large enough for most purposes. This class provides a newLine() method that uses the platform's own notion of line separator, as defined by the system property line.separator. Not all platforms use the newline character ('\n') to terminate lines. Calling this method to terminate each output line is therefore preferable to writing a newline character directly.

In general, a Writer sends its output immediately to the underlying character or byte stream. Unless prompt output is required, it is advisable to wrap a BufferedWriter around any Writer whose write() operations may be costly, such as FileWriter and OutputStreamWriter. For example:

PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter("foo.out")));

This buffers the PrintWriter's output to the file. Without buffering, each call to a print() method would cause characters to be converted into bytes and then written immediately to the file, which can be very inefficient.
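A small sketch of the same idea with an explicit charset and newLine() follows. The names BufferedWriteSketch and writeLines are assumptions made for this example.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

final class BufferedWriteSketch {

    // Writes each string as one line, terminated by the platform line separator.
    static void writeLines(Path file, List<String> lines, Charset charset) throws IOException {
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file.toFile()), charset))) {
            for (String line : lines) {
                out.write(line);
                out.newLine(); // uses line.separator rather than a hard-coded '\n'
            }
        } // closing the writer flushes the buffer to the file
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffered-write", ".log"); // hypothetical output file
        writeLines(file, Arrays.asList("first", "second"), StandardCharsets.UTF_8);
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8)); // prints [first, second]
        Files.delete(file);
    }
}
```

Nothing reaches the file until the buffer fills or the writer is flushed or closed, which is why the try-with-resources block matters here.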

-----------------------------------------------

ByteBuffer: a byte buffer

Direct vs. non-direct buffers

A byte buffer is either direct or non-direct. Given a direct byte buffer, the Java virtual machine will make a best effort to perform native I/O operations directly upon it. That is, it will attempt to avoid copying the buffer's content to (or from) an intermediate buffer before (or after) each invocation of one of the underlying operating system's native I/O operations.

A direct byte buffer may be created by invoking the allocateDirect factory method of this class. The buffers returned by this method typically have somewhat higher allocation and deallocation costs than non-direct buffers. The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious. It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations. In general, it is best to allocate direct buffers only when they yield a measurable gain in program performance.

A direct byte buffer may also be created by mapping a region of a file directly into memory. An implementation of the Java platform may optionally support the creation of direct byte buffers from native code via JNI. If an instance of one of these kinds of buffers refers to an inaccessible region of memory, then an attempt to access that region will not change the buffer's content and will cause an unspecified exception to be thrown either at the time of the access or at some later time.
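Mapping a file region is done with FileChannel.map, which returns a MappedByteBuffer. The sketch below counts newline bytes in a mapped file; MappedReadSketch and countNewlines are made-up names, and the whole-file mapping assumes the file is smaller than 2 GB, the limit of a single mapping.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedReadSketch {

    // Maps the whole file read-only and counts the bytes equal to '\n' (0x0A).
    // Assumption: the file is smaller than 2 GB, the limit of a single mapping.
    static int countNewlines(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            int count = 0;
            while (mapped.hasRemaining()) {
                if (mapped.get() == '\n') {
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-read", ".log"); // hypothetical sample file
        file.toFile().deleteOnExit(); // a mapped file may not be deletable right away on some platforms
        Files.write(file, "one\ntwo\nthree\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(countNewlines(file)); // prints 3
    }
}
```

The mapping stays valid after the channel is closed and is released only when the buffer itself is garbage-collected, which is one reason the memory impact of such buffers can be hard to see.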

Whether a byte buffer is direct or non-direct may be determined by invoking its isDirect method. This method is provided so that explicit buffer management can be done in performance-critical code.
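The two factory methods and isDirect can be seen side by side in a few lines. The class DirectBufferSketch and its kind helper are illustrative names, and the buffer sizes are arbitrary.

```java
import java.nio.ByteBuffer;

final class DirectBufferSketch {

    // Describes a buffer as "direct" or "non-direct" based on isDirect().
    static String kind(ByteBuffer buffer) {
        return buffer.isDirect() ? "direct" : "non-direct";
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);         // backed by an array on the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // memory may live outside the heap
        System.out.println(kind(heap));   // prints non-direct
        System.out.println(kind(direct)); // prints direct
    }
}
```

Because direct buffers cost more to allocate, a long-running service would typically create one and reuse it rather than allocate a new one per file.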
