Java NIO Reading Notes
NIO exists to improve program performance. Sometimes the bottleneck of a program is not the CPU but I/O, and that is where NIO helps. NIO improves efficiency by exploiting low-level system resources: for example, DMA hardware reduces the CPU load, and the operating system's epoll mechanism avoids frequent thread switching. Using these low-level resources raises system throughput.
A buffer is a fixed-size container of data. It has four important attributes: capacity, limit, position, and mark. Capacity is the maximum number of elements the buffer can hold. Limit is a logical bound within the capacity: the index of the first element that must not be read or written. Position is the index of the next element that get or put will touch. Mark is a remembered position that reset() returns to.
put() stores data in the buffer and get() reads data from it. The write operations are put(byte), put(int index, byte), put(byte[]), and put(byte[], int offset, int length), so data can be written one element at a time or in bulk. The read operations come in the same four forms.
flip() switches a buffer from writing to reading: it sets limit to the current position and position to 0. Typically you flip after writing a batch of data and then read that data back from the start.
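The effect of flip() on the attributes can be seen in a minimal sketch. The class and method names (FlipDemo, writeThenFlip) are made up for illustration and are not from the notes.

```java
import java.nio.ByteBuffer;

class FlipDemo {
    // Writes three bytes, flips, and returns {position, limit, capacity} after the flip.
    static int[] writeThenFlip() {
        ByteBuffer buf = ByteBuffer.allocate(8);          // capacity 8, position 0, limit 8
        buf.put((byte) 10).put((byte) 20).put((byte) 30); // position advances to 3
        buf.flip();                                       // limit = 3, position = 0
        return new int[] { buf.position(), buf.limit(), buf.capacity() };
    }

    public static void main(String[] args) {
        int[] state = writeThenFlip();
        System.out.println("position=" + state[0] + " limit=" + state[1] + " capacity=" + state[2]);
    }
}
```

After the flip, exactly the three written bytes lie between position and limit, so a read loop guarded by hasRemaining() stops at the right place.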
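compact() discards the data that has already been read and copies the unread data to the start of the buffer (index 0). Position is then set just past the copied data and limit to capacity, so writing can continue. The copy does not erase the old data left behind at the higher indices.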
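A small sketch of compact(), assuming a buffer of eight bytes with five written and two already read. CompactDemo is a hypothetical name used only for this example.

```java
import java.nio.ByteBuffer;

class CompactDemo {
    static ByteBuffer readTwoThenCompact() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.put(new byte[] {1, 2, 3, 4, 5});
        buf.flip();     // position 0, limit 5
        buf.get();      // consumes 1
        buf.get();      // consumes 2; the bytes 3, 4, 5 are still unread
        buf.compact();  // 3, 4, 5 move to indices 0..2; position = 3, limit = 8
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = readTwoThenCompact();
        System.out.println("position=" + buf.position() + " limit=" + buf.limit());
        // Index 3 still holds the stale value that was there before the copy.
        System.out.println("index 0 = " + buf.get(0) + ", index 3 = " + buf.get(3));
    }
}
```

The stale bytes at indices 3 and 4 are simply overwritten by the next put() calls.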
mark() records the current position, and reset() returns the position to the last mark. rewind(), clear(), and flip() always discard the mark. position(int) and limit(int) discard it only when the new value is smaller than the mark.
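The following sketch shows both behaviors: a normal mark/reset, and a mark that is discarded because the position was moved below it. MarkDemo and its method names are illustrative only.

```java
import java.nio.ByteBuffer;
import java.nio.InvalidMarkException;

class MarkDemo {
    // Returns the position restored by reset().
    static int markAndReset() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.position(2);
        buf.mark();       // remember position 2
        buf.position(5);
        buf.reset();      // back to 2
        return buf.position();
    }

    // Returns true if setting a position below the mark discarded the mark.
    static boolean markDiscardedByLowerPosition() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.position(4);
        buf.mark();
        buf.position(1);  // 1 is smaller than the mark (4), so the mark is dropped
        try {
            buf.reset();
            return false;
        } catch (InvalidMarkException e) {
            return true;  // reset() without a mark throws
        }
    }

    public static void main(String[] args) {
        System.out.println(markAndReset());
        System.out.println(markDiscardedByLowerPosition());
    }
}
```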
Buffers can be compared, provided they are of the same type. The comparison looks only at the remaining elements, those between position and limit; the mark, the capacity, and the actual values of position and limit do not matter.
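As a quick illustration, two buffers with different capacities and positions still compare as equal when their remaining bytes match. CompareDemo is a hypothetical name.

```java
import java.nio.ByteBuffer;

class CompareDemo {
    // Returns {equals result, compareTo result == 0}.
    static boolean[] compareRemaining() {
        ByteBuffer a = ByteBuffer.wrap(new byte[] {9, 1, 2, 3});
        a.position(1);                      // remaining content: 1, 2, 3

        ByteBuffer b = ByteBuffer.allocate(16);
        b.put(new byte[] {1, 2, 3});
        b.flip();                           // remaining content: 1, 2, 3

        return new boolean[] { a.equals(b), a.compareTo(b) == 0 };
    }

    public static void main(String[] args) {
        boolean[] r = compareRemaining();
        System.out.println("equals=" + r[0] + " compareToIsZero=" + r[1]);
    }
}
```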
There are two ways to create a buffer. One is to allocate a new one with XxxBuffer.allocate(). The other is to wrap an existing array with XxxBuffer.wrap(); data written to the buffer is then written through to that array.
A buffer can be duplicated with duplicate(). The duplicate is really a view: it shares the same data as the original, but each buffer has its own attributes, so limit, position, and mark are independent. slice() creates a similar view that covers only part of the buffer, from position to limit.
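A short sketch of wrap(), duplicate(), and slice() working on the same data. ViewDemo and the variable names are assumptions made for the example.

```java
import java.nio.ByteBuffer;

class ViewDemo {
    // Returns {backing[0], backing[2], original position, duplicate position, slice capacity}.
    static int[] shareData() {
        byte[] backing = new byte[4];
        ByteBuffer original = ByteBuffer.wrap(backing);  // writes go through to the array
        ByteBuffer copy = original.duplicate();          // same data, separate attributes

        copy.put(0, (byte) 7);                           // visible in the array and in original
        copy.position(2);                                // original.position() stays 0

        ByteBuffer tail = copy.slice();                  // covers indices 2..3 of the data
        tail.put(0, (byte) 9);                           // index 0 of the slice is backing[2]

        return new int[] { backing[0], backing[2], original.position(),
                           copy.position(), tail.capacity() };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(shareData()));
    }
}
```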
Byte buffers also have a byte order, big-endian or little-endian. java.nio.ByteOrder.nativeOrder() returns the byte order of the local machine.
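There is also the direct buffer, obtained through ByteBuffer.allocateDirect(). A direct buffer lives outside the normal Java heap, so the operating system can perform I/O on it without an intermediate copy. I/O on a direct buffer is therefore usually faster than on a normal buffer, although allocating one costs more.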
ByteBuffer provides asXxxBuffer methods, for example asCharBuffer. The buffers they return are called view buffers: they present the bytes of the byte buffer to the program as elements of another type. ByteBuffer also provides methods such as getInt, getLong, and getDouble. These are called view operations, because they act as if they were operating on a buffer of another type. The same applies to writing, which has matching putXxx view operations. Both view buffers and view operations depend on the byte order, so you must set the byte order before using them. The default is big-endian.
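To make the byte-order dependence concrete, here is a minimal sketch that writes an int in the default big-endian order and reads it back as little-endian, and then reads an int through an IntBuffer view. ByteOrderDemo is a made-up name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

class ByteOrderDemo {
    // Writes big-endian, then reads the same four bytes as little-endian.
    static int readLittleEndian() {
        ByteBuffer buf = ByteBuffer.allocate(4);   // default order is BIG_ENDIAN
        buf.putInt(0, 0x01020304);                 // stored as bytes 01 02 03 04
        buf.order(ByteOrder.LITTLE_ENDIAN);
        return buf.getInt(0);                      // bytes are now read in reverse significance
    }

    // Reads the second int through a view buffer over eight bytes.
    static int readThroughView() {
        ByteBuffer bytes = ByteBuffer.wrap(new byte[] {0, 0, 0, 5, 0, 0, 0, 6});
        IntBuffer ints = bytes.asIntBuffer();      // the view uses the order set at creation
        return ints.get(1);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(readLittleEndian()));
        System.out.println(readThroughView());
    }
}
```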
Java does not support unsigned data types, but there is a workaround. The following is one solution.
package com.ronsoft.books.nio.buffers;

import java.nio.ByteBuffer;

/**
 * Utility class to get and put unsigned values to a ByteBuffer object.
 * All methods here are static and take a ByteBuffer argument.
 * Since java does not provide unsigned primitive types, each unsigned
 * value read from the buffer is promoted up to the next bigger primitive
 * data type. getUnsignedByte() returns a short, getUnsignedShort() returns
 * an int and getUnsignedInt() returns a long. There is no getUnsignedLong()
 * since there is no primitive type to hold the value returned. If needed,
 * methods returning BigInteger could be implemented.
 * Likewise, the put methods take a value larger than the type they will
 * be assigning. putUnsignedByte takes a short argument, etc.
 *
 * @author Ron Hitchens (ron@ronsoft.com)
 */
public class Unsigned
{
    public static short getUnsignedByte (ByteBuffer bb)
    {
        return ((short)(bb.get() & 0xff));
    }

    public static void putUnsignedByte (ByteBuffer bb, int value)
    {
        bb.put ((byte)(value & 0xff));
    }

    public static short getUnsignedByte (ByteBuffer bb, int position)
    {
        return ((short)(bb.get (position) & (short)0xff));
    }

    public static void putUnsignedByte (ByteBuffer bb, int position, int value)
    {
        bb.put (position, (byte)(value & 0xff));
    }

    // ---------------------------------------------------------------

    public static int getUnsignedShort (ByteBuffer bb)
    {
        return (bb.getShort() & 0xffff);
    }

    public static void putUnsignedShort (ByteBuffer bb, int value)
    {
        bb.putShort ((short)(value & 0xffff));
    }

    public static int getUnsignedShort (ByteBuffer bb, int position)
    {
        return (bb.getShort (position) & 0xffff);
    }

    public static void putUnsignedShort (ByteBuffer bb, int position, int value)
    {
        bb.putShort (position, (short)(value & 0xffff));
    }

    // ---------------------------------------------------------------

    public static long getUnsignedInt (ByteBuffer bb)
    {
        return ((long)bb.getInt() & 0xffffffffL);
    }

    public static void putUnsignedInt (ByteBuffer bb, long value)
    {
        bb.putInt ((int)(value & 0xffffffffL));
    }

    public static long getUnsignedInt (ByteBuffer bb, int position)
    {
        return ((long)bb.getInt (position) & 0xffffffffL);
    }

    public static void putUnsignedInt (ByteBuffer bb, int position, long value)
    {
        bb.putInt (position, (int)(value & 0xffffffffL));
    }
}
There is also the mapped buffer (MappedByteBuffer). It is always a direct buffer and can only be created through FileChannel.map().
Channels differ from buffers in that each operating system implements them differently. That is why the channel API consists mostly of interfaces and abstract classes.
Channels can operate in blocking or non-blocking mode. File channels cannot be put into non-blocking mode.
A channel is like a connection: once closed it cannot be reused. A channel is closed either by calling close() or, for an interruptible channel, by interrupting a thread that is blocked on it, which closes the channel as a side effect. This design seems awkward at first, but it was chosen so that it can be implemented consistently on different operating systems.
Channels also support writing from or reading into several buffers in one call (gather and scatter). Most operating systems support such vectored I/O natively, so Java translates the batch operation into the corresponding system call and lets the operating system do the work, which makes it fast.
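A minimal sketch of a gathering write followed by a scattering read, using a temporary file as the channel. The class name, the file prefix, and the "HEAD"/"payload" contents are all invented for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

class ScatterGatherDemo {
    // Writes two buffers with one gathering write, reads them back with one scattering read.
    static String[] roundTrip() throws IOException {
        File file = File.createTempFile("nio-scatter", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            ByteBuffer[] out = {
                ByteBuffer.wrap("HEAD".getBytes(StandardCharsets.US_ASCII)),
                ByteBuffer.wrap("payload".getBytes(StandardCharsets.US_ASCII))
            };
            while (out[1].hasRemaining()) {
                channel.write(out);            // gathers from both buffers in order
            }

            channel.position(0);
            ByteBuffer header = ByteBuffer.allocate(4);
            ByteBuffer body = ByteBuffer.allocate(7);
            ByteBuffer[] in = { header, body };
            while (body.hasRemaining() && channel.read(in) >= 0) {
                // keep reading until both buffers are full or end of file is reached
            }
            return new String[] {
                new String(header.array(), StandardCharsets.US_ASCII),
                new String(body.array(), StandardCharsets.US_ASCII)
            };
        }
    }

    public static void main(String[] args) throws IOException {
        String[] parts = roundTrip();
        System.out.println(parts[0] + " / " + parts[1]);
    }
}
```

Fixed-size headers followed by a body are the typical case where scatter reads save manual slicing.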
A file channel is always blocking. Compared with the file streams, FileChannel offers more operations, such as writing at a specified position. A file channel is obtained by calling getChannel() on a FileInputStream, FileOutputStream, or RandomAccessFile. Its state matches that of the object it was created from (for example, read-only when it comes from a FileInputStream), and the two share the same file position. force() writes pending changes to the storage device immediately, and truncate() shrinks the file to a given size.
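The operations mentioned above can be tried on a temporary file. This is a sketch under the assumption that the channel comes from a RandomAccessFile opened in "rw" mode; FileChannelDemo is a hypothetical name.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class FileChannelDemo {
    // Returns {channel position, RandomAccessFile pointer, size after truncate, position after truncate}.
    static long[] run() throws IOException {
        File file = File.createTempFile("nio-filechannel", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            channel.write(ByteBuffer.wrap(new byte[] {1, 2, 3, 4})); // relative write, position -> 4
            channel.write(ByteBuffer.wrap(new byte[] {9}), 1);       // absolute write, position unchanged
            long channelPosition = channel.position();
            long filePointer = raf.getFilePointer();                 // shared with the channel

            channel.force(true);                                     // flush content and metadata
            channel.truncate(2);                                     // position is pulled back to 2
            return new long[] { channelPosition, filePointer, channel.size(), channel.position() };
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```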
File systems can have file holes (sparse files), where the space a file occupies on disk is less than its size. For example, if you write 10 KB of data at the 1 GB offset of a new file, on a file system that supports holes the file occupies only about 10 KB rather than 1 GB.
A common misunderstanding about file locks concerns who owns them. A lock belongs neither to an individual file channel object nor to an individual thread; it is held on behalf of the whole JVM process. Therefore, if two file channels are opened on the same file within one JVM, the lock does not provide mutual exclusion between them. In other words, file locks do not work within a JVM, only between processes. Remember to release a file lock; the release is best placed in a finally block.
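The lock-then-release-in-finally pattern might look like the following sketch. The class name and the temporary file are assumptions, and the lock here only guards against other processes, as described above.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

class FileLockDemo {
    // Returns true if the lock was valid while the protected write ran.
    static boolean writeUnderLock() throws IOException {
        File file = File.createTempFile("nio-lock", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            FileLock lock = channel.lock();   // exclusive lock on the whole file, blocks until granted
            try {
                channel.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
                return lock.isValid();
            } finally {
                lock.release();               // always release, even if the write fails
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeUnderLock());
    }
}
```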
Memory-mapped file buffer. It behaves like a normal buffer, but its content is backed by the file on disk. A mapping has three modes: read-only, read-write, and private. In private mode, modifications are not written to the file; they are kept in a private copy in the buffer (copy-on-write). A private mapping still reflects changes made to the file through other channels, but the unit of synchronization is the page, so the behavior depends on the operating system's page size. Once you modify data in private mode, the affected page is copied and no longer stays in sync with changes made through other file channels.
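A minimal sketch of private mode: a change made through the mapped buffer is visible in the buffer but never reaches the file. PrivateMapDemo and the four-byte file are assumptions for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

class PrivateMapDemo {
    // Returns {byte seen through the private mapping, byte stored in the file}.
    static byte[] modifyPrivately() throws IOException {
        File file = File.createTempFile("nio-map", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            channel.write(ByteBuffer.wrap(new byte[] {1, 2, 3, 4}));

            MappedByteBuffer mapped = channel.map(FileChannel.MapMode.PRIVATE, 0, 4);
            mapped.put(0, (byte) 42);              // copy-on-write: only the private copy changes

            ByteBuffer fromFile = ByteBuffer.allocate(1);
            channel.read(fromFile, 0);             // read the first byte of the real file
            return new byte[] { mapped.get(0), fromFile.get(0) };
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] r = modifyPrivately();
        System.out.println("mapped=" + r[0] + " file=" + r[1]);
    }
}
```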
Data can also be transferred directly between channels with transferTo() and transferFrom(). Some operating system kernels support channel-to-channel transfer natively, in which case performance is very high.
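As an illustration, a file copy built on transferTo() could be sketched like this. The loop is needed because a single call may transfer fewer bytes than requested; TransferDemo and the temporary file names are invented.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;

class TransferDemo {
    // Copies the given content from one temporary file to another and returns what arrived.
    static byte[] copy(byte[] content) throws IOException {
        File from = File.createTempFile("nio-src", ".bin");
        File to = File.createTempFile("nio-dst", ".bin");
        from.deleteOnExit();
        to.deleteOnExit();
        Files.write(from.toPath(), content);

        try (FileInputStream in = new FileInputStream(from);
             FileOutputStream out = new FileOutputStream(to);
             FileChannel source = in.getChannel();
             FileChannel target = out.getChannel()) {
            long position = 0;
            long size = source.size();
            while (position < size) {
                // transferTo returns the number of bytes actually moved
                position += source.transferTo(position, size - position, target);
            }
        }
        return Files.readAllBytes(to.toPath());
    }

    public static void main(String[] args) throws IOException {
        System.out.println(new String(copy("hello nio".getBytes())));
    }
}
```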
The load() method of a mapped buffer tries to bring the whole mapped region into physical memory (the operating system's file cache); the content remains tied to the file on disk.
Unlike file channels, socket channels support non-blocking mode. Each socket channel is associated with a socket, but a channel cannot be created from an existing socket.
blockingLock() returns an Object. Synchronizing on that object with the synchronized keyword prevents other threads from changing the channel's blocking mode in the meantime. The socket channels include SocketChannel and ServerSocketChannel. ServerSocketChannel only accepts connections: in non-blocking mode its accept() method returns immediately, yielding null when no connection is pending.
The datagram channel communicates over UDP. Note: when receiving, if the buffer has too little remaining space, the excess data is silently discarded. When sending, a datagram is sent whole or not at all: if it does not fit in the system's send queue, no data is sent. The datagram channel also has a connect() method, but it only fixes the peer address and does not establish a real connection.
A Pipe in NIO is not the same concept as a Unix pipe. An NIO pipe can only communicate within one JVM and is not a means of inter-process communication; use sockets for that. Pipe.open() creates a pipe, which provides a SinkChannel for writing and a SourceChannel for reading. With a pipe, one thread only writes data and another only reads it, somewhat like a generator object in Python. The main use of pipe channels is encapsulation: hiding a file channel or socket channel behind a pipe channel improves code reuse. Experiments show that the pipe has an internal buffer: even when the other end is not reading, the writer can still write more than 1 KB of data.
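The one-writer, one-reader arrangement can be sketched as follows. PipeDemo and passMessage are hypothetical names; the writer thread closes the sink when it is done so that the reader sees end of stream.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

class PipeDemo {
    // Sends the message through a pipe from a writer thread to the calling thread.
    static String passMessage(String message) throws IOException, InterruptedException {
        Pipe pipe = Pipe.open();
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);

        Thread writer = new Thread(() -> {
            try (Pipe.SinkChannel sink = pipe.sink()) {      // the sink is the write end
                ByteBuffer out = ByteBuffer.wrap(payload);
                while (out.hasRemaining()) {
                    sink.write(out);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();

        ByteBuffer in = ByteBuffer.allocate(payload.length);
        try (Pipe.SourceChannel source = pipe.source()) {    // the source is the read end
            while (in.hasRemaining() && source.read(in) >= 0) {
                // read until the buffer is full or the sink has been closed
            }
        }
        writer.join();
        return new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(passMessage("hello through the pipe"));
    }
}
```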
The selector can only be implemented on top of operating system facilities, so its performance is relatively high.
The selector API consists of three classes: Selector, SelectionKey, and SelectableChannel.
A Selector manages multiple selectable channels and their SelectionKeys. select() blocks. Its return value is not the total number of ready channels but the number of keys whose ready sets were updated by this call. selectedKeys() returns a Set, and that Set is not thread-safe. If the selected keys are iterated in another thread, a ConcurrentModificationException may occur during the iteration.
A Selector maintains three sets: the registered-key set, the selected-key set, and the cancelled-key set. The selector only ever adds keys to the selected-key set and never removes them. To remove a key you must delete it yourself through the iterator, so each SelectionKey is removed once its event has been handled.
The underlying mechanism can be select or epoll. select is part of the POSIX standard, while epoll is specific to Linux. select can typically monitor at most 1024 descriptors, whereas epoll has no such limit. select scans all channels on every call, so the more channels there are, the worse it performs. epoll has a ready queue maintained by the kernel: when a channel becomes ready, the kernel adds it to the queue, so performance does not degrade as the number of channels grows.
epoll has two working modes: level-triggered and edge-triggered. The default is level-triggered. In this mode, if the data on a channel has not been read completely, the next select call immediately reports that channel in selectedKeys again; edge-triggered mode does not. Edge-triggered mode performs better but makes program errors more likely.
A SelectionKey represents the registration of a channel with a selector. It provides readyOps(), which returns the operations the channel is ready for. You can also use methods such as isWritable() and isReadable() to test whether the channel is ready for a given operation; the two approaches are equivalent. A selection key can also carry an attachment, through which the code handling the channel can obtain its parameters. Attachments that are no longer needed should be cleared promptly; otherwise memory leaks may occur.
SelectableChannel is a channel that can be selected. It can be registered with multiple selectors. When registering, you must specify the events to monitor, such as OP_READ and OP_WRITE. validOps() returns the operations this channel supports. The JDK defines four kinds of interest: read, write, connect, and accept. A SocketChannel cannot accept connections, so its validOps() does not include accept. A channel that is already registered with a selector can be registered with it again; the second registration only updates the interest set and returns the same SelectionKey. If the key has been cancelled with cancel() but the Selector has not yet deregistered it when the second registration happens, a CancelledKeyException is thrown.
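A small sketch of registration, using an unconnected SocketChannel so that no network traffic is involved. RegisterDemo and the "session-1" attachment are made up for the example.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

class RegisterDemo {
    // Returns {accept is a valid op, second registration returned the same key, interest set was updated}.
    static boolean[] inspect() throws IOException {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            channel.configureBlocking(false);   // registration requires non-blocking mode

            boolean acceptValid = (channel.validOps() & SelectionKey.OP_ACCEPT) != 0;

            SelectionKey first = channel.register(selector, SelectionKey.OP_CONNECT);
            // Registering again with the same selector updates the existing key
            // and, in this overload, also replaces its attachment.
            SelectionKey second = channel.register(selector, SelectionKey.OP_READ, "session-1");

            boolean sameKey = first == second;
            boolean interestUpdated = second.interestOps() == SelectionKey.OP_READ;
            return new boolean[] { acceptValid, sameKey, interestUpdated };
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(java.util.Arrays.toString(inspect()));
    }
}
```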
Closing a channel should be a very fast operation that does not block. This is a design goal of Java NIO, and the design is called asynchronous closing.
Generally, the template is as follows:
while (true) {
    selector.select();
    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
    while (keys.hasNext()) {
        SelectionKey key = keys.next();
        // process the event ...
        // remove the key after processing to indicate that the event has been handled
        keys.remove();
    }
}
On a multi-core machine, doing all the work in a single thread is inefficient. To improve performance, multiple threads and multiple selectors must be introduced. One approach is to give each thread its own selector and hand each accepted connection to a randomly chosen thread. Another approach is to use one thread to accept connections and dedicate the other threads to handling requests.