At its core, sendfile is a Linux optimization for sending files over the network: it avoids moving data back and forth between kernel space and user space and copies the data directly inside the kernel. The call first appeared in the Linux 2.2 kernel and is now very common in network servers written in C. Java is a higher-level language, but sendfile can still be reached through an interface at the C level, for example by calling a C library through JNI. In Tomcat this is the APR channel, which uses tomcat-native to call into the APR library. This lengthens the Java call chain, but it gives Java code access to a Linux system-level optimization such as sendfile, which is well worth it. That is the background of this article, which starts at the system-call level and works step by step toward how sendfile is implemented in Tomcat.
1. Introduction to Linux sendfile mechanism
sendfile is a system call; running man sendfile shows its prototype:
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
sendfile() copies data between two file descriptors. Because the copy is done inside the kernel, it is referred to as "zero copy". It is considerably more efficient than a read/write pair, because read and write both have to move the data through a user-space buffer.
Parameter description:
out_fd is a file descriptor that has been opened for writing.
in_fd is a file descriptor that has been opened for reading.
offset points to the position in in_fd at which sendfile starts reading. A value of zero means reading from the beginning of the file; any other value means reading from that position. When the call returns, the kernel has already advanced *offset by the number of bytes sent, so a loop can simply pass the same variable again. If offset is NULL, data is read from the current file position of in_fd instead.
count is the number of bytes to copy between the two descriptors.
Return value:
On success, sendfile returns the number of bytes written to out_fd. On error it returns -1 and sets errno to one of the following:
EAGAIN: non-blocking I/O was selected with O_NONBLOCK and the write would block.
EBADF: the input file is not open for reading or the output file is not open for writing.
EFAULT: bad address.
EINVAL: a descriptor is not valid or is locked, or an mmap()-like operation is not available for in_fd.
EIO: an unspecified error occurred while reading from in_fd.
ENOMEM: there is not enough memory to read from in_fd.
To summarize, sendfile is an efficient replacement for the read/write pair. Compare it with an ordinary program that sends a file over the network. The traditional way looks like this:
fd = open(FILENAME, O_RDONLY);
/* copy the file through a user-space buffer, one chunk at a time */
while ((len = read(fd, buff, sizeof(buff))) > 0)
{
    send(sockfd, buff, len, 0);
}
close(fd);
And here is the same transfer using sendfile():
off_t offset = 0;
stat(FILENAME, &filestat);
fd = open(FILENAME, O_RDONLY);
/* the kernel moves the data from fd to sockfd without a user-space buffer */
sendfile(sockfd, fd, &offset, filestat.st_size);
close(fd);
The sendfile call closely resembles the send call and is easy to substitute at the API level: the program hardly needs to change, yet it switches to an optimized code path. Under high concurrency the gain shows up clearly in measurements.
2. sendfile Optimization Effect
Having introduced the system call, let's look at which steps the sendfile optimization leaves out.
A traditional network send always passes the data through user-space memory, because the application may need to process it first, for example to transform or extract something. Content that is sent unmodified, such as a plain file, has no need for that detour. With sendfile the user-space step is skipped: the data goes from disk into the kernel buffer, moves inside the kernel to the network card's buffer, and is sent out from there. That is the effect of sendfile.
The usage scenario follows directly: for HTTP requests for static resources and static files on a web server, where nothing needs to be processed in memory, sendfile is the best choice.
3. sendfile logic in DefaultServlet
Static resources in Tomcat are handled by DefaultServlet, a class built into the Tomcat source specifically for that purpose. Its key method is doGet, which in turn calls serveResource.
In serveResource, when checkSendfile does not return true, the request is treated as an ordinary one: the requested file is written to the ostream through the copy method, which ends up in the OutputStream and is transmitted over the network. If, however, the request has the attribute org.apache.tomcat.sendfile.support set to Boolean.TRUE, sendfile is supported, and the file is sent with the sendfile system call instead of the send system call (the default Java socket output stream ends up calling send inside the JVM). Because org.apache.tomcat.sendfile.support is a request attribute, the choice between sendfile and send is made per request, which is very flexible. Besides org.apache.tomcat.sendfile.support, the code shows several other attributes that can be set on the request:
- org.apache.tomcat.sendfile.filename: the canonical name of the file to send, as a string.
- org.apache.tomcat.sendfile.start: the start offset, a long integer value.
- org.apache.tomcat.sendfile.end: the end offset, a long integer value.
If these are not set, the defaults apply: the file name is the file named by the request, start is 0, and end is the file length, as the source code also shows.
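The attribute contract described above can be sketched in a few lines. This is a simplified model, not Tomcat's code: a plain Map stands in for the request attributes, and the class name SendfileAttributes and the method markForSendfile are made up for illustration. Only the four attribute names come from the text.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Simplified model of the attribute contract. The Map stands in for the
// request attributes; this is NOT Tomcat's own implementation.
final class SendfileAttributes {
    static final String SUPPORT  = "org.apache.tomcat.sendfile.support";
    static final String FILENAME = "org.apache.tomcat.sendfile.filename";
    static final String START    = "org.apache.tomcat.sendfile.start";
    static final String END      = "org.apache.tomcat.sendfile.end";

    // Returns true when the request was marked for sendfile. A false result
    // means the caller should copy the file through the output stream instead.
    static boolean markForSendfile(Map<String, Object> attrs, Path file) throws IOException {
        if (!Boolean.TRUE.equals(attrs.get(SUPPORT))) {
            return false;
        }
        // Values already present on the request win; otherwise use the defaults.
        attrs.putIfAbsent(FILENAME, file.toAbsolutePath().toString());
        attrs.putIfAbsent(START, 0L);              // default: beginning of the file
        attrs.putIfAbsent(END, Files.size(file));  // default: length of the file
        return true;
    }
}
```

A caller would first put Boolean.TRUE under SendfileAttributes.SUPPORT and then check the return value to decide between the sendfile path and the ordinary stream copy.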
Looking back, the reason these attributes are needed becomes clear once they are matched against the parameters of sendfile. When checkSendfile returns true, DefaultServlet does not copy any stream itself; the transfer happens in the various xxxEndpoint classes of the Tomcat front end, so let's keep going down.
Note that HTTP response bodies are usually compressed, which greatly reduces bandwidth consumption; when the browser sees the compression header in the response, it decompresses the body before rendering the page. When sendfile is used, however, compression does not take effect. So when the file to transfer is very large and network bandwidth is the bottleneck, sendfile is not the right choice.
4. sendfile implementation in the BIO channel
Taking Tomcat 8 as the example, each front-end channel wraps sendfile differently in Java, but underneath they end up invoking the system's sendfile call.
The exception is BIO: JIoEndpoint does not support sendfile, as its source code shows.
5. Implementation of sendfile in the NIO channel
The NIO channel has a useSendfile property. What does it do? It is set on the Connector, here with the NIO channel as the example.
useSendfile is the master switch that decides whether requests may use sendfile at all (org.apache.tomcat.sendfile.support, seen earlier, applies to each individual request). It is on by default in the NIO channel. When the request has org.apache.tomcat.sendfile.support set to true, the response prepares a SendfileData structure, which is the carrier for sendfile in the NIO channel.
This data structure carries what the sendfile system call needs. The NIO implementation can therefore be divided into three stages.
In the first stage, a servlet such as DefaultServlet (any other servlet that sets the attributes can use sendfile as well) sets the sendfile attributes on the request. Once they are set, the request counts as a sendfile request.
In the second stage, after the servlet has finished its business logic and the response is committed, the response preparation step initializes the SendfileData structure. This logic lives in the Http11NioProcessor class: its prepareSendfile method reads the file name and the start and end positions from the request attributes set earlier by DefaultServlet and uses them to initialize the SendfileData instance.
In the third stage, recall the three kinds of threads in the NIO front end: acceptor, poller and worker. When a worker thread has finished, the response still goes back to the client through the poller thread, which re-registers the key's events and reads its KeyAttachment. For a sendfile request, the SendfileData instance initialized earlier is attached to that KeyAttachment.
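processSendfile is one branch in the run method of the poller thread. For a sendfile request, the poller thread takes the file name from the SendfileData structure and sends the file through the transferTo method of FileChannel. The Javadoc of transferTo contains one important remark: the method is potentially much more efficient than a simple loop that reads from this channel and writes to the target, because many operating systems can transfer bytes directly from the filesystem cache to the target channel without actually copying them.
What that remark describes is the sendfile system call.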
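To make the transferTo step concrete, here is a minimal sketch of sending a byte range of a file with FileChannel.transferTo. The class name TransferToSketch and the method sendRange are illustrative and do not come from Tomcat. Whether the JVM actually maps the call to sendfile depends on the platform and on the type of the target channel; the sketch only shows the Java-side usage.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative helper, not Tomcat code.
final class TransferToSketch {
    // Sends bytes [start, end) of the file to the target channel and returns
    // the number of bytes actually transferred.
    static long sendRange(Path file, long start, long end, WritableByteChannel target)
            throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            long pos = start;
            long limit = Math.min(end, in.size()); // never read past the end of the file
            while (pos < limit) {
                // transferTo may move fewer bytes than requested, so keep looping.
                long n = in.transferTo(pos, limit - pos, target);
                if (n <= 0) {
                    break; // the target cannot accept more data right now
                }
                pos += n;
            }
            return pos - start;
        }
    }
}
```

In a server the target would be the SocketChannel of the client connection, and start and end would come from the sendfile attributes of the request.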
6. Implementation of sendfile in the APR channel
The sendfile implementation in the NIO channel is already fairly complex, and the one in the APR channel is more complex still. Recall that in the NIO channel the transfer is carried out by each poller thread through FileChannel.transferTo. transferTo blocks, which means the poller thread is blocked while a file is being sent. As seen earlier when studying the Tomcat front end, poller threads are precious and do not exist only to serve sendfile, so this can turn the poller threads into a bottleneck and slow down the whole Tomcat front end. With that in mind, the APR configuration is easy to read.
There is nothing more to say about useSendfile; it is the global sendfile switch. sendfileThreadCount is specific to the APR channel, where the sendfile work is separated from the poller threads: the SendfileData structure is handed directly to a dedicated sendfile thread.
The advantage is self-evident: the poller threads keep doing poller work, and when a sendfile request comes in, a sendfile thread steps up and takes the job. Finally, since the APR channel calls the APR library through JNI, its sendfile is naturally not a Java API.
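The idea of moving file transfers off the poller thread can be modeled in pure Java. The sketch below is an assumption-laden illustration, not the APR implementation: the class SendfileWorkers is hypothetical, a fixed thread pool plays the role of the sendfile threads, and FileChannel.transferTo stands in for the native call that APR makes through JNI.

```java
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical model: dedicated "sendfile" threads take transfers off the
// calling (poller) thread. This is NOT Tomcat's AprEndpoint code.
final class SendfileWorkers implements AutoCloseable {
    private final ExecutorService pool;

    SendfileWorkers(int sendfileThreadCount) {
        this.pool = Executors.newFixedThreadPool(sendfileThreadCount);
    }

    // Queues the transfer and returns immediately, so the caller is never
    // blocked by the copy. The Future completes with the bytes transferred.
    Future<Long> submit(Path file, long start, long end, WritableByteChannel target) {
        return pool.submit(() -> {
            try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
                long pos = start;
                while (pos < end) {
                    long n = in.transferTo(pos, end - pos, target);
                    if (n <= 0) {
                        break; // end of file reached or target not writable
                    }
                    pos += n;
                }
                return pos - start;
            }
        });
    }

    @Override
    public void close() {
        pool.shutdown(); // lets queued transfers finish, accepts no new ones
    }
}
```

The thread that calls submit plays the poller here: it hands over the work and goes straight back to its own loop, which is the separation that sendfileThreadCount configures.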
Summary:
sendfile is at heart an operating-system optimization. Tomcat builds on it with a different implementation and a different configuration in each channel, but in every case what ultimately gets called is the sendfile system call of the operating system.
Reference: http://www.linuxjournal.com/article/6345?page=0,0