NFS (Network File System) is an integral part of distributed computing systems. It allows remote file systems to be shared and mounted across heterogeneous networks.
1.1.1 NFS Overview
Developed by Sun, NFS has become a standard for file services (RFC 1094, RFC 1813). Its main function is to let computers running different operating systems share data over the network, so an NFS server can also be regarded as a file server, as shown in Figure 1-1. Besides Samba, NFS is another way to share files among Windows, Linux, and UNIX systems.
Figure 1-1 NFS can be used as a file server
A client PC can mount a directory exported by the NFS server. Once mounted, the directory looks like a local disk partition, and you can use cp, cd, mv, rm, df, and other disk-related commands on it. NFS has its own protocol and port number, but it relies on Remote Procedure Call (RPC) to transfer data and related information.
1.1.2 Why NFS Is Used
NFS aims to let computers share resources. While it was being developed in the 1980s, the computer industry was growing rapidly. Low-cost CPUs and client/server technology drove the growth of distributed computing environments. However, while processor prices dropped, large-capacity storage systems remained expensive. A mechanism was therefore needed that let each computer make full use of its own processor while sharing storage resources and data with others. NFS emerged to meet this need.
1.1.3 NFS Protocol
With NFS, a client can transparently access the file system on a server. This differs from FTP, which transfers files: FTP produces a complete copy of the file, whereas NFS accesses only the part of the file that a process references. The goal is to make this access transparent, meaning that any client program that can access a local file can access an NFS file without modification.
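The contrast with FTP can be illustrated with a minimal Python sketch. The helper name read_range and the path in the usage note are hypothetical; the point is only that an ordinary seek-and-read touches just the requested byte range, and that the same code works unchanged whether the path is on a local disk or under an NFS mount.

```python
def read_range(path, offset, length):
    """Return up to `length` bytes starting at byte `offset` of `path`.

    Nothing here is NFS-specific: the program uses the normal file API,
    and the operating system decides how to fetch the bytes.
    """
    with open(path, "rb") as f:
        f.seek(offset)          # jump to the requested position
        return f.read(length)   # read only the requested part
```

For example, assuming an NFS export is mounted at the hypothetical path /mnt/nfs, read_range("/mnt/nfs/data.bin", 4096, 512) would request 512 bytes from the middle of the file without first copying the whole file.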
NFS is a client/server application built on Sun RPC. The client accesses files by sending RPC requests to an NFS server. This could be implemented with ordinary user processes: the NFS client could be a user process that explicitly calls the server, and the server could also be a user process. NFS is generally not implemented this way, for two reasons. First, access to an NFS file must be transparent to the client, so NFS client calls are made by the client operating system on behalf of the user process. Second, for efficiency, the NFS server is implemented inside the server operating system. If the NFS server were a user process, every client request and server reply (including the data being read or written) would have to cross the boundary between the kernel and the user process, which is too costly. Version 3 of the NFS protocol was released in 1993. Figure 1-2 shows the typical structure of an NFS client and an NFS server.
Figure 1-2 Typical structure of the NFS client and NFS server
(1) Whether a file is local or an NFS file is transparent to the client. The kernel determines this when the file is opened. After the file is opened, the kernel passes all references to a local file to the box labeled "local file access" and all references to an NFS file to the box labeled "NFS client".
(2) The NFS client sends RPC requests to the NFS server through its TCP/IP module. NFS mainly uses UDP, although newer implementations can also use TCP.
(3) The NFS server receives client requests as UDP datagrams on port 2049. Although NFS could use the port mapper, which would allow the server to use an ephemeral port, most implementations hard-code UDP port 2049.
(4) When the NFS server receives a client request, it passes the request to its local file access routines, which access a local disk file on the server host.
(5) It takes the NFS server some time to process a client request, usually because it has to access the local file system. During this time the server should not block requests from other clients. For this reason most NFS servers are multithreaded; that is, multiple NFS servers are in effect running inside the server kernel. The specific implementation depends on the operating system. Since most UNIX kernels are not multithreaded, a common technique is to start multiple instances of a user process (often called "nfsd"). Each instance executes a single system call and remains inside the operating system kernel as a kernel process.
(6) Similarly, on the client host it takes the NFS client some time to process a request from a user process: the NFS client sends an RPC call to the server host and waits for the server's reply. To give user processes on the client host more concurrency, multiple NFS clients generally run inside the client kernel. Again, the specific implementation depends on the operating system.
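Step (1) above can be sketched as a lookup in a mount table. This is a simplified illustration in Python, not how any particular kernel implements it; the table contents, the device name, and the function name resolve_backend are assumptions made for the example.

```python
import posixpath

# Hypothetical mount table: mount point -> (file system type, source).
MOUNT_TABLE = {
    "/": ("local", "/dev/sda1"),
    "/mnt/nfs": ("nfs", "server:/export"),
}

def resolve_backend(path, mount_table=MOUNT_TABLE):
    """Pick the mount entry whose mount point is the longest prefix of path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point in mount_table:
        # "/mnt/nfs" must match "/mnt/nfs/a" but not "/mnt/nfsx/a".
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        raise ValueError("no mount point covers " + path)
    return mount_table[best]
```

With this table, a path under /mnt/nfs resolves to the "nfs" entry and would be handed to the NFS client, while every other absolute path resolves to the "local" entry. The calling program never sees the difference.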
1.1.4 RPC
NFS supports many functions, and each function is started by a different program. Every function that is started opens some ports for transmitting data. The ports used by NFS functions are therefore not fixed; instead, each function picks a random unused port below 1024. This creates a problem: to connect to the server, the client must know which server port to use. This is where the Remote Procedure Call (RPC) service comes in. The main job of RPC is to keep track of the port number for each NFS function and report it to the client so that the client can connect to the correct port. When the server starts NFS, it picks several ports at random and registers them with RPC, so RPC knows which NFS function each port serves. RPC itself listens for client requests on the fixed port 111 and answers with the correct port, which simplifies NFS startup. Note: RPC must be started before NFS; otherwise NFS cannot register with RPC. In addition, when RPC is restarted, all previously registered data is lost, so every program managed by RPC must be restarted afterward to register again.
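The registration behavior described above can be modeled with a small in-memory table. The following Python sketch is a toy model, not a real RPC service: the class name ToyPortmapper and the port chosen for the mount daemon are assumptions for illustration. The program numbers 100003 (NFS) and 100005 (mount daemon) are the standard ONC RPC assignments, and a lookup result of 0 means "not registered".

```python
NFS_PROGRAM = 100003      # standard ONC RPC program number for NFS
MOUNTD_PROGRAM = 100005   # standard ONC RPC program number for the mount daemon

class ToyPortmapper:
    """Toy stand-in for the RPC registration service on port 111."""

    LISTEN_PORT = 111

    def __init__(self):
        self._table = {}  # (program, version, protocol) -> port

    def register(self, program, version, protocol, port):
        """Called by a service at startup to announce its port."""
        self._table[(program, version, protocol)] = port

    def getport(self, program, version, protocol):
        """Called on behalf of a client; 0 means 'not registered'."""
        return self._table.get((program, version, protocol), 0)

    def restart(self):
        """A restart loses every registration, as the note above warns."""
        self._table.clear()
```

A short walk-through: after register(NFS_PROGRAM, 3, "udp", 2049), getport(NFS_PROGRAM, 3, "udp") returns 2049. After restart(), the same lookup returns 0 until the NFS service registers again, which is why the services must be restarted after RPC.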
When a client wants to access a file over NFS, how does it request data from the server?
(1) The client sends an NFS file access request to RPC on the server (port 111).
(2) The server looks up the registered port of the corresponding NFS daemon and returns it to the client.
(3) Once the client knows the correct port, it connects directly to the NFS daemon.
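On the wire, the lookup in steps (1) and (2) is a GETPORT call to the port mapper. The Python sketch below builds such a request and decodes a reply, assuming port mapper version 2, AUTH_NONE credentials, and UDP transport (TCP would additionally need RPC record marking). The function names are hypothetical, and no network traffic is sent; the code only packs and unpacks the big-endian 32-bit fields.

```python
import struct

PMAP_PROGRAM, PMAP_VERSION, PMAPPROC_GETPORT = 100000, 2, 3
NFS_PROGRAM = 100003
IPPROTO_TCP, IPPROTO_UDP = 6, 17

def build_getport_call(xid, program, version, protocol):
    """Build a GETPORT request asking where (program, version) listens."""
    # RPC call header: xid, CALL (0), RPC version 2, then program/version/procedure.
    header = struct.pack(">6I", xid, 0, 2,
                         PMAP_PROGRAM, PMAP_VERSION, PMAPPROC_GETPORT)
    # AUTH_NONE credential and verifier: flavor 0, length 0, twice.
    auth_none = struct.pack(">4I", 0, 0, 0, 0)
    # Arguments: program, version, protocol, port (ignored in a query).
    args = struct.pack(">4I", program, version, protocol, 0)
    return header + auth_none + args

def parse_getport_reply(data, expected_xid):
    """Return the port from an accepted, successful GETPORT reply."""
    xid, msg_type, reply_stat, _flavor, verf_len = struct.unpack(">5I", data[:20])
    if xid != expected_xid or msg_type != 1 or reply_stat != 0:
        raise ValueError("not an accepted reply to this call")
    offset = 20 + (verf_len + 3) // 4 * 4   # skip the padded verifier body
    (accept_stat,) = struct.unpack(">I", data[offset:offset + 4])
    if accept_stat != 0:
        raise ValueError("call was not executed successfully")
    (port,) = struct.unpack(">I", data[offset + 4:offset + 8])
    return port   # 0 means the program is not registered
```

In step (3), the client would then send its NFS requests to the returned port. Sending the request with a UDP socket to port 111 and handling timeouts are left out to keep the sketch short.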
Because every NFS function must register with RPC, RPC knows the port number, PID, and IP address of each NFS function on the host, and the client can find the correct port by querying RPC. In other words, NFS can provide service only when RPC is running, which is why NFS is called an RPC server. Many other servers register with RPC in the same way; NIS (Network Information Service), for example, is also an RPC server. As shown in Figure 1-3, RPC must be started on both the client and the server to use NFS.
Figure 1-3 Relationship between NFS, RPC services, and the operating system
The NFS protocol has gone through several versions since its introduction, including NFS V2 (RFC 1094) and NFS V3 (RFC 1813); the latest version is V4 (RFC 3010). Sun originally designed NFS V2 to use only UDP, mainly because the limited memory, network speed, and CPU power of machines at the time called for a lightweight transport. NFS V3 added TCP as a supported transport. The main differences between V3 and V2 are as follows.
(1) File size.
V2 supports up to 32-bit file sizes (4 GB), while V3 supports 64-bit file sizes.
(2) File transfer size.
V3 does not limit the transfer size, whereas V2 allows at most 8 KB. The size can be set with the rsize and wsize mount options.
(3) More complete return information.
V3 returns more detailed error and success information, which makes servers easier to configure and manage.
(4) Support for the TCP transport protocol.
V2 supports only UDP, which is a limitation in some demanding network environments. V3 adds support for TCP. UDP is fast and, being connectionless, convenient, but it is less reliable than TCP. When the network is unstable or under attack, NFS performance over UDP can degrade badly, or the service can even stop working. The transport protocol should therefore be chosen to suit the network. The default NFS transport protocol is UDP, but the RHEL 4.0 kernel supports NFS over TCP. To use NFS over TCP, include the "-o tcp" option when mounting the NFS-exported file system on the client. The advantages and disadvantages of using TCP are as follows.
- Connections are more durable, so fewer "NFS stale file handle" messages occur.
- Performance improves on heavily loaded networks, because TCP acknowledges every packet, whereas UDP acknowledges only on completion.
- TCP has congestion control, which UDP lacks entirely. On a severely congested network, UDP packets are the first to be dropped. With UDP, if NFS is writing data (in 8 KB units), the entire 8 KB must be retransmitted. Because TCP is reliable, only the lost portion of the 8 KB needs to be retransmitted.
- Error detection. When a TCP connection breaks (because the server is down), the client stops sending data and starts trying to reconnect. UDP is connectionless, so a client using it keeps sending data onto the network until the server comes back online.
- The main disadvantage is a small performance cost caused by the overhead of the TCP protocol.
(5) Asynchronous writes.
(6) Improved server mount performance.
(7) Better I/O write performance.
(8) More efficient network operation.
(9) Stronger disaster recovery.
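The mount options mentioned in items (2) and (4) can be combined in a single mount command. The Python sketch below only assembles the argument list and does not run it; the function name build_mount_command and the server, export, and mount point in the usage note are placeholders, and the default values are illustrative rather than recommended settings.

```python
def build_mount_command(server, export, mount_point,
                        proto="tcp", nfsvers=3, rsize=8192, wsize=8192):
    """Assemble (but do not run) a mount command for an NFS export."""
    if proto not in ("tcp", "udp"):
        raise ValueError("proto must be 'tcp' or 'udp'")
    options = ",".join([
        proto,                   # transport: the "-o tcp" option from the text
        f"nfsvers={nfsvers}",    # protocol version
        f"rsize={rsize}",        # read transfer size in bytes
        f"wsize={wsize}",        # write transfer size in bytes
    ])
    return ["mount", "-t", "nfs", "-o", options,
            f"{server}:{export}", mount_point]
```

For example, build_mount_command("server", "/export", "/mnt/nfs") produces the arguments for mount -t nfs -o tcp,nfsvers=3,rsize=8192,wsize=8192 server:/export /mnt/nfs. The list could be passed to subprocess.run by a process with sufficient privileges.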
On Linux, UDP is the default protocol. A Linux NFS server offers no choice, but a Linux client can use TCP to interoperate with other UNIX NFS servers that use TCP. On a LAN, UDP is the better choice, because the network is stable and UDP offers better performance. Linux uses V2 by default, but another version can be selected with the nfsvers=n mount option. NFS runs at the application layer of the OSI layered model, on top of the protocols and services provided by TCP/IP, as shown in the following table.
Table NFS on the OSI Hierarchical Model