Linux/UNIX Advanced I/O

Source: Internet
Author: User
Tags flock
Non-blocking I/O

Non-blocking I/O lets us issue I/O operations such as open, read, and write without blocking indefinitely. If the operation cannot be completed, the call returns immediately with an error indicating that the operation would have blocked.

There are two ways to specify non-blocking I/O for a given descriptor:

1) If you call open to obtain the descriptor, you can specify the O_NONBLOCK flag.

2) For a descriptor that is already open, you can call the fcntl function to turn on the O_NONBLOCK file status flag.
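The second method can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names set_nonblocking and read_nonblocking, and the -2 return convention, are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn on O_NONBLOCK for an already open descriptor.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);      /* fetch current status flags */
    if (flags == -1)
        return -1;
    /* Add O_NONBLOCK while keeping the flags that are already set. */
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: read from a non-blocking descriptor.
 * Returns the number of bytes read (0 at end of file),
 * -2 if the read would have blocked, or -1 on any other error. */
ssize_t read_nonblocking(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;                          /* no data available yet */
    return n;
}
```

Reading the flags with F_GETFL before calling F_SETFL matters: passing only O_NONBLOCK to F_SETFL would clear other status flags such as O_APPEND.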

Record lock

Record locking: while one process is reading or modifying part of a file, it can prevent other processes from modifying the same region of the file. What is locked is a region (a range of bytes) of the file.

Record locks are managed with the fcntl function:

#include <unistd.h>
#include <fcntl.h>

int fcntl(int fd, int cmd, ... /* arg */);

For record locking, cmd is F_GETLK, F_SETLK, or F_SETLKW. The third parameter, arg, is a pointer to a flock structure, which is defined as follows:

struct flock {
    ...
    short l_type;    /* Type of lock: F_RDLCK,
                        F_WRLCK, F_UNLCK */
    short l_whence;  /* How to interpret l_start:
                        SEEK_SET, SEEK_CUR, SEEK_END */
    off_t l_start;   /* Starting offset for lock */
    off_t l_len;     /* Number of bytes to lock */
    pid_t l_pid;     /* PID of process blocking our lock
                        (F_GETLK only) */
    ...
};

The members of the flock structure are described as follows:

The type of lock desired is given by l_type: F_RDLCK (a shared read lock), F_WRLCK (an exclusive write lock), or F_UNLCK (unlock a region).

The starting byte offset of the region being locked or unlocked is determined by l_whence and l_start.

The size of the region in bytes is given by l_len. If l_len is 0, the lock extends from the starting offset to the largest possible offset of the file.

The ID of the process holding a lock that can block the current process is returned in l_pid (by F_GETLK only).

Shared read locks and exclusive write locks: any number of processes can hold a shared read lock on a given byte, but only one process can hold an exclusive write lock on a given byte. If a byte has one or more read locks, no write lock can be placed on it; if it has a write lock, no read lock can be placed on it.

These compatibility rules apply to lock requests made by different processes. They do not apply to multiple lock requests made by a single process. If a process already holds a lock on a range of a file and then tries to place another lock on the same range, the new lock replaces the old one.

To obtain a read lock, the descriptor must be open for reading. To obtain a write lock, the descriptor must be open for writing.

The three fcntl commands are described below:

F_GETLK: Determines whether the lock described by arg would be blocked by a lock held by another process. If such a lock exists, information about it overwrites the structure pointed to by arg. If no such lock exists, the structure pointed to by arg is left unchanged except that l_type is set to F_UNLCK.

F_SETLK: Sets the lock described by arg. If we try to obtain a read or write lock and the compatibility rules do not allow it (for example, another process already holds a write lock on the region), fcntl returns immediately with an error (errno is EACCES or EAGAIN). The same command clears a lock when l_type is F_UNLCK.

F_SETLKW: The blocking version of F_SETLK (the W stands for wait). If the lock cannot be granted because another process holds a conflicting lock, the calling process sleeps until the lock becomes available or a signal interrupts the call.
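The three commands can be wrapped in small helpers so that callers do not have to fill in a flock structure every time. The following is a sketch under stated assumptions: the names lock_reg, lock_test, and un_lock are hypothetical, and read_lock and write_lock are written to match the argument order used by the calls later in this article, although their bodies here are only one possible implementation.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: apply one record-lock command to a byte range.
 * cmd is F_SETLK or F_SETLKW; type is F_RDLCK, F_WRLCK, or F_UNLCK. */
int lock_reg(int fd, int cmd, short type, off_t start, short whence, off_t len)
{
    struct flock lk = {0};
    lk.l_type = type;
    lk.l_whence = whence;
    lk.l_start = start;
    lk.l_len = len;                 /* 0 means "to the largest offset" */
    return fcntl(fd, cmd, &lk);
}

/* Non-blocking convenience wrappers (assumed names). */
int read_lock(int fd, off_t start, short whence, off_t len)
{
    return lock_reg(fd, F_SETLK, F_RDLCK, start, whence, len);
}

int write_lock(int fd, off_t start, short whence, off_t len)
{
    return lock_reg(fd, F_SETLK, F_WRLCK, start, whence, len);
}

int un_lock(int fd, off_t start, short whence, off_t len)
{
    return lock_reg(fd, F_SETLK, F_UNLCK, start, whence, len);
}

/* Hypothetical helper built on F_GETLK.
 * Returns 0 if no other process would block the requested lock,
 * the PID of a blocking process otherwise, or -1 on error. */
pid_t lock_test(int fd, short type, off_t start, short whence, off_t len)
{
    struct flock lk = {0};
    lk.l_type = type;
    lk.l_whence = whence;
    lk.l_start = start;
    lk.l_len = len;
    if (fcntl(fd, F_GETLK, &lk) == -1)
        return -1;
    return lk.l_type == F_UNLCK ? 0 : lk.l_pid;
}
```

Because F_GETLK only reports locks held by other processes, lock_test returns 0 for a region that the calling process itself has locked.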

Implicit inheritance and release of locks:

Three rules govern the automatic inheritance and release of record locks:

1. Locks are associated with a process and a file. This has two consequences. The first is obvious: when a process terminates, all of its locks are released. The second is less obvious: whenever a descriptor is closed, any locks that the process holds on the file referenced by that descriptor are released, even if the process still has other descriptors open on the same file.

2. Locks are never inherited by a child across fork. If a process obtains a lock and then calls fork, the child is treated as a different process with respect to that lock. For any descriptor it inherits, the child must call fcntl to obtain its own locks. This is consistent with the purpose of locking: locks prevent multiple processes from writing to the same file at the same time, and if the child inherited the parent's locks, parent and child could both write to the file simultaneously.

3. Locks are inherited by the new program across exec. However, if the close-on-exec flag is set for a file descriptor, all locks on the underlying file are released when that descriptor is closed as part of exec.
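Rule 2 can be observed directly. In the sketch below, the parent locks the first byte of a file, forks, and the child asks F_GETLK whether it could lock the same byte. The function name child_blocked_by_parent_lock and its return convention are assumptions made for this illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the child finds that another process
 * (the parent) holds a conflicting lock, 0 if it does not, -1 on error.
 * fd must be open for writing. The parent still holds the lock on return. */
int child_blocked_by_parent_lock(int fd)
{
    struct flock lk = {0};
    lk.l_type = F_WRLCK;
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 1;
    if (fcntl(fd, F_SETLK, &lk) == -1)      /* parent takes the write lock */
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: the descriptor is inherited, the lock is not. */
        struct flock probe = {0};
        probe.l_type = F_WRLCK;
        probe.l_whence = SEEK_SET;
        probe.l_start = 0;
        probe.l_len = 1;
        if (fcntl(fd, F_GETLK, &probe) == -1)
            _exit(2);
        /* l_type stays a lock type only if someone else blocks us. */
        _exit(probe.l_type != F_UNLCK ? 1 : 0);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}
```

If the child had inherited the lock, F_GETLK would report the region as free, because a process is never blocked by its own locks.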

Lock data structure implementation

Consider a process that runs the following statements:

fd1 = open(pathname, ...);
write_lock(fd1, 0, SEEK_SET, 1);
if ((pid = fork()) > 0) {
    fd2 = dup(fd1);
    fd3 = open(pathname, ...);
} else if (pid == 0) {
    read_lock(fd1, 1, SEEK_SET, 1);
}

write_lock and read_lock are wrappers that call fcntl to set the corresponding lock.

The data structures after both the parent and the child have paused are described below (the original figure is not reproduced here):

With record locking, lockf structures are added to the existing data structures and are linked together from the i-node. Note that each lockf structure describes one locked region (defined by an offset and a length) for a given process. The figure shows two lockf structures: one created by the parent's call to write_lock and one created by the child's call to read_lock. Each structure contains the ID of the corresponding process. In the parent, closing any one of fd1, fd2, or fd3 releases the parent's write lock.

Advisory locks and mandatory locks

An advisory lock works as follows: every process that uses the locked file must itself check whether a lock exists and, of course, respect existing locks. The kernel does not enforce advisory locks; it relies on programmers to follow the convention. (Linux uses advisory locks by default.)

A mandatory lock is enforced by the kernel. When a file region is locked for writing, the kernel blocks any read or write access to it until the process holding the lock releases it, which means the kernel checks for locks on every read and write.

Examples:

Example 1: Suppose several processes (not necessarily related) all use the fcntl mechanism when they operate on a file. These are called cooperating processes.
Now suppose a rogue process ignores the convention and simply opens the file and writes to it. The fcntl locks held by the cooperating processes cannot stop it, so that process is non-cooperating, and the final state of the file is undefined. Because this kind of lock does not constrain other forms of access, it is called an advisory lock. Mandatory locks require kernel support: the kernel checks for locks on open, read, and write.

Example 2: An advisory lock assumes that everyone follows the rules. For example, people and cars stop at a red light and go on green. That is an advisory lock: it is only a rule, and nothing physically prevents someone from running the red light. A mandatory lock means that you cannot run the red light even if you want to.

I/O multiplexing

I/O multiplexing: we first build a list of the descriptors we are interested in and then call a function that does not return until at least one of those descriptors is ready for I/O. On return, the function tells us which descriptors are ready.

The poll, pselect, and select functions provide I/O multiplexing.

Select and pselect functions

#include <sys/select.h>

int select(int nfds, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);

The select function lets us perform I/O multiplexing. Its first parameter, nfds, is the highest-numbered descriptor in any of the three sets, plus 1.

The last parameter specifies how long to wait:

struct timeval {
    long tv_sec;   /* seconds */
    long tv_usec;  /* microseconds */
};

There are three scenarios:

timeout == NULL: Wait forever. The wait can be interrupted if a signal is caught. select returns when one of the specified descriptors is ready or when a signal is caught. If a signal is caught, select returns -1 with errno set to EINTR.

timeout->tv_sec == 0 && timeout->tv_usec == 0: Do not wait at all. All the specified descriptors are tested, and select returns immediately. This is a way to poll the state of multiple descriptors without blocking in select.

timeout->tv_sec != 0 || timeout->tv_usec != 0: Wait for the specified number of seconds and microseconds. select returns when one of the specified descriptors is ready or when the timeout expires. If the timeout expires before any descriptor is ready, the return value is 0. As in the first case, the wait can be interrupted by a caught signal.

The three middle parameters, readfds, writefds, and exceptfds, are pointers to descriptor sets. These three sets specify the descriptors we are interested in and the condition for each: readable, writable, or having an exceptional condition pending. Each descriptor set is stored in an fd_set data type. The only things we can do with an fd_set are allocate a variable of this type, assign one variable of this type to another, or use one of the following four macros on it.

#include <sys/select.h>

void FD_CLR(int fd, fd_set *set);    // clear the bit for fd in the set
int  FD_ISSET(int fd, fd_set *set);  // test whether the bit for fd is set: returns a nonzero
                                     // value if fd is in the set; otherwise, 0 is returned
void FD_SET(int fd, fd_set *set);    // turn on the bit for fd in the set
void FD_ZERO(fd_set *set);           // clear all the bits in the set
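The macros and the zero-timeout case combine into a simple non-blocking readiness check. This is a minimal sketch; the function name is_readable_now is an assumption made for this example.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: poll a single descriptor for readability.
 * Returns 1 if a read would not block, 0 if no data is ready, -1 on error. */
int is_readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = {0, 0};     /* zero timeout: test and return at once */

    FD_ZERO(&rfds);                 /* always clear the set before use */
    FD_SET(fd, &rfds);              /* watch only this descriptor */

    /* nfds is the highest descriptor number in any set, plus one. */
    if (select(fd + 1, &rfds, NULL, NULL, &tv) == -1)
        return -1;

    /* select cleared the bit if fd is not ready. */
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

The set has to be rebuilt before each call because select overwrites it with the result.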

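The test program is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>

int main(void)
{
    fd_set rfds;
    struct timeval tv;
    int retval;
    ssize_t n;
    char buf[1024];

    for (;;) {
        /* select modifies the set, so rebuild it on every iteration. */
        FD_ZERO(&rfds);
        FD_SET(STDIN_FILENO, &rfds);

        /* Wait up to five seconds. */
        tv.tv_sec = 5;
        tv.tv_usec = 0;

        /* nfds is the highest descriptor number plus one. */
        retval = select(STDIN_FILENO + 1, &rfds, NULL, NULL, &tv);
        /* Don't rely on the value of tv now! */

        if (retval == -1) {
            perror("select");
            exit(1);
        } else if (retval) {
            printf("Data is available now.\n");
            if (FD_ISSET(STDIN_FILENO, &rfds)) {
                /* Leave room for the terminating NUL that printf needs. */
                n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
                if (n <= 0)
                    break;          /* end of file or read error */
                buf[n] = '\0';
                printf("Read buf is:%s\n", buf);
            }
        } else {
            printf("No data within five seconds.\n");
        }
    }
    exit(0);
}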

Execution result

hello

Data is available now.

Read buf is: hello

No data within five seconds.

No data within five seconds.

world

Data is available now.

Read buf is: world

No data within five seconds.

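pselect does the same job as select, with three differences. Its timeout is specified with a timespec structure (seconds and nanoseconds) instead of a timeval structure. It takes an additional sigmask argument that specifies a signal mask to be installed atomically for the duration of the call. Its timeout is declared const, so the value is guaranteed not to be modified.

#include <sys/select.h>

int pselect(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
            const struct timespec *timeout, const sigset_t *sigmask);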

Poll function

The poll function is similar to select, but its programming interface is different.

#include <poll.h>

int poll(struct pollfd *fds, nfds_t nfds, int timeout);

Unlike select, which builds one descriptor set per condition, poll takes an array of pollfd structures, one element per descriptor. The structure is defined as follows:

struct pollfd {
    int   fd;       /* file descriptor */
    short events;   /* requested events */
    short revents;  /* returned events */
};

The number of elements in the fds array is given by nfds. The timeout parameter specifies how long to wait:

timeout == -1: Wait forever

timeout == 0: Do not wait

timeout > 0: Wait for timeout milliseconds
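The three timeout cases can be tried with a small wrapper around poll. This is a sketch under stated assumptions: the name wait_readable and its return convention are chosen for this example only.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd can be read without blocking.
 * timeout_ms follows poll's convention: -1 waits forever, 0 does not wait,
 * and a positive value waits that many milliseconds.
 * Returns 1 if fd is ready (data or end of file), 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;            /* we only ask about readability */
    pfd.revents = 0;

    int n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;                   /* 0 = timed out, -1 = error */

    /* POLLHUP means the peer closed; a read then returns end of file. */
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}
```

Unlike select, the events member is not overwritten by the call, so the same pollfd array can be reused without being rebuilt.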

Readv and writev functions

These two functions read into and write from multiple noncontiguous buffers in a single function call.

#include <sys/uio.h>

ssize_t readv(int fd, const struct iovec *iov, int iovcnt);

ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

The second parameter of both functions is a pointer to an array of iovec structures:

struct iovec {
    void   *iov_base;  /* Starting address */
    size_t  iov_len;   /* Number of bytes to transfer */
};

The first member of the iovec structure points to the start of a buffer, and the second member gives its length. The number of elements in the iov array is given by iovcnt.

writev gathers the output data from the buffers in order, iov[0], iov[1], through iov[iovcnt-1], and returns the total number of bytes written.

readv scatters the input data into the buffers in the same order, always filling one buffer before moving on to the next, and returns the total number of bytes read.
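A gather write and a scatter read with two buffers each can be sketched as follows. The helper names write_two and read_two are hypothetical and exist only for this illustration.

```c
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write two strings with a single writev call.
 * Returns the total number of bytes written, or -1 on error. */
ssize_t write_two(int fd, const char *head, const char *body)
{
    struct iovec iov[2];
    iov[0].iov_base = (void *)head;     /* writev does not modify the data */
    iov[0].iov_len = strlen(head);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = strlen(body);
    return writev(fd, iov, 2);
}

/* Hypothetical helper: read into two buffers with a single readv call.
 * Buffer a is filled completely before any byte goes into b.
 * Returns the total number of bytes read, or -1 on error. */
ssize_t read_two(int fd, char *a, size_t alen, char *b, size_t blen)
{
    struct iovec iov[2];
    iov[0].iov_base = a;
    iov[0].iov_len = alen;
    iov[1].iov_base = b;
    iov[1].iov_len = blen;
    return readv(fd, iov, 2);
}
```

A typical use is sending a fixed-size header and a variable-size payload without first copying them into one contiguous buffer.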

Memory-mapped I/O

Memory-mapped I/O maps a file on disk into a buffer in memory. Fetching bytes from the buffer then reads the corresponding bytes of the file, and storing data in the buffer automatically writes the corresponding bytes to the file. In this way, I/O can be performed without calling read or write.

To use this feature, we first ask the kernel to map a given file into a region of memory, which is done with the mmap function. The munmap function removes a mapping.

#include <sys/mman.h>

void *mmap(void *addr, size_t length, int prot, int flags,
           int fd, off_t offset);

int munmap(void *addr, size_t length);

The addr parameter specifies the address at which the mapped region should start. It is usually set to 0, which lets the system choose the starting address. The return value of mmap is the actual starting address of the mapped region.

fd is the descriptor of the file to be mapped; the file must be opened before it can be mapped into the address space. length is the number of bytes to map, and offset is the starting offset in the file of the bytes to map.

The prot parameter specifies the protection of the mapped region. It is either PROT_NONE or the bitwise OR of any combination of PROT_READ, PROT_WRITE, and PROT_EXEC. The protection requested for the mapped region cannot allow more access than the mode in which the file was opened.

The flags parameter can be set to:

MAP_FIXED: the return value must equal addr.

MAP_SHARED: store operations on the mapped region modify the mapped file.

MAP_PRIVATE: store operations on the mapped region create a private copy of the mapped file. All subsequent references to the mapped region refer to the copy rather than to the original file.

You can call mprotect to change the protection of an existing mapped region.

#include <sys/mman.h>

int mprotect(void *addr, size_t len, int prot);

If pages in a shared mapping have been modified, you can call msync to flush them to the mapped file.

#include <sys/mman.h>

int msync(void *addr, size_t length, int flags);
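A store through a MAP_SHARED mapping followed by msync can be sketched as follows. The function name poke_first_byte is an assumption made for this example, and the sketch expects an existing, non-empty file.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first byte of a file through a
 * shared mapping and flush the change. Returns 0 on success, -1 on error. */
int poke_first_byte(const char *path, char value)
{
    struct stat st;
    int fd = open(path, O_RDWR);    /* PROT_WRITE + MAP_SHARED needs O_RDWR */
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        close(fd);                  /* a zero-length mapping is not allowed */
        return -1;
    }

    size_t len = (size_t)st.st_size;
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    p[0] = value;                   /* an ordinary store modifies the file */

    int rc = msync(p, len, MS_SYNC);    /* wait until the page is written */
    if (munmap(p, len) == -1)
        rc = -1;
    return rc;
}
```

With MAP_PRIVATE instead of MAP_SHARED, the same store would change only the process's private copy and leave the file untouched.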

The following program uses mmap as an example:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Report the failed call and stop; continuing after an error would
   pass invalid descriptors or pointers to the later calls. */
static void die(const char *msg)
{
    perror(msg);
    exit(1);
}

int main(int argc, char *argv[])
{
    int         fdin, fdout;
    void        *src, *dst;
    struct stat statbuf;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <fromfile> <tofile>\n", argv[0]);
        exit(1);
    }

    if ((fdin = open(argv[1], O_RDONLY)) < 0)
        die("can't open input file for reading");

    if ((fdout = open(argv[2], O_RDWR | O_CREAT | O_TRUNC,
                      S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)) < 0)
        die("can't create output file for writing");

    if (fstat(fdin, &statbuf) < 0)      /* need size of input file */
        die("fstat error");

    /* set size of output file by writing one byte at its last offset */
    if (lseek(fdout, statbuf.st_size - 1, SEEK_SET) == -1)
        die("lseek error");
    if (write(fdout, "", 1) != 1)
        die("write error");

    if ((src = mmap(0, statbuf.st_size, PROT_READ, MAP_SHARED,
                    fdin, 0)) == MAP_FAILED)
        die("mmap error for input");

    if ((dst = mmap(0, statbuf.st_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                    fdout, 0)) == MAP_FAILED)
        die("mmap error for output");

    memcpy(dst, src, statbuf.st_size);  /* does the file copy */
    exit(0);
}

This program works like a simplified version of cp.
