APUE Learning: Advanced I/O

Source: Internet
Author: User
Tags flock

Non-blocking I/O

Nonblocking I/O means that a call does not block when the requested operation cannot complete right away. Instead, it returns immediately with an error indicating that the operation would have blocked.
There are two ways to make a file descriptor nonblocking:
1. Specify the O_NONBLOCK flag when calling open.
2. For a descriptor that is already open, call fcntl to turn on the O_NONBLOCK file status flag.

Record lock

A record lock (more precisely, a byte-range lock) lets a process lock a region of a file, from a single byte up to the whole file.

fcntl function
int fcntl(int fd, int cmd, ... /* struct flock *flockptr */);

struct flock {
    short l_type;    /* F_RDLCK, F_WRLCK, or F_UNLCK */
    short l_whence;
    off_t l_start;
    off_t l_len;
    pid_t l_pid;
};

Description:
The fcntl function can be used to implement record locking.
The cmd argument is one of:
F_GETLK,
F_SETLK,
F_SETLKW
The third argument is a pointer to a flock structure.

l_type must be one of:
F_RDLCK: a shared read lock
F_WRLCK: an exclusive write lock
F_UNLCK: unlock a region

The starting offset of the region is determined by l_start and l_whence.
l_whence: SEEK_SET, SEEK_CUR, or SEEK_END
l_start: the offset in bytes, relative to l_whence
l_len: the length in bytes of the region to lock

l_pid: the ID of a process holding a lock that conflicts with ours. This field is filled in only by F_GETLK. When another process already holds a conflicting lock, our own request cannot succeed right away: depending on cmd, it either fails or our process is suspended until the lock is released.

Note that a lock can start and extend beyond the current end of the file, but it cannot start or extend before the beginning of the file.

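Attention:

    • Read locks and write locks differ in how a region can be shared: any number of processes can hold a shared read lock on a given byte, but only one process can hold an exclusive write lock on it, and then no read locks can exist on that byte. Note also that a single process holds at most one lock on a given region: if it requests a lock on a region it has already locked, the new lock replaces the old one.
      To obtain a read lock, the descriptor must be open for reading; to obtain a write lock, it must be open for writing.

    • Record locks are tied to a process and a file, which has 3 consequences:
      1. When a process terminates, all its locks are released. In addition, whenever the process closes a descriptor, all of its locks on the file that the descriptor refers to are released.
      2. After fork, the child does not inherit the parent's locks.
      3. After exec, the new program inherits the locks of the original program. If the close-on-exec flag is set on the descriptor, however, the descriptor is closed as part of the exec and the locks on that file are released.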

    • Be careful when locking relative to the end of a file:
      The kernel converts the relative offset into an absolute file offset when it records the lock. So if we lock starting at the end of the file, write (which extends the file), and then unlock relative to the end of the file, the end used by the unlock is no longer the end that was in effect when we set the lock, because the write moved it.
      We cannot work around this by calling lseek to learn the absolute offset first, because another process may change the length of the file between our lseek call and our lock call.

cmd parameter
F_GETLK: tests whether the lock described by flockptr could be acquired, that is, whether another process holds a lock that would block it.
F_SETLK: sets the lock described by flockptr. If the request conflicts with a lock held by another process, fcntl returns immediately with errno set to EACCES or EAGAIN. This command is also used to clear a lock (l_type of F_UNLCK).
F_SETLKW: the blocking version of F_SETLK. If the lock cannot be acquired, the calling process is suspended until the other process releases its lock and we are woken up.

I/O multiplexing overview

Ways for a process to read from multiple descriptors:
1. Create child processes, each of which reads from one descriptor.
2. Use polling in a single process: set every descriptor nonblocking and issue a read on each in turn. If a descriptor has no data ready, the read returns immediately because it is nonblocking, and we move on to the next descriptor, looping over all of them.
This wastes CPU time, because most of the reads find nothing to do.
3. Use asynchronous I/O, asking the kernel to send us a signal when a descriptor has data ready.
However, not every system supports asynchronous I/O, and the signal does not tell us which descriptor is ready. (Using a separate signal for each descriptor does not help, because there are far too few signals.) So to find out which descriptor is ready, we would still have to set every descriptor nonblocking and try each one in turn.

I/O multiplexing:
We build a list of the descriptors we are interested in and call a function that does not return until at least one of them is ready for I/O. On return, the function tells us which descriptors are ready.

select
int select(int maxfdp1, fd_set *restrict readfds,
           fd_set *restrict writefds,
           fd_set *restrict exceptfds,
           struct timeval *restrict tvptr);

Description:
1. maxfdp1: the highest-numbered descriptor we are interested in, plus 1. For example, if the highest descriptor we are interested in is N, then maxfdp1 = N + 1.
Note that the maximum number of file descriptors in a system is limited and can be

Related functions for file descriptor sets

select return value

Return value of select:
1. -1 indicates an error.
2. 0 means that no descriptor became ready before the timeout expired.
3. A positive value is the number of descriptors that are ready.

The meaning of readiness:
1. Ready for reading: a read on the descriptor will not block.
2. Ready for writing: a write on the descriptor will not block.
3. Exception ready: an exception condition is pending on the descriptor.
4. A descriptor that refers to a regular file is always reported as ready for reading, writing, and exception conditions.

Note:
Whether a descriptor is blocking or nonblocking does not affect whether select blocks. For example, if we wait for a nonblocking descriptor to become readable and give select a timeout of 5 seconds, select blocks for up to 5 seconds.
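
As a minimal sketch of the pieces described above (the descriptor set macros, maxfdp1, the timeout, and the return value), the following helper waits for a single descriptor. The name wait_readable is an assumption made for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: wait up to `seconds` for fd to become readable.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int seconds)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    FD_ZERO(&readfds);          /* start with an empty set */
    FD_SET(fd, &readfds);       /* add the descriptor we care about */
    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    /* maxfdp1 is the highest descriptor in any set, plus 1 */
    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n <= 0)
        return n;               /* 0: timed out, -1: error */
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```

Because select overwrites the sets with the ready descriptors, the sets must be rebuilt before every call, which is why the helper initializes readfds each time.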

pselect
int pselect(int maxfdp1, fd_set *restrict readfds,
            fd_set *restrict writefds,
            fd_set *restrict exceptfds,
            const struct timespec *restrict tsptr,
            const sigset_t *restrict sigmask);

The main difference from select is the extra sigmask argument, a signal mask. pselect atomically installs this signal mask when it is called and restores the previous mask when it returns. The timeout is also given as a timespec structure rather than a timeval.
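
One common use of the signal mask argument is to keep a signal blocked everywhere except while waiting, so that the signal cannot slip in between a check and the wait. The sketch below illustrates that idea; the names got_signal, on_signal, and wait_readable_or_signal are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

/* Hypothetical flag and handler, used only for illustration. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical helper: wait for fd to become readable, letting `signo`
 * be delivered only while we are inside pselect.
 * Returns 1 if readable, 0 on timeout, -1 on error
 * (errno == EINTR if a caught signal interrupted the wait). */
int wait_readable_or_signal(int fd, int signo, const struct timespec *timeout)
{
    sigset_t block, prev, waitmask;
    fd_set readfds;
    int n, saved;

    sigemptyset(&block);
    sigaddset(&block, signo);
    if (sigprocmask(SIG_BLOCK, &block, &prev) == -1)   /* block signo for now */
        return -1;

    waitmask = prev;
    sigdelset(&waitmask, signo);     /* mask to use while waiting: signo unblocked */

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    n = pselect(fd + 1, &readfds, NULL, NULL, timeout, &waitmask);

    saved = errno;
    sigprocmask(SIG_SETMASK, &prev, NULL);   /* restore the caller's mask */
    errno = saved;
    return n;
}
```

A caller would install on_signal with sigaction, call the helper, and check got_signal whenever the helper returns -1 with errno set to EINTR.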

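poll
int poll(struct pollfd fdarray[], nfds_t nfds, int timeout);

struct pollfd {
    int   fd;       /* file descriptor to check */
    short events;   /* events of interest on fd */
    short revents;  /* events that occurred on fd, set by the kernel */
};

The fdarray[] array lists the descriptors we are interested in, nfds is the number of elements in the array, and timeout is how long to wait, in milliseconds.

Description of events and revents:
In events we tell the kernel which events we care about for each descriptor (for example, POLLIN for readable data or POLLOUT for writable). On return, the kernel sets revents to the events that actually occurred.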
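
A minimal sketch of filling in the pollfd array and inspecting revents follows. The helper name first_readable and the MAX_WATCHED limit are assumptions made to keep the example small.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <poll.h>

#define MAX_WATCHED 16            /* arbitrary limit for this sketch */

/* Hypothetical helper: wait up to timeout_ms for any of `count`
 * descriptors to become readable. Returns the first readable
 * descriptor, or -1 on timeout, error, or bad count. */
int first_readable(const int *fds, int count, int timeout_ms)
{
    struct pollfd pfds[MAX_WATCHED];
    int i;

    if (count < 1 || count > MAX_WATCHED)
        return -1;
    for (i = 0; i < count; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;   /* we are interested in readable data */
        pfds[i].revents = 0;       /* filled in by the kernel */
    }
    if (poll(pfds, (nfds_t)count, timeout_ms) <= 0)
        return -1;                 /* 0: timed out, -1: error */
    for (i = 0; i < count; i++)
        if (pfds[i].revents & POLLIN)
            return pfds[i].fd;
    return -1;                     /* only other events (e.g. POLLHUP) occurred */
}
```

Unlike select, poll does not modify events, so the same array can be reused across calls; only revents changes.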

Asynchronous I/O overview:

Asynchronous I/O: a process can start multiple I/O operations without blocking or waiting for any of them to complete.
Synchronous I/O: the process must wait for each I/O operation to return before moving on to the next step.

AIO control block

POSIX asynchronous I/O describes each operation with an AIO control block (struct aiocb).
Each AIO control block describes a single I/O operation.

The aio_sigevent member describes how the process is notified when the asynchronous operation completes, by a signal or by a callback function. Its relevant fields:

    • sigev_notify selects the type of notification. There are 3 types:
      1. SIGEV_NONE: the process is not notified when the asynchronous I/O completes.
      2. SIGEV_SIGNAL: a signal is generated when the operation completes; the signal number is given by sigev_signo.
      3. SIGEV_THREAD: when the operation completes, the function given by sigev_notify_function is called, with sigev_value passed as its sigval argument. The function runs in a separate thread.

AIO API
int aio_read(struct aiocb *aiocb);
int aio_write(struct aiocb *aiocb);

These are the AIO read and write functions. When they return successfully, the request has been queued, and the operating system carries out the actual I/O from that point on.

int aio_fsync(int op, struct aiocb *aiocb);

This function forces pending asynchronous writes out to disk.
The op argument:
O_DSYNC: behaves like fdatasync
O_SYNC: behaves like fsync

int aio_error(const struct aiocb *aiocb);

This function retrieves the completion status of an aio_read, aio_write, or aio_fsync operation.
Return value:
0: the asynchronous operation completed successfully
-1: the call to aio_error itself failed, and errno tells us why
EINPROGRESS: the aio_read, aio_write, or aio_fsync operation is still pending
Any other value: the error code describing why the asynchronous operation failed

ssize_t aio_return(const struct aiocb *aiocb);

Once aio_error reports that the operation has succeeded, we can call this function to get the operation's return value.
Calling it before the asynchronous operation has completed gives undefined results.
Note that we may call aio_return only once for each asynchronous I/O operation. Once we have called it, the operating system is free to release the associated resources (that is, the record holding the return value).
If aio_return itself fails, it returns -1. Otherwise it returns the result of the asynchronous operation: for a read, the number of bytes read; for a write, the number of bytes written.

int aio_suspend(const struct aiocb *const list[],
                int nent,
                const struct timespec *timeout);

aio_suspend: when we have finished our other work and only want to wait for outstanding asynchronous I/O, we can call this function to block until an operation completes.
The function returns in one of 4 cases:
1. If aio_suspend is interrupted by a signal, it returns -1 and sets errno to EINTR.
2. If the timeout expires before any of the I/O operations completes, it returns -1 and sets errno to EAGAIN.
3. If all of the operations we are interested in have already completed when we call the function, it returns without blocking.
4. If an I/O operation completes, it returns 0.
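
Putting aio_read, aio_error, aio_suspend, and aio_return together, one asynchronous read can be sketched as below. This is an illustration under assumptions: the helper name aio_read_wait is invented here, notification is disabled with SIGEV_NONE, and on older glibc versions the program may need to be linked with -lrt.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: start an asynchronous read of up to nbytes from
 * fd at `offset`, then wait for it with aio_suspend.
 * Returns the number of bytes read, or -1 with errno set. */
ssize_t aio_read_wait(int fd, void *buf, size_t nbytes, off_t offset)
{
    struct aiocb cb;
    const struct aiocb *const list[1] = { &cb };
    ssize_t n;
    int err;

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = nbytes;
    cb.aio_offset = offset;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;   /* no signal, no callback */

    if (aio_read(&cb) == -1)                     /* queue the request */
        return -1;

    /* Other work could be done here. We simply wait for completion,
     * retrying if a signal interrupts aio_suspend. */
    while ((err = aio_error(&cb)) == EINPROGRESS) {
        if (aio_suspend(list, 1, NULL) == -1 && errno != EINTR)
            return -1;
    }
    if (err == -1)
        return -1;                               /* aio_error itself failed */

    n = aio_return(&cb);                         /* call exactly once */
    if (err != 0) {
        errno = err;                             /* the operation failed */
        return -1;
    }
    return n;
}
```

Waiting immediately after queuing the request defeats the purpose of asynchronous I/O; the helper does so only to keep the example short.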

int aio_cancel(int fd, struct aiocb *aiocb);

This function attempts to cancel asynchronous I/O operations on fd, but cancellation is not guaranteed. If aiocb is NULL, it attempts to cancel all outstanding asynchronous I/O operations on fd.
Return value:
AIO_ALLDONE: all of the operations had already completed before the cancellation attempt.
AIO_CANCELED: all of the requested operations have been canceled.
AIO_NOTCANCELED: at least one of the requested operations could not be canceled.
-1: the call to aio_cancel failed, and the error code is stored in errno.

int lio_listio(int mode,
               struct aiocb *restrict const list[restrict],
               int nent,
               struct sigevent *restrict sigev);

mode: determines whether the I/O is truly asynchronous. If it is LIO_WAIT, lio_listio does not return until all the I/O operations specified by the list have completed.
If it is LIO_NOWAIT, the function returns as soon as all the I/O requests are queued. The process is then notified as specified by sigev after all the operations in the list have completed. Note that each AIO control block also has its own aio_sigevent, so the process can additionally be notified as each individual operation completes.

The aio_lio_opcode field of each control block gives the type of that I/O operation:
LIO_READ: a read; the control block is handled as if it had been passed to aio_read.
LIO_WRITE: a write; the control block is handled as if it had been passed to aio_write.
LIO_NOP: a no-op; the entry is ignored.

readv/writev
ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

struct iovec {
    void  *iov_base;
    size_t iov_len;
};

readv and writev let us read into and write from multiple noncontiguous buffers in a single function call.
iov_base: the starting address of the buffer
iov_len: the size of the buffer

Description:
writev gathers the output data from iov[0] through iov[iovcnt-1], in that order, and writes it out as one operation.
readv scatters the data it reads into the buffers in order, always filling iov[i-1] completely before putting any data into iov[i].
readv returns 0 if there is no more data and end of file has been reached.
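
The gather and scatter behavior can be sketched with two small wrappers. The names write_two and read_two are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: write two strings with a single writev call. */
ssize_t write_two(int fd, const char *head, const char *body)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)head;   /* writev only reads from the buffers */
    iov[0].iov_len = strlen(head);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = strlen(body);
    return writev(fd, iov, 2);        /* gathers iov[0], then iov[1] */
}

/* Hypothetical helper: read into two buffers with a single readv call.
 * Buffer a is filled completely before any byte goes into b. */
ssize_t read_two(int fd, void *a, size_t alen, void *b, size_t blen)
{
    struct iovec iov[2];

    iov[0].iov_base = a;
    iov[0].iov_len = alen;
    iov[1].iov_base = b;
    iov[1].iov_len = blen;
    return readv(fd, iov, 2);
}
```

A typical use is writing a fixed-size header followed by a payload without first copying both into one buffer.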

readn/writen

Reads and writes on pipes, FIFOs, and network connections have these properties:
1. read may return fewer bytes than we asked for, even though end of file has not been reached.
2. write may also return fewer bytes than we asked it to write.
Neither case is an error; we simply call read or write again to transfer the remaining bytes.

For disk files this normally does not happen, and a short count there usually means that something has gone wrong.

ssize_t readn(int fd, void *buf, size_t nbytes);
ssize_t writen(int fd, void *buf, size_t nbytes);

readn and writen call read and write as many times as necessary.
We tell them how many bytes we want to read or write, and they do not return until that many bytes have been transferred (or an error or end of file is encountered).
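
These two functions are not part of any standard, so we have to write them ourselves. The following is one possible sketch, not the book's exact code: it additionally retries when a call is interrupted by a signal, and writen takes a const pointer, which is a small deviation from the prototype above.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <unistd.h>

/* Read nbytes unless end of file or an error intervenes.
 * Returns the number of bytes read (short only at EOF or after an
 * error), or -1 if an error occurred before anything was read. */
ssize_t readn(int fd, void *buf, size_t nbytes)
{
    char  *p = buf;
    size_t left = nbytes;

    while (left > 0) {
        ssize_t n = read(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted: try again */
            if (left == nbytes)
                return -1;           /* error, nothing transferred */
            break;                   /* error, report bytes so far */
        }
        if (n == 0)
            break;                   /* end of file */
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)(nbytes - left);
}

/* Write nbytes unless an error intervenes. Same return convention. */
ssize_t writen(int fd, const void *buf, size_t nbytes)
{
    const char *p = buf;
    size_t left = nbytes;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted: try again */
            if (left == nbytes)
                return -1;           /* error, nothing transferred */
            break;                   /* error, report bytes so far */
        }
        if (n == 0)
            break;                   /* no progress: avoid looping forever */
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)(nbytes - left);
}
```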

Overview of memory-mapped I/O:

Memory-mapped I/O maps a file into a region of memory, so that loads from and stores to that region are equivalent to reading and writing the file. This lets us perform I/O without calling read or write.

mmap
void *mmap(void *addr, size_t len, int prot, int flag, int fd, off_t off);

Description

    • addr: the address where we would like the mapping to start. NULL lets the system choose the starting address.
    • prot: the protection of the mapped region. There are 4 values: PROT_READ, PROT_WRITE, PROT_EXEC, and PROT_NONE.
    • fd: the file descriptor of the file to map. The file must be opened before it can be mapped.
    • len: the number of bytes to map.
    • off: the offset, measured from the start of the file, at which the mapped bytes begin.
    • flag:
      MAP_FIXED: the return value must equal addr. Using this flag is not recommended.
      MAP_SHARED: stores into the mapped region modify the file itself.
      MAP_PRIVATE: stores into the mapped region create a private copy of the mapped file. All further operations on the region go to this copy, and the original file is not changed.

Attention:

    • On Linux, memory is mapped in units of pages. If the file is smaller than one page, a whole page is still mapped; the bytes beyond the end of the file are set to 0, and changes to them are not reflected in the file. This means that we cannot increase the size of a file through mmap.
    • Two signals are associated with mapped regions:
      SIGSEGV: normally means that we have accessed memory that is not available to us. It is also generated if we try to store into a region that was mapped read-only.
      SIGBUS: generated if we access a part of the mapped region that is no longer valid. For example, suppose we map a file using its full length, and before we reference the region another process truncates the file. If we then access the part of the region that corresponds to the truncated portion, we receive SIGBUS.

A child inherits the memory-mapped regions of its parent across fork, but the new program does not inherit them across exec.

mprotect
int mprotect(void *addr, size_t len, int prot);

mprotect changes the protection (prot) of an existing mapped region.
Note that addr may need to be an integral multiple of the page size.

msync
int msync(void *addr, size_t len, int flags);

msync writes the modified pages of a mapped region back to the file.
The flags argument controls how the memory is flushed:
MS_ASYNC: we only want the write to be scheduled; the call returns without waiting for it (an asynchronous, or deferred, write).
MS_SYNC: synchronous write-back; the call does not return until the data has actually been written.

munmap
int munmap(void *addr, size_t len);

munmap removes the memory mapping.
munmap does not affect the file that was mapped: calling munmap does not by itself write the contents of the mapped region back to the disk file.
For a MAP_SHARED region, the disk file is updated by the kernel at some point after we store into the region, or when we call msync.
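
The mmap, msync, and munmap calls described above fit together as in the following sketch, which overwrites the beginning of a file through a shared mapping. The helper name patch_file_start is an assumption made for this example, and the descriptor is assumed to be open for both reading and writing.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Hypothetical helper: overwrite the first len bytes of the file behind
 * fd through a MAP_SHARED mapping. The file must already be at least
 * len bytes long, because a mapping cannot grow the file.
 * Returns 0 on success, -1 on error. */
int patch_file_start(int fd, const void *data, size_t len)
{
    struct stat st;
    char *p;
    int rc;

    if (len == 0 || fstat(fd, &st) == -1 || (size_t)st.st_size < len)
        return -1;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, data, len);              /* a store into the mapping */
    rc = msync(p, len, MS_SYNC);       /* wait until it is written back */
    if (munmap(p, len) == -1)
        rc = -1;
    return rc;
}
```

With MAP_PRIVATE in place of MAP_SHARED, the memcpy would change only a private copy and the file would stay as it was.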

Attention:
    • Memory-mapped I/O does less work than read/write.
      read copies data from the kernel's buffer into the application's buffer, and write then copies the data from the application's buffer back into a kernel buffer.
      With memory-mapped I/O, the file's data in the kernel is mapped directly into our address space. To copy a file, for example, we map both the source and the destination file and copy the bytes from one mapped region to the other.
    • Memory-mapped I/O can be more efficient for copying regular files, but it has limitations: it cannot be used with certain devices, such as network devices, and we must watch out for changes in the size of the underlying file.
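
The file copy mentioned above can be sketched as follows. This is a simplified illustration: the helper name mmap_copy is an assumption, the whole file is mapped at once (a real program would copy very large files in chunks), and the destination descriptor is assumed to be open for reading and writing.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX.1-2008 declarations */
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy the regular file behind src to dst using
 * two mappings and memcpy. Returns 0 on success, -1 on error. */
int mmap_copy(int src, int dst)
{
    struct stat st;
    size_t len;
    void *in, *out;

    if (fstat(src, &st) == -1)
        return -1;
    len = (size_t)st.st_size;

    /* The output file must have its final size before it is mapped,
     * because storing into the mapping cannot extend the file. */
    if (ftruncate(dst, st.st_size) == -1)
        return -1;
    if (len == 0)
        return 0;                      /* nothing to map or copy */

    in = mmap(NULL, len, PROT_READ, MAP_SHARED, src, 0);
    if (in == MAP_FAILED)
        return -1;
    out = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, dst, 0);
    if (out == MAP_FAILED) {
        munmap(in, len);
        return -1;
    }

    memcpy(out, in, len);              /* the copy: no read or write calls */

    munmap(in, len);
    munmap(out, len);
    return 0;
}
```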
