Synchronizing execution and writing of executable files in Linux (the locking mechanism behind file read and write operations)

Source: Internet
Author: User
When an executable file is open for writing, it cannot be executed. Conversely, while a file is being executed, it cannot be opened for writing. The constraint is easy to understand: executing a file and writing to it must be mutually exclusive, and the kernel enforces this. So how does the kernel implement the mechanism?
The inode contains a field called i_writecount. As the name suggests, it records how many times the file is currently open for writing. Because it is used for synchronization, its type is atomic_t. Two kernel functions related to write access need to be understood:


int get_write_access(struct inode *inode)
{
    spin_lock(&inode->i_lock);
    if (atomic_read(&inode->i_writecount) < 0) { /* write access is denied, e.g. the file is executing */
        spin_unlock(&inode->i_lock);
        return -ETXTBSY;
    }
    atomic_inc(&inode->i_writecount);
    spin_unlock(&inode->i_lock);
    return 0;
}

int deny_write_access(struct file *file)
{
    struct inode *inode = file->f_path.dentry->d_inode;
    spin_lock(&inode->i_lock);
    if (atomic_read(&inode->i_writecount) > 0) { /* the file is open for writing: fail */
        spin_unlock(&inode->i_lock);
        return -ETXTBSY;
    }
    atomic_dec(&inode->i_writecount);
    spin_unlock(&inode->i_lock);
    return 0;
}

Both functions are simple, and each does what its name says: get_write_access requests write access, and deny_write_access blocks it. i_writecount is positive while the file has writers and negative while write access is denied. To make sure a file cannot be written while it is running, deny_write_access should be called to block write access before execution begins. Let's check whether the execve system call does so.
sys_execve calls do_execve, which in turn calls open_exec. Here is the open_exec code:


struct file *open_exec(const char *name)
{
    struct file *file;
    int err;
    file = do_filp_open(AT_FDCWD, name,
                        O_LARGEFILE | O_RDONLY | FMODE_EXEC, 0,
                        MAY_EXEC | MAY_OPEN);

    if (IS_ERR(file))
        goto out;
    err = -EACCES;

    if (!S_ISREG(file->f_path.dentry->d_inode->i_mode))
        goto exit;

    if (file->f_path.mnt->mnt_flags & MNT_NOEXEC)
        goto exit;

    fsnotify_open(file->f_path.dentry);
    err = deny_write_access(file); /* the call we were looking for */
    if (err)
        goto exit;

out:
    return file;

exit:
    fput(file);
    return ERR_PTR(err);
}

As expected, open_exec calls deny_write_access. Correspondingly, the open path should contain a call to get_write_access, and indeed the __dentry_open function used by open includes one:


if (f->f_mode & FMODE_WRITE) {
    error = __get_file_write_access(inode, mnt);
    if (error)
        goto cleanup_file;
    if (!special_file(inode->i_mode))
        file_take_write(f);
}

Here __get_file_write_access(inode, mnt) wraps get_write_access.
So how does the kernel guarantee that a file being written cannot be executed? This is also simple: once a file has been opened for writing, the i_writecount of its inode becomes positive (1 for a single writer). When execve is then called on the file, deny_write_access reads i_writecount > 0 and returns an error, so execve fails.
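The rule that a file open for writing cannot be executed can be probed from user space. The following is a minimal sketch and not part of the original article: the function name exec_while_open_for_write and the /tmp path template are made up for illustration. It keeps a freshly created file open for writing and asks execve to run it, then reports the errno. On kernels that implement the check described above, ETXTBSY would be the expected result; a /tmp mounted noexec would give EACCES instead.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, keep it open for writing, and try to execve() it.
 * Returns the errno reported by execve(), or -1 if the setup itself failed.
 * The file is empty, so execve() cannot succeed and replace this process.
 */
int exec_while_open_for_write(void)
{
    char path[] = "/tmp/writecount-demo-XXXXXX"; /* hypothetical template */
    char *const argv[] = { path, NULL };
    char *const envp[] = { NULL };
    int fd, err;

    fd = mkstemp(path);            /* opened read-write: a writer now exists */
    if (fd < 0)
        return -1;
    if (fchmod(fd, 0700) < 0) {    /* make it executable for the owner */
        close(fd);
        unlink(path);
        return -1;
    }

    execve(path, argv, envp);      /* returns only on failure */
    err = errno;

    close(fd);
    unlink(path);
    return err;
}
```

A caller can pass the returned value to strerror to print the message; for ETXTBSY that message is "Text file busy".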
Here is how i_writecount is handled on the write path. When a file is opened for writing, the function dentry_open does the following:

if (f->f_mode & FMODE_WRITE) {
    error = get_write_access(inode);
    if (error)
        goto cleanup_file;
}

When the file is closed, i_writecount is decremented again. The code executed on close is:

if (file->f_mode & FMODE_WRITE)
    put_write_access(inode);

The put_write_access code is simple:

static inline void put_write_access(struct inode *inode)
{
    atomic_dec(&inode->i_writecount);
}
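The life cycle of the counter described so far can be modeled in user space. This is only an illustrative sketch under stated assumptions: struct model_inode and the model_* functions are hypothetical names, a compare-and-swap loop stands in for the kernel's spinlock, and model_allow_write_access mirrors the kernel's allow_write_access helper, which the article does not show.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for the i_writecount field of an inode. */
struct model_inode {
    atomic_int writecount; /* > 0: open writers, < 0: write access denied */
};

/* Writer side: fails with -ETXTBSY while write access is denied. */
int model_get_write_access(struct model_inode *inode)
{
    int old = atomic_load(&inode->writecount);
    do {
        if (old < 0)
            return -ETXTBSY;
    } while (!atomic_compare_exchange_weak(&inode->writecount, &old, old + 1));
    return 0;
}

/* Writer side: called when a file opened for writing is closed. */
void model_put_write_access(struct model_inode *inode)
{
    atomic_fetch_sub(&inode->writecount, 1);
}

/* Exec side: fails with -ETXTBSY while any writer exists. */
int model_deny_write_access(struct model_inode *inode)
{
    int old = atomic_load(&inode->writecount);
    do {
        if (old > 0)
            return -ETXTBSY;
    } while (!atomic_compare_exchange_weak(&inode->writecount, &old, old - 1));
    return 0;
}

/* Exec side: undoes one model_deny_write_access call. */
void model_allow_write_access(struct model_inode *inode)
{
    atomic_fetch_add(&inode->writecount, 1);
}
```

The sign of the counter carries the whole protocol: writers push it above zero, executors push it below zero, and each side refuses to proceed when it finds the counter on the other side of zero.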

To test this, I wrote a simple program containing an empty loop. While it was running, I ran echo >> on the executable from bash. As expected, the command failed with the message "Text file busy".
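The same experiment can be done from C instead of bash. The helper below is a hedged sketch; the name write_open_errno is invented here and does not come from the article. It tries to open a path write-only, without truncating or creating it, and reports the resulting errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open `path` write-only (no O_CREAT, no O_TRUNC).
 * Returns 0 if the open succeeded, otherwise the errno set by open().
 */
int write_open_errno(const char *path)
{
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

Calling write_open_errno("/proc/self/exe") from a running program requests write access to that program's own executable. On kernels with the check described above, and assuming the caller has write permission on the file, the result would be expected to be ETXTBSY.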
Does the mechanism also apply to memory mappings? When an executable runs, its dynamic-link libraries are mapped with mmap. Are these libraries protected from writes once they are mapped, and can they be mapped while they are being written? The question matters for security: library files are executable code too, so tampering with them can also cause security problems.
The mmap_region function, which mmap calls, contains a related check:


if (vm_flags & VM_DENYWRITE) {
    error = deny_write_access(file);
    if (error)
        goto free_vma;
    correct_wcount = 1;
}

The flags argument of the mmap call is translated into vm_flags, so setting MAP_DENYWRITE should set VM_DENYWRITE as well. Here is a simple program to test this:

#include <stdio.h>
#include <sys/mman.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd;
    void *src = NULL;

    /* Map test.txt with MAP_DENYWRITE, hoping to block later writers. */
    fd = open("test.txt", O_RDONLY);
    if (fd != -1)
    {
        if ((src = mmap(0, 5, PROT_READ | PROT_EXEC, MAP_PRIVATE | MAP_DENYWRITE, fd, 0)) == MAP_FAILED)
        {
            printf("mmap error\n");
            printf("%s\n", strerror(errno));
        } else {
            printf("%p\n", src);
        }
    }

    /* Now try to open the same file for writing. */
    FILE *fd_t = fopen("test.txt", "w");
    if (!fd_t)
    {
        printf("open for write error\n");
        printf("%s\n", strerror(errno));
        return 1;
    }

    if (fwrite("0000", sizeof(char), 4, fd_t) != 4)
    {
        printf("fwrite error\n");
    }

    fclose(fd_t);
    if (fd != -1)
        close(fd);
    return 0;
}

In the end, "0000" was written to test.txt. Strangely, MAP_DENYWRITE seemed to have no effect. So I checked man mmap and found:

MAP_DENYWRITE

This flag is ignored. (Long ago, it signaled that attempts to write to the underlying file should fail with ETXTBUSY. But this was a source of denial-of-service attacks.)

So this flag no longer has any effect from user space, and the man page also explains why: it made denial-of-service attacks easy. An attacker could maliciously map, with MAP_DENYWRITE, a file that some system program needs to write, causing that program's writes to fail. VM_DENYWRITE is still used inside the kernel, and mmap still contains a call to deny_write_access, but that call can no longer be triggered by the flags argument of mmap.
This is bad news for the dynamic-link libraries that executables depend on: they are loaded with mmap as well, so they can be modified at run time. That is exactly what I wanted to confirm. It also means I have to write the synchronization control myself. The i_security field of the inode and the f_security field of the file structure can be used to implement custom synchronization logic, but that is a lot of trouble and requires writing a kernel module, so the workload has grown. Security is a troublesome problem...
