Linux -- function hijacking -- Based on LD_PRELOAD

Source: Internet
Author: User

Recently I ran into a problem: how to tell whether a bug lies in a library function or in the application itself. To solve it, we need some background on shared libraries and Linux basics. Dynamic libraries are loaded into memory when the program runs. This brings several benefits, and it also gives us a chance to step in, for example by replacing a function that is about to be loaded with our own. This may sound like a very specialized trick, at least to me, but thanks to some experts there is already a mature way to reach our goal.
1). How to replace a function of a shared library?
The answer is LD_PRELOAD, an environment variable read by the GNU dynamic linker. It names shared libraries that are loaded before all others, which means that functions in these libraries take priority over the same functions in the normal libraries.

Normally we use this technique simply to intercept some functions, so that we can do something else (something evil?) on top of the original function.
2). How can I get the original function?
With these functions: dlopen(), dlsym(), dlclose() and dlerror(). Plenty of articles already explain them.
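
As a minimal sketch of how these four calls fit together, the helper below loads the math library at run time and calls its cos() through a pointer obtained from dlsym(). The helper name cos_via_dlopen and the library name "libm.so.6" (the usual glibc soname) are assumptions made for illustration; they do not come from the original article.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load libm at run time and call its cos().
 * Returns 0 on success and stores the result in *out, -1 on failure. */
int cos_via_dlopen(double x, double *out)
{
    double (*cos_fn)(double);
    void *handle;

    /* "libm.so.6" is an assumption: the usual soname on glibc systems. */
    handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror();                                   /* clear any stale error */
    *(void **)(&cos_fn) = dlsym(handle, "cos");  /* look up the symbol */
    if (cos_fn == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    *out = cos_fn(x);   /* call the function through the pointer */
    dlclose(handle);    /* drop our reference to the library */
    return 0;
}
```

On older glibc versions this needs to be linked with -ldl; newer versions provide these functions in libc itself.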

In the next section, I translate a nice article. The original article is here:

http://fluxius.handgrep.se/2011/10/31/the-magic-of-ld_preload-for-userland-rootkits/

The contents of this article are as follows:
1. Shared Library
2. Simple LD_PRELOAD Application
2.1. Shared libraries created and used
2.2. dlsym ()
2.3. Restrictions
3. Related Concealment Technologies
3.1 Jynx-Kit
3.2 Check Methods

The following is the text:

1. Shared Library
As we know, dynamic libraries are linked when the program is loaded. On my computer this is done by ld-linux-x86-64.so.X; on other architectures it may be ld-linux.so.X. If you are interested, you can verify this:

        fluxiux@handgrep:~$ readelf -l /bin/ls
        [...]
          INTERP         0x0000000000000248 0x0000000000400248 0x0000000000400248
                         0x000000000000001c 0x000000000000001c  R      1
              [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
        [...]

(PS: readelf is a tool for reading ELF files. A dynamically linked program file normally names the dynamic loader it needs, and finding that name involves parsing the ELF format. In the next article, we will try a simple analysis of the ELF format.)
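
To make the INTERP entry less abstract, here is a rough sketch that reads the same information directly from the file: it walks the program headers of a 64-bit ELF file and copies the PT_INTERP string. The function name read_interp and its return convention are hypothetical; the structures and constants come from <elf.h>. It assumes a 64-bit ELF file with the host's byte order.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the PT_INTERP path of a 64-bit ELF file into buf.
 * Returns 0 if found, 1 if the file has no PT_INTERP segment, -1 on error. */
int read_interp(const char *path, char *buf, size_t len)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    int result = 1;
    FILE *f;

    if (len == 0 || (f = fopen(path, "rb")) == NULL)
        return -1;

    /* Read and validate the ELF header. */
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64) {
        fclose(f);
        return -1;
    }

    /* Walk the program header table looking for PT_INTERP. */
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            result = -1;
            break;
        }
        if (ph.p_type != PT_INTERP)
            continue;

        /* The segment holds the loader path as a NUL-terminated string. */
        size_t n = ph.p_filesz < len - 1 ? (size_t)ph.p_filesz : len - 1;
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0 ||
            fread(buf, 1, n, f) != n) {
            result = -1;
        } else {
            buf[n] = '\0';
            result = 0;
        }
        break;
    }

    fclose(f);
    return result;
}
```

Called on a dynamically linked program, it should report the same path that readelf prints in its "Requesting program interpreter" line; a statically linked program would normally give the "no PT_INTERP" result.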

A dynamically linked program is much smaller than a statically linked one. For library functions, it keeps only a reference to the relevant library instead of containing the function bodies. To see which libraries a program depends on, run the "ldd" command. For example:

        fluxiux@handgrep:~$ ldd /bin/ls
            linux-vdso.so.1 =>  (0x00007fff0bb9a000)
            libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f7842edc000)
            librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f7842cd4000)
            libacl.so.1 => /lib/x86_64-linux-gnu/libacl.so.1 (0x00007f7842acb000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7842737000)
            libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f7842533000)
            /lib64/ld-linux-x86-64.so.2 (0x00007f7843121000)
            libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7842314000)
            libattr.so.1 => /lib/x86_64-linux-gnu/libattr.so.1 (0x00007f784210f000)

Let's look at an example (the example program is as follows)

        #include <stdio.h>

        int main(void)
        {
                /* the string that the later examples intercept */
                printf("huhu la charrue");
                return 0;
        }

Compile it both dynamically and statically:

        fluxiux@handgrep:~$ gcc toto.c -o toto-dyn
        fluxiux@handgrep:~$ gcc -static toto.c -o toto-stat
        fluxiux@handgrep:~$ ls -l | grep "toto-"
        -rwxr-xr-x  1 fluxiux fluxiux     8426 2011-10-28 23:21 toto-dyn
        -rwxr-xr-x  1 fluxiux fluxiux   804327 2011-10-28 23:21 toto-stat

We can see that "toto-stat" is about 95 times the size of "toto-dyn" (804327 bytes versus 8426 bytes). Why?

        fluxiux@handgrep:~$ ldd toto-stat
            is not a dynamic executable

(Because "toto-stat" is statically linked, it contains the library code itself.)
Dynamic linking is a great mechanism that gives us many benefits, such as:
■ Update libraries and still support programs that want to use older, non-backward-compatible versions of those libraries,
■ Override specific libraries or even specific functions in a library when executing a special program,
■ Do all this while programs are running using existing libraries.
Shared libraries follow a naming convention. A library's "soname" is built from the prefix "lib", the library name, the suffix ".so", and usually a version number, as in "libc.so.6".
(PS: this is only a convention, but if you do not follow it, your library may not be found correctly.)
Now let's take a look at LD_PRELOAD.

2. Simple LD_PRELOAD Application

We know that library files are generally stored in the "/lib" directory. So if you want to modify a library, the obvious way is to get its source code, change it, and recompile it. Besides this, there is another cool method: the external interface that Linux provides to us, LD_PRELOAD.

2.1. Shared libraries created and used
If you want to change the behavior of "printf", first write your own "printf" function:

        #define _GNU_SOURCE
        #include <stdio.h>

        int printf(const char *format, ...)
        {
              exit(153);   /* <stdlib.h> is not included, so the compiler warns about exit */
        }

Compile it into a shared library, like this:

        fluxiux@handgrep:~$ gcc -Wall -fPIC -c -o my_printf.o my_printf.c
        my_printf.c: In function ‘printf’:
        my_printf.c:6:2: warning: implicit declaration of function ‘exit’
        my_printf.c:6:2: warning: incompatible implicit declaration of built-in function ‘exit’
        fluxiux@handgrep:~$ gcc -shared -fPIC -Wl,-soname -Wl,libmy_printf.so -o libmy_printf.so  my_printf.o

Then we set the environment variable and run our test program:

        fluxiux@handgrep:~$ export LD_PRELOAD=$PWD/libmy_printf.so
        fluxiux@handgrep:~$ ./toto-dyn

You will see that the behavior of "printf" has changed: it no longer prints "huhu la charrue". Let's see what ltrace says.

        fluxiux@handgrep:~$ ltrace ./toto-dyn
        __libc_start_main(0x4015f4, 1, 0x7fffa88d0908, 0x402530, 0x4025c0 <unfinished ...>
        printf("huhu la charrue" <unfinished ...>
        +++ exited (status 153) +++

Funny: our "printf" was called instead of the system's "printf". Now a new problem: what if we only want to modify the behavior of "printf" without losing the original function? Rewrite the entire function?! That is obviously unreasonable. To solve this, look at the following functions.

2.2. dlsym ()
There are several interesting functions in the "libdl" library:
dlopen(): load a library
dlsym(): get a pointer to a specific symbol
dlclose(): unload a library


Here, because the libraries have already been loaded at program startup, we only need to call "dlsym" directly. We pass "RTLD_NEXT" to "dlsym" to obtain a pointer to the original "printf" function, like this:

        [...]
        typeof(printf) *old_printf;
        [...]
        //DO HERE SOMETHING VERY EVIL
        old_printf = dlsym(RTLD_NEXT, "printf");
        [...]

Then we need some special handling for the format string and its variable arguments (functions with ordinary fixed parameters do not need this extra work). After that, we can call the original function directly.

        #define _GNU_SOURCE        /* needed for RTLD_NEXT and vasprintf */
        #include <stdio.h>
        #include <dlfcn.h>
        #include <stdlib.h>
        #include <stdarg.h>

        int printf(const char *format, ...)
        {
                  va_list list;
                  char *parg;
                  typeof(printf) *old_printf;

                  // format variable arguments
                  va_start(list, format);
                  vasprintf(&parg, format, list);
                  va_end(list);

                  //DO HERE SOMETHING VERY EVIL

                  // get a pointer to the function "printf"
                  old_printf = dlsym(RTLD_NEXT, "printf");
                  (*old_printf)("%s", parg); // and we call the function with previous arguments

                  free(parg);
                  // nothing is returned here, hence the compiler warning about
                  // reaching the end of a non-void function
        }

Compile again

        fluxiux@handgrep:~$ gcc -Wall -fPIC -c -o my_printf.o my_printf.c
        my_printf.c: In function ‘printf’:
        my_printf.c:21:1: warning: control reaches end of non-void function
        fluxiux@handgrep:~$ gcc -shared -fPIC -Wl,-soname -Wl,libmy_printf.so -ldl -o libmy_printf.so my_printf.o

Try again

        fluxiux@handgrep:~$ export LD_PRELOAD=$PWD/libmy_printf.so
        fluxiux@handgrep:~$ ./toto-dyn
        huhu la charrue

In this way, something the caller of "printf" never asked for can happen quietly. However, this mechanism also has some limitations.
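
For comparison, here is a variant sketch of such a wrapper that also passes the return value back to the caller. It is not the article's code: instead of formatting into a temporary string, it assumes that forwarding the va_list to libc's vprintf (looked up with RTLD_NEXT) is acceptable, and the name real_vprintf is made up for this example.

```c
#define _GNU_SOURCE   /* glibc needs this for RTLD_NEXT */
#include <dlfcn.h>
#include <stdarg.h>
#include <stdio.h>

/* Replacement printf that forwards to the next vprintf in search order. */
int printf(const char *format, ...)
{
    /* Resolved once and cached; RTLD_NEXT skips the object containing
     * this function and finds the next definition (normally libc's). */
    static int (*real_vprintf)(const char *, va_list);
    va_list args;
    int written;

    if (real_vprintf == NULL) {
        real_vprintf = (int (*)(const char *, va_list))
                       dlsym(RTLD_NEXT, "vprintf");
        if (real_vprintf == NULL)
            return -1;   /* lookup failed: report an error to the caller */
    }

    /* any extra behaviour would go here */

    va_start(args, format);
    written = real_vprintf(format, args);   /* original formatting and output */
    va_end(args);

    return written;   /* same contract as the real printf */
}
```

Forwarding the va_list avoids the intermediate allocation, and returning the count keeps callers that check the result of "printf" working as before.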

2.3. Restrictions
Although this method is cool, it has some restrictions. For example, it has no effect on statically linked programs, because they do not resolve functions from dynamic libraries at run time. In addition, if the SUID or SGID bit of the executable is set, the loader runs in secure-execution mode and ignores ordinary LD_PRELOAD entries (a security measure taken by the ld developers).
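
As a small sketch of the second restriction, a process can ask the kernel whether it was started in secure-execution mode, which is the condition under which the loader treats LD_PRELOAD restrictively. The helper name below is hypothetical; getauxval() and AT_SECURE are provided by glibc through <sys/auxv.h>.

```c
#include <sys/auxv.h>

/* Hypothetical helper: returns 1 if this process runs in secure-execution
 * mode (for example a set-user-ID or set-group-ID program), 0 otherwise. */
int in_secure_execution_mode(void)
{
    /* AT_SECURE is nonzero when the kernel flagged the exec as "secure". */
    return getauxval(AT_SECURE) != 0;
}
```

An ordinary program started by a normal user should see 0 here, which is why the "toto-dyn" experiments above were affected by the preloaded library.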

3. Related Concealment Technologies

3.1 Jynx-Kit

About two weeks ago, a new hiding tool was introduced: Jynx-Kit. It comes with an automated shell script and was not detected by rkhunter or chkrootkit. Let's take a look at the actual code. In "ld_poison.c", 14 functions are hijacked:

        [...]
            old_fxstat = dlsym(RTLD_NEXT, "__fxstat");
            old_fxstat64 = dlsym(RTLD_NEXT, "__fxstat64");
            old_lxstat = dlsym(RTLD_NEXT, "__lxstat");
            old_lxstat64 = dlsym(RTLD_NEXT, "__lxstat64");
            old_open = dlsym(RTLD_NEXT,"open");
            old_rmdir = dlsym(RTLD_NEXT,"rmdir");
            old_unlink = dlsym(RTLD_NEXT,"unlink");
            old_unlinkat = dlsym(RTLD_NEXT,"unlinkat");
            old_xstat = dlsym(RTLD_NEXT, "__xstat");
            old_xstat64 = dlsym(RTLD_NEXT, "__xstat64");
            old_fdopendir = dlsym(RTLD_NEXT, "fdopendir");
            old_opendir = dlsym(RTLD_NEXT, "opendir");
            old_readdir = dlsym(RTLD_NEXT, "readdir");
            old_readdir64 = dlsym(RTLD_NEXT, "readdir64");
        [...]

Looking at the hooked "open" function, we can see that "__xstat" is called internally first:

        [...]
            struct stat s_fstat;
        [...]
            old_xstat(_STAT_VER, pathname, &s_fstat);
        [...]

Next comes a check on the file's group ID, its path, and its name (including whether it is the "ld.so.preload" file, which we also want to hide). If the file is one we want to hide, the hook tells the caller that it does not exist.

        [...]
            if(s_fstat.st_gid == MAGIC_GID || (strstr(pathname, MAGIC_DIR) != NULL) || (strstr(pathname, CONFIG_FILE) != NULL)) {
                errno = ENOENT;
                return -1;
            }
        [...]


After handling all the functions above in this way, an attacker can hide files and activity from users. But is there any way to detect it? Read on.

3.2 Check Methods
However impressive this hiding method may look, it does defeat rkhunter and chkrootkit. Both tools rely on signature-based checks, which is not the best approach.

Let's look at an example.
First, with LD_PRELOAD unset, we generate a checksum file for a given program:

        fluxiux@handgrep:~$ sha1sum toto-dyn
        a659c72ea5d29c9a6406f88f0ad2c1a5729b4cfa  toto-dyn
        fluxiux@handgrep:~$ sha1sum toto-dyn > toto-dyn.sha1

Then, with LD_PRELOAD set, we verify the file's checksum:

        fluxiux@handgrep:~$ export LD_PRELOAD=$PWD/libmy_printf.so
        fluxiux@handgrep:~$ sha1sum -c toto-dyn.sha1
        toto-dyn: OK

(The file appears to be verified.)
But is this true?

Obviously not. We never modified the program file itself, so its checksum stays the same no matter what is preloaded. An anti-rootkit tool that relies on checksums therefore cannot detect this. Other techniques include checking for suspicious files and signatures, detecting port binding, and so on, but they fail too because this hiding method is very flexible. Jynx, for instance, uses a kind of port knocking to open a remote shell on the host.
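
Since the file on disk never changes, one heuristic is to look at the running process instead. The sketch below asks the dynamic linker which loaded object currently provides a given symbol. The helper name object_providing is hypothetical; dlsym(), dladdr() and RTLD_DEFAULT are glibc interfaces. This is only an illustration of the idea, not a method taken from the original article.

```c
#define _GNU_SOURCE   /* glibc needs this for dladdr and RTLD_DEFAULT */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: return the path of the loaded object that currently
 * provides `symbol`, or NULL if the symbol cannot be resolved. */
const char *object_providing(const char *symbol)
{
    Dl_info info;
    /* RTLD_DEFAULT searches in the normal global lookup order,
     * so preloaded libraries are consulted first. */
    void *addr = dlsym(RTLD_DEFAULT, symbol);

    if (addr == NULL || dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;   /* valid while that object stays loaded */
}
```

On a clean system a symbol such as "open" would be expected to come from the C library; if the reported path points somewhere else, such as a library in a home directory, that is worth a closer look.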


So what else can we do? Check every library listed in LD_PRELOAD or in "/etc/ld.so.preload". We also know that "dlsym" is often used to find the original function, so its presence is a hint:

        $ strace ./bin/ls
        [...]
        open("/home/fluxiux/blabla/Jynx-Kit/ld_poison.so", O_RDONLY) = 3
        read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\n\0\0\0\0\0\0"..., 832) = 832
        fstat(3, {st_mode=S_IFREG|0755, st_size=17641, ...}) = 0
        mmap(NULL, 2109656, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f5e1a586000
        mprotect(0x7f5e1a589000, 2093056, PROT_NONE) = 0
        mmap(0x7f5e1a788000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f5e1a788000
        close(3)
        [...]
        open("/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY) = 3
        [...]

Disassembling "ld_poison.so" reveals many substituted functions, each of which is a place that may have been tampered with. Looking at the strings in such a binary can also give interesting hints. Of course, if they have been cleverly obfuscated, this will not work. But then, does a normal binary need to hide its own strings?

        fluxiux@handgrep:~/blabla/Jynx-Kit$ strings ld_poison.so
        [...]
        libdl.so.2
        [...]
        dlsym
        fstat
        [...]
        lstat hooked.
        ld.so.preload
        xochi <-- sounds familiar
        [...]
        /proc/%s <-- hmmm... strange!
        [...]

Jynx-Kit shows that signature-based checks are not a realistic defense against this kind of hiding. Heuristic checks give better results.
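
As one example of a simple heuristic, the sketch below inspects the two preload sources discussed above: the LD_PRELOAD variable and "/etc/ld.so.preload". Both function names are hypothetical. The parser assumes the documented ld.so behaviour that list entries are separated by spaces or colons.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Count the entries in an LD_PRELOAD-style list (space or colon separated). */
int count_preload_entries(const char *value)
{
    int count = 0;

    if (value == NULL)
        return 0;
    while (*value != '\0') {
        value += strspn(value, " :");          /* skip separators */
        size_t len = strcspn(value, " :");     /* length of one entry */
        if (len > 0)
            count++;
        value += len;
    }
    return count;
}

/* Print what this process can see of the two preload sources. */
void report_preload_sources(void)
{
    char line[4096];
    FILE *f = fopen("/etc/ld.so.preload", "r");

    printf("LD_PRELOAD entries: %d\n",
           count_preload_entries(getenv("LD_PRELOAD")));
    if (f == NULL) {
        printf("/etc/ld.so.preload: absent or not readable\n");
        return;
    }
    while (fgets(line, sizeof line, f) != NULL) {
        line[strcspn(line, "\n")] = '\0';      /* strip the newline */
        printf("/etc/ld.so.preload: %s\n", line);
    }
    fclose(f);
}
```

A checker like this can itself be fooled when it runs with hooked "open" or "stat" functions, so it is safer to build it statically; as noted in the restrictions above, statically linked programs are not affected by LD_PRELOAD.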

Original article: http://fluxius.handgrep.se/2011/10/31/the-magic-of-ld_preload-for-userland-rootkits/
