LD_PRELOAD is probably one of the most amusing features of Linux operating
systems. It is the starting point of dynamic instrumentation, reverse
engineering madness and every fun userland rootkit. The problem is that it is
fairly easy to detect, spoiling the fun for everyone. This article is just a
schizophrenic discussion on trying to detect LD_PRELOAD and implementing
countermeasures.
I hope you are already familiar with LD_PRELOAD; if not, go read one of the
many tutorials on the subject. I will only remind you that there are only two
ways to register a library to be preloaded by ld.so:
setting the LD_PRELOAD environment variable to our library path
writing the library path in the /etc/ld.so.preload file
The first one has the advantage of being accessible to any user, but is only
effective on processes you launch in that environment, meaning it will not affect
other users. The second one has the advantage of being loaded into every process
on your system but requires root access (on correctly configured machines).
Detecting LD_PRELOAD for dummies
Checking the value of the LD_PRELOAD environment variable and checking for the
presence of /etc/ld.so.preload are the most common but also the most obvious
detection techniques out there. Barely any code is needed, as you can see in the
example below.
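
The two checks described above can be sketched in a few lines. This is only a minimal illustration; the function names ld_preload_env_set() and ld_so_preload_exists() are mine and are not taken from the original listing.

```c
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if LD_PRELOAD is set to a non-empty value, 0 otherwise. */
int ld_preload_env_set(void)
{
    const char *value = getenv("LD_PRELOAD");

    return value != NULL && value[0] != '\0';
}

/* Returns 1 if /etc/ld.so.preload exists, 0 otherwise. */
int ld_so_preload_exists(void)
{
    return access("/etc/ld.so.preload", F_OK) == 0;
}
```

A program would simply call both functions at startup and bail out (or complain loudly) if either returns 1.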
Of course if we can hook any shared library functions using LD_PRELOAD, there
is nothing preventing our preloaded library from hooking the functions used
above and returning the “correct” values. Below is an example of such hooks.
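
As a rough sketch of what such hooks can look like, the library below overrides getenv() and access() and forwards everything it does not care about to the real libc functions through dlsym(RTLD_NEXT, ...). It is an illustration under my own assumptions (glibc, only these two functions, the HIDDEN_FILE macro is a name I made up), not the original code.

```c
#define _GNU_SOURCE            /* must come first: exposes RTLD_NEXT */
#include <dlfcn.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define HIDDEN_FILE "/etc/ld.so.preload"

/* Hooked getenv(): pretend LD_PRELOAD is not set, forward everything else. */
char *getenv(const char *name)
{
    static char *(*real_getenv)(const char *);

    if (real_getenv == NULL)
        real_getenv = (char *(*)(const char *))dlsym(RTLD_NEXT, "getenv");
    if (strcmp(name, "LD_PRELOAD") == 0)
        return NULL;
    return real_getenv(name);
}

/* Hooked access(): pretend the preload file does not exist. */
int access(const char *path, int mode)
{
    static int (*real_access)(const char *, int);

    if (real_access == NULL)
        real_access = (int (*)(const char *, int))dlsym(RTLD_NEXT, "access");
    if (strcmp(path, HIDDEN_FILE) == 0) {
        errno = ENOENT;
        return -1;
    }
    return real_access(path, mode);
}
```

Built as a shared object (something like gcc -shared -fPIC, plus -ldl on older glibc versions) and preloaded, this is enough to fool the naive checks.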
It should be noted that many more functions need to be hooked in order to hide
/etc/ld.so.preload. Some have direct effects, like readdir(), stat() and
open(). Some have indirect effects, like unlink() or rename(), where
checking errno can indicate whether the file does not exist (ENOENT) or whether
we do not have write permission (EACCES), in which case it does exist.
Here is where many people stop their detection and anti-detection attempts,
but for the fun of it, let’s go further, much further.
It’s the last time I call you
Ok, sure, we can intercept calls to any shared library including the libc, but
what if there was a way to check the environment variables without calling any
functions? Indeed, we can check the environment variables by reading the actual
piece of memory holding them, environ.
Because it is a simple and direct memory access, there is no way to intercept
it. But … once the library is loaded into the program we do not really need
that variable anymore. So how could we unset LD_PRELOAD as soon as our library
is loaded? Well through the init function of our library. All we have to do is
write an init() function and send the -init flag to the linker.
Inside the init() function there is not much to do: we save the value of
LD_PRELOAD then remove it from the environment. This means that, as soon as our
library is loaded, LD_PRELOAD will disappear from the environment and the
program will never get a chance to catch it, because it will not execute
any instruction before that. Unfortunately unsetenv() is not very effective
at removing a variable: it only drops the pointer from the environment array,
so the value itself stays in memory and can still be found in
/proc/self/environ and by running the set command. This is why it is
reimplemented in the example below.
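
Here is a minimal sketch of such an init function and of a more thorough unsetenv(). Two things are my own choices and not necessarily what the original code does: I use the GCC/Clang constructor attribute instead of the -init linker flag, and the names scrub_env(), preload_init() and saved_preload are made up. The wipe assumes the environment strings are writable, which holds for the strings the kernel puts on the initial stack.

```c
#include <stdlib.h>
#include <string.h>

extern char **environ;

static char *saved_preload;   /* private copy, to be restored before exec */

/* Remove every "name=value" entry from environ and wipe its bytes, so the
 * value does not linger in memory the way it does after unsetenv(). */
void scrub_env(const char *name)
{
    size_t len = strlen(name);
    char **dst = environ;

    if (environ == NULL)
        return;
    for (char **src = environ; *src != NULL; src++) {
        if (strncmp(*src, name, len) == 0 && (*src)[len] == '=') {
            memset(*src, 0, strlen(*src));   /* assumes a writable string */
            continue;                        /* drop the entry */
        }
        *dst++ = *src;
    }
    *dst = NULL;
}

/* Runs when the library is loaded, before main() of the host program. */
__attribute__((constructor))
static void preload_init(void)
{
    const char *value = getenv("LD_PRELOAD");

    if (value != NULL) {
        saved_preload = strdup(value);
        scrub_env("LD_PRELOAD");
    }
}
```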
Now, if the program forks and loads another binary, our library will not be
preloaded anymore because we removed it from the environment. So we need to
restore that variable before the call to exec(). exec() is a whole family
of functions, but they all end up calling execve(). execve() takes the
environment as an array, which means we need to create a new modified
environment array to inject our LD_PRELOAD variable. Now the library will be
loaded because LD_PRELOAD is set right before the call. LD_PRELOAD is unset
right after the call for the parent process and through the init function for
the child process, which means it completely disappears for both processes
before they execute any instruction.
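
The execve() hook can be sketched as follows. Instead of setting and unsetting the variable around the call, this version builds a patched copy of the environment array, so the parent's own environment is never touched. The helper env_with_preload() and the hard-coded saved_preload value are hypothetical; in a real library the value would be the one saved by the init function.

```c
#define _GNU_SOURCE            /* must come first: exposes RTLD_NEXT */
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical: the value saved by the init function before scrubbing. */
static const char *saved_preload = "/tmp/libdemo.so";

/* Return a malloc'd copy of envp where LD_PRELOAD is set to `value`.
 * Existing LD_PRELOAD entries are dropped. Returns NULL on failure. */
char **env_with_preload(char *const envp[], const char *value)
{
    static const char key[] = "LD_PRELOAD=";
    size_t count = 0, out = 0;

    while (envp != NULL && envp[count] != NULL)
        count++;

    char **copy = malloc((count + 2) * sizeof *copy);
    char *entry = malloc(sizeof key + strlen(value));
    if (copy == NULL || entry == NULL) {
        free(copy);
        free(entry);
        return NULL;
    }
    for (size_t i = 0; i < count; i++)
        if (strncmp(envp[i], key, sizeof key - 1) != 0)
            copy[out++] = envp[i];
    strcpy(entry, key);
    strcat(entry, value);
    copy[out++] = entry;
    copy[out] = NULL;
    return copy;
}

/* Hooked execve(): re-inject LD_PRELOAD just for the new program. */
int execve(const char *path, char *const argv[], char *const envp[])
{
    static int (*real_execve)(const char *, char *const[], char *const[]);

    if (real_execve == NULL)
        real_execve = (int (*)(const char *, char *const[], char *const[]))
                      dlsym(RTLD_NEXT, "execve");

    char **patched = env_with_preload(envp, saved_preload);
    int ret = real_execve(path, argv,
                          patched != NULL ? (char *const *)patched : envp);

    /* Only reached if execve() failed: release the patched copy. */
    if (patched != NULL) {
        size_t n = 0;
        while (patched[n] != NULL)
            n++;
        free(patched[n - 1]);   /* the LD_PRELOAD entry we allocated */
        free(patched);
    }
    return ret;
}
```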
You should note that many other functions from the exec family need to be
hooked. Also, the code here replaces entirely the value of LD_PRELOAD, but to be
exact you should append your library to the variable if it is already set and
only remove your library from the variable instead of unsetting it. A program
could set LD_PRELOAD with a canary value and watch it disappear after a fork,
confirming that some weird LD_PRELOAD magic is going on.
(Un)fortunately the environment variable and the ld.so.preload file are not the
only ways of detecting a preloaded library. Another way, which is a bit more
complex to implement, is to read the memory maps of our program and detect the
presence of the memory allocated to the preloaded library. As everything is a
file under Linux, these maps are exposed in /proc/self/maps.
Here is the normal process map of cat:
And here is the process map of preloaded cat:
As you can see, right between the ld.so memory and the libc.so memory, our
library has been loaded. An easy way to detect this is to check whether there is
anything other than an anonymous map (the one without a name and with a bunch of
zeroes) between the libc.so memory and the ld.so memory.
Of course this suffers from the same problem as before, except that this time
we cannot pretend the file does not exist: we have to present fake memory
maps to the process. Like before we hook the open function (I used fopen()
this time; technically you should also hook open(), open64(), openat64(),
freopen(), etc.), but now we create a temporary file where we copy the true
memory maps without the lines related to our preloaded library.
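
A minimal sketch of this fopen() hook is shown below. The library name in HIDDEN_LIB and the helper filtered_copy() are hypothetical, and the filter is deliberately crude: it drops every line that mentions the library name and does nothing else.

```c
#define _GNU_SOURCE            /* must come first: exposes RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

#define HIDDEN_LIB "libdemo.so"   /* hypothetical name of our library */

/* Copy `in` to a temporary file, skipping the lines that contain `needle`.
 * Returns the temporary file rewound to its start, or NULL on failure. */
FILE *filtered_copy(FILE *in, const char *needle)
{
    char line[512];
    FILE *out = tmpfile();

    if (out == NULL)
        return NULL;
    while (fgets(line, sizeof line, in) != NULL)
        if (strstr(line, needle) == NULL)
            fputs(line, out);
    rewind(out);
    return out;
}

/* Hooked fopen(): hand out a filtered copy of /proc/self/maps. */
FILE *fopen(const char *path, const char *mode)
{
    static FILE *(*real_fopen)(const char *, const char *);

    if (real_fopen == NULL)
        real_fopen = (FILE *(*)(const char *, const char *))
                     dlsym(RTLD_NEXT, "fopen");

    FILE *real = real_fopen(path, mode);
    if (real == NULL || strcmp(path, "/proc/self/maps") != 0)
        return real;

    FILE *fake = filtered_copy(real, HIDDEN_LIB);
    if (fake == NULL)
        return real;           /* could not build the fake, give up */
    fclose(real);
    return fake;
}
```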
Now if you look at the resulting fake memory maps, you can see there are still
some inconsistencies. For example the allocated memory blocks are not
contiguous anymore. With a more complete memory maps parser this can be easily
fixed; the example from above is only a proof of concept.
The Kernel Whisperer
I have now exhausted the standard techniques that I know of and it is time for
assembly and kernel tricks. I hope you are familiar with both.
As you might know, kernel functions like open() or fork() are actually
invoked through a mechanism known as a syscall: the syscall number is put in
eax (or rax under x86_64), the arguments in the other registers (see man
syscall) and then the “int 0x80” instruction (or “syscall” under x86_64) is
executed. This causes a processor interrupt (or, for “syscall”, a dedicated
fast entry into kernel mode) which is caught by the kernel, in charge of
granting our wishes. The standard C library and system libraries are merely
wrappers around this mechanism, providing a C API for basic OS functions, and
those wrappers are what we have been hooking so far.
So if we directly use syscalls to call kernel functions we are bypassing the
entire hooking process. Let’s reimplement the ld.so.preload file detection and
memory maps detection but with syscalls this time.
Now you might be thinking it is over, that there is no way you can do anything
against syscalls with LD_PRELOAD, the only way is to implement kernelspace
hooking. Well … not really.
We might be asking the kernel to execute a system function for us, but he
never said he was going to do it. More specifically, there are two ways of
modifying syscall behavior under Linux:
SECCOMP which allows restricting the syscalls a process can make. It is meant
to be used to sandbox processes but it is a little bit trickier when said
process is not aware there should be a sandbox in the first place.
Ptrace which is used to debug processes and allows stopping the process
before and after each syscall.
So the idea would be to ptrace the process, stop it before each syscall and,
if it is an open syscall, redirect the control flow to a hook function.
The first problem is to ptrace ourselves. We cannot directly ptrace ourselves
because it makes no sense: a debugger cannot debug itself. But if we fork a
child process, that new process can continue the normal program execution while
its parent debugs it. The only detail is that the child calls sleep() first, to
give the parent process enough time to attach, just in case the kernel
schedules the child before the parent.
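
The fork-and-trace arrangement can be sketched like this. Note that it is a variant, not the scheme described above: instead of having the parent attach while the child sleeps, the child asks to be traced with PTRACE_TRACEME and stops itself with SIGSTOP, which avoids the scheduling race. The parent only counts the syscalls the child enters; that is the point where a hook would be dispatched. The function names are hypothetical and signal forwarding is ignored for brevity.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Example child body: getppid() always results in a real syscall. */
void demo_child(void)
{
    (void)getppid();
}

/* Fork, run child_body() in the child under ptrace, and return the number
 * of syscalls the child entered, or -1 on error. */
long trace_child_syscalls(void (*child_body)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);          /* wait here until the parent is ready */
        child_body();            /* "the normal program execution" */
        _exit(0);
    }

    /* Parent: wait for the initial stop, then tag syscall stops with 0x80. */
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    ptrace(PTRACE_SETOPTIONS, pid, NULL, (void *)(long)PTRACE_O_TRACESYSGOOD);

    long entered = 0;
    int at_entry = 1;            /* syscall stops alternate entry / exit */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, NULL) < 0 ||
            waitpid(pid, &status, 0) < 0) {
            kill(pid, SIGKILL);
            return -1;
        }
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return entered;
        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            if (at_entry)
                entered++;       /* a hook would be dispatched here */
            at_entry = !at_entry;
        }
    }
}
```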
The second problem is to redirect the control flow to our hook function.
Fortunately, under x86_64, the syscall argument convention is nearly the same
as the amd64 GNU ABI: the first arguments are placed in RDI, RSI and RDX (the
two conventions only diverge at the fourth argument, R10 for syscalls versus
RCX for function calls, which open() does not need). We just need to emulate a
call through the ptrace interface by pushing the return address on the stack
and changing the instruction pointer to the first instruction of the hook
function.
Unfortunately the x86 gnu ABI is different. It requires the arguments to be
placed on the stack but the syscall convention uses registers. Again, we can
emulate this using the ptrace interface to push those registers onto the stack.
The real problem is when we return from our hook function, because we need to
clear those arguments from the stack. The solution is implemented as inline
assembly inside the hook function. This assembly “simply” moves the stack 12
bytes up (12 bytes is the arguments size) before returning.
The last problem is to avoid hooking the syscalls made by our hook function. A
simple variable checked by the hooking code is used there, the only trick
being that the variable needs to be copied from the parent process to the
child using ptrace.
Below is the full implementation of the open syscall emulation.
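The big advantage of this approach is that we don’t need to hook all the
variants of open() or fopen() because they all use the same syscall (except
openat(), but you should be able to figure out how to patch it). Note that
recent glibc versions implement open() itself on top of the openat syscall, so
that one is definitely worth patching too.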
The Endless Game
Of course now our program can try to detect if it is being ptraced, but since
we can hook any syscall we want, this can also be countered. Another problem is
all the side effects created by our tricks (e.g. if you read the /etc directory
ld.so.preload is still there, and our fake memory maps have address gaps).
Two other detection mechanisms are also worth mentioning:
The LD_DEBUG and LD_TRACE_LOADED_OBJECTS environment variables which can make
ld.so output debug information about the libraries being loaded. The same
trick used in noenviron_preload.c can be used to remove those variables when
execve() is called.
The lsof program can list open file descriptors, including the one used for
our preloaded shared library. It finds that information in the /proc/self/fd/
directory. Simply hiding that file descriptor is enough to make it disappear.
Assuming skills and knowledge are not the limiting factor, the
winning side will always be the one that can adapt and compile last.
Many thanks to @doegox and hastake (he is hard to
find, rumor has it that he is hiding from Vatican Secret Service) for
sparking and correcting some of the ideas in here.