The exploit for a get password program modifies the return address to change
the control flow of the program (in this case, to circumvent the password
protection logic). This technique, which is known as arc injection (sometimes
referred to as return-into-libc), involves transferring control to code that
already exists in the program's memory space. Arc injection refers to how these
exploits insert a new arc (control-flow transfer) into the program's
control-flow graph as opposed to injecting code. More sophisticated attacks are
possible using this technique, including installing the address of an existing
function (such as system() or exec(), which can be used to execute commands and
other programs already on the local system) on the stack along with the
appropriate arguments. When the return address is popped off the stack (by the
ret or iret instruction in IA-32), control is "returned" to an
attacker-specified function. By invoking functions like system() or exec(), an
attacker could easily create a shell on the compromised machine with the
permissions of the compromised program.
Worse yet, an attacker can use arc injection to invoke multiple functions in
sequence with arguments that are also supplied by the attacker. An attacker can
now install and run the equivalent of a small program that includes chained
functions, increasing the severity of these attacks.
A program that is vulnerable to a buffer overflow is shown below.
User-supplied data in user_input is copied to the buff character array on line 4
using memcpy(). A buffer overflow can result if user_input is larger than the
buff buffer.
1. #include <string.h>
2. int get_buff(char *user_input){
3.   char buff[4];
4.   memcpy(buff, user_input, strlen(user_input) + 1);
5.   return 0;
6. }
7. int main(int argc, char *argv[]){
8.   get_buff(argv[1]);
9.   return 0;
10. }
Figure 1 (a) shows the contents of the stack before execution of the get_buff()
function. The stack consists of the local variable buff, followed by the frame
pointer (ebp) and return address (eip) for main(). Below this is the actual
stack frame for main() (which is referenced by the stored frame pointer).
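The layout described above can be modeled with a small C structure. This is only
a sketch under stated assumptions: the name get_buff_frame_model and its fields
are hypothetical, the saved registers are assumed to be 32 bits wide as on
IA-32, and a real compiler may insert padding, stack canaries, or other locals
between buff and the saved values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the get_buff() frame, lowest address first.
 * Assumption: 32-bit saved registers and no padding or canaries. */
struct get_buff_frame_model {
    char     buff[4];     /* local variable */
    uint32_t saved_ebp;   /* stored frame pointer for main() */
    uint32_t return_eip;  /* stored return address into main() */
};

/* Bytes a copy into buff writes before it reaches the stored frame pointer. */
size_t bytes_before_saved_ebp(void)
{
    return offsetof(struct get_buff_frame_model, saved_ebp);
}

/* Bytes a copy into buff writes before it reaches the stored return address. */
size_t bytes_before_return_address(void)
{
    return offsetof(struct get_buff_frame_model, return_eip);
}
```

In this model, any copy of more than 4 bytes into buff starts to overwrite the
stored frame pointer, and the 9th through 12th bytes land on the return address.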
Figure 1 (b) shows the contents of the stack after an attacker has overflowed
buff to overwrite the contents of the stack. This portion of the stack has been
completely overwritten by the overflow.
An attacker may be able to place data in the actual buffer, but in this
example we'll assume that the buffer is overwritten with fill characters.
The frame pointer for main() has been overwritten with a frame pointer for
Frame 2. This entire frame has been manufactured by the attacker as part of the
exploit. When the exploited function (get_buff()) returns, it executes one of
two equivalent forms of the frame pointer-based return sequences shown in
Figure 1 (c). Regardless of which form is used, the frame pointer (now pointing
to Frame 2) is moved into the stack pointer. Control is returned to the address
on the stack, which has been overwritten with the address of an arbitrary
function f(). This function is called and passed the arguments installed on the
stack. The attacker must provide the appropriate number and type of arguments
assumed by the invoked function. In Figure 1 (b), we assume that the function
accepts a pointer to a string (as system() does). Because the actual contents
of the string also need to be provided, the string is placed on the stack after
the actual arguments to the function.
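[Figure 1: (a) stack before the overflow; (b) stack after the overflow; (c) frame pointer-based return sequences]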
When f() returns, it pops the stored eip off the stack and transfers control to
this address. In this case, the eip has been overwritten with the address of the
return sequence shown in Figure 1 (c). This sequence is usually the instructions
generated for the return to the exploited function, but it can appear anywhere
in the code segment for the process. The return sequence assigns the frame
pointer (now pointing to Frame 3) to the stack pointer and returns control to
the next arbitrary function to be called (in this case, g()).
An attacker can repeat this sequence as required to invoke a sequence of
functions to accomplish an exploit. The attacker could also reproduce the
original frame contents on the stack to return control to main() after the
exploit has executed.
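The frame-by-frame walk described above can be imitated safely in ordinary C,
without touching the real stack. The following sketch is a simulation under
stated assumptions, not exploit code: struct fake_frame, run_chain(),
demo_chain(), record_call(), and the stand-in functions f() and g() are
hypothetical names, an array index plays the role of the stored frame pointer,
and a function pointer plays the role of the overwritten return address.

```c
#include <stdio.h>
#include <string.h>

#define END_OF_CHAIN ((size_t)-1)

typedef void (*target_fn)(const char *arg, char *trace, size_t trace_size);

/* One attacker-manufactured frame in the model. */
struct fake_frame {
    size_t      next_frame; /* stands in for the stored frame pointer */
    target_fn   target;     /* stands in for the stored return address */
    const char *arg;        /* argument installed for the target */
};

/* Append "name(arg);" to the trace so the call order can be inspected. */
static void record_call(char *trace, size_t trace_size,
                        const char *name, const char *arg)
{
    size_t used = strlen(trace);
    if (used < trace_size)
        snprintf(trace + used, trace_size - used, "%s(%s);", name, arg);
}

static void f(const char *arg, char *trace, size_t trace_size)
{
    record_call(trace, trace_size, "f", arg);
}

static void g(const char *arg, char *trace, size_t trace_size)
{
    record_call(trace, trace_size, "g", arg);
}

/* Imitates the repeated return sequence: load the next frame pointer,
 * then transfer control to the address stored in the current frame. */
void run_chain(const struct fake_frame *frames, size_t first,
               char *trace, size_t trace_size)
{
    size_t fp = first;
    while (fp != END_OF_CHAIN) {
        const struct fake_frame *frame = &frames[fp];
        fp = frame->next_frame;
        frame->target(frame->arg, trace, trace_size);
    }
}

/* Builds two chained frames (Frame 2 and Frame 3 in the text) and runs them. */
void demo_chain(char *trace, size_t trace_size)
{
    const struct fake_frame frames[] = {
        { 1,            f, "first"  },  /* Frame 2: invokes f() */
        { END_OF_CHAIN, g, "second" }   /* Frame 3: invokes g(), then stops */
    };
    if (trace_size == 0)
        return;
    trace[0] = '\0';
    run_chain(frames, 0, trace, trace_size);
}
```

Calling demo_chain() with a 64-byte buffer leaves a trace in which f() runs
before g(), each with the argument taken from its own frame. Extending the
frames array extends the chain, which mirrors how an attacker repeats the
sequence.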
An attacker may prefer arc injection over code injection for several reasons.
Because arc injection uses code already in memory on the target system, the
attacker merely needs to provide the addresses of the functions and arguments
for a successful attack. The footprint for this type of attack can be
significantly smaller and may be used to exploit vulnerabilities that cannot be
exploited by the code injection technique. Arc injection is a data-based attack
that cannot be defeated by making memory segments (such as the stack)
nonexecutable.
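Because a nonexecutable stack does not stop this attack, one option is to
prevent the overflow in get_buff() itself. The following is a minimal sketch,
assuming that rejecting oversized input is acceptable behavior; the names
get_buff_checked and BUFF_SIZE are hypothetical and do not appear in the
original program.

```c
#include <string.h>

#define BUFF_SIZE 4

/* Bounds-checked variant of get_buff().
 * Returns 0 on success, -1 if user_input is NULL or does not fit in buff. */
int get_buff_checked(const char *user_input)
{
    char buff[BUFF_SIZE];
    size_t len;

    if (user_input == NULL)
        return -1;

    len = strlen(user_input) + 1;   /* include the terminating null byte */
    if (len > sizeof(buff))         /* sizeof(buff) is the array size, 4 */
        return -1;

    memcpy(buff, user_input, len);
    return 0;
}
```

With this check, a three-character string such as "abc" is copied, while "abcd"
is rejected because its terminating null byte would not fit.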
Chaining function calls together allows for more powerful attacks. A
security-conscious programmer, for example, might follow the principle of least
privilege and drop privileges when not required. By chaining multiple function
calls together, an exploit could regain privileges, for example, by calling
setuid() before calling system().
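Whether the chained setuid() call can succeed depends on how the privileges
were dropped. The following is a hedged sketch, assuming a POSIX system and a
set-user-ID program; the function names are hypothetical, and supplementary
group handling is omitted for brevity.

```c
#include <sys/types.h>
#include <unistd.h>

/* Temporary drop: only the effective user ID changes. The saved
 * set-user-ID still holds the privileged ID, so a later setuid() or
 * seteuid() call (including one reached by a chained return) can
 * restore it. Returns 0 on success, -1 on failure. */
int drop_privileges_temporarily(void)
{
    return seteuid(getuid());
}

/* Permanent drop: when called while the effective user ID is privileged,
 * setuid() sets the real, effective, and saved IDs together.
 * Returns 0 on success, -1 on failure. */
int drop_privileges_permanently(void)
{
    uid_t real = getuid();

    if (setuid(real) != 0)
        return -1;

    /* Verify the drop: for a non-root real user, regaining root must fail. */
    if (real != 0 && setuid(0) == 0)
        return -1;

    return 0;
}
```

After a temporary drop, the attack described above remains possible. After a
successful permanent drop, a chained setuid(0) call is expected to fail, so the
subsequent system() call runs without the elevated privileges.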