Wednesday, November 2, 2016

HEAP OVERFLOW - Overwriting the Heap with the vulnerable function strcpy


- The Heap is a memory area used to store variables allocated dynamically at runtime. In the C language the function malloc() is used to allocate memory on the heap, and that memory area is then accessed through the pointer returned by malloc().

- For example: char *c = malloc(40); allocates 40 bytes on the heap during program execution and stores the address of that block in the pointer c.

- The layout of the Heap is decided by the memory allocator of the C runtime library rather than by the programmer, so heap overflows are more difficult to exploit and replicate than other overflow attacks. In addition, the result depends heavily on the compiler used, as well as on the installed libraries.

- The following example uses the CodeBlocks 13.2 environment for C programming with the GNU GCC compiler under the Windows operating system. Testing the program in another environment or with another compiler would probably give a different result since, as has been said, dynamic memory allocation at runtime depends heavily on the environment, the compiler and the libraries used.
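- As a small illustration of the dynamic allocation described above, the following sketch allocates exactly the number of bytes a string needs and copies it to the heap. The helper name heap_copy is hypothetical and is not part of the original program.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of text, or NULL if malloc fails.
   The caller owns the result and must release it with free(). */
char *heap_copy(const char *text)
{
    size_t size = strlen(text) + 1;   /* +1 for the terminating '\0' */
    char *c = malloc(size);           /* c points into the heap */
    if (c == NULL)
        return NULL;                  /* allocation failed */
    memcpy(c, text, size);            /* copies the '\0' as well */
    return c;
}
```

- A call such as char *p = heap_copy("HOLA"); reserves 5 bytes (4 letters plus the terminator), and free(p) returns them to the allocator when they are no longer needed.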


- In this exercise a successful Heap Overflow attack is performed, taking advantage of the inherent vulnerability of the strcpy function of the C language.

- The C function strcpy copies the string pointed to by the source, including the terminating null character, into the array pointed to by the destination, and returns the destination pointer once the copy is done.

- Its prototype, declared in <string.h>, is: char *strcpy(char *dest, const char *src);

- strcpy is vulnerable to overflow because it does not check the size of the destination. If the source string is longer than the destination array, the copy runs past the end of the destination and overwrites adjacent memory locations. To avoid this problem, the destination array must be at least as large as the source string, including its terminating null character.
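- The size check that strcpy lacks can be sketched as follows. The helper name checked_copy and its return convention are assumptions made for illustration; this is one possible approach, not the only way to copy strings safely.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest only if it fits,
   including the terminating '\0'. Returns 0 on success, -1 otherwise. */
int checked_copy(char *dest, size_t dest_size, const char *src)
{
    size_t needed = strlen(src) + 1;  /* characters plus the terminator */
    if (needed > dest_size)
        return -1;                    /* would overflow: refuse to copy */
    memcpy(dest, src, needed);
    return 0;
}
```

- With a 5-byte destination, checked_copy accepts HOLA (5 bytes with its terminator) but rejects ADIOS, which needs 6 bytes. An unchecked strcpy would write that sixth byte past the end of the buffer.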

- The goal of the program is to show how a crafted argument can modify the expected output of the program by taking advantage of the vulnerability of the strcpy function, working specifically on the dynamically allocated memory area (heap).

- The program is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

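int main(int argc, char *argv[])
{
    char *segundo = malloc(5); // segundo is declared before primero
    char *primero = malloc(5);

    strcpy(primero, "HOLA"); // primero now holds HOLA
    printf("Saludo (%p): %s\n", primero, primero);

    strcpy(segundo, argv[1]); // no length check: this is the vulnerable copy
    printf("Saludo (%p): %s\n", primero, primero);

    return 0;
}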

- At first glance, the program should print the string HOLA twice, since strcpy copies HOLA into the buffer pointed to by primero and the same statement is executed twice: printf("Saludo (%p): %s\n",primero, primero);

- However, the intermediate call strcpy(segundo, argv[1]) can alter the normal operation of the program: depending on the length of argv[1], the copy can run past the end of segundo and overwrite the contents of the buffer pointed to by primero.

- To analyze the behavior of the previous program, both in normal operation (printing HOLA HOLA) and when it misbehaves (HOLA ADIOS), Immunity Debugger is used in a Windows environment.

- First, the program is run with the input argument ADIOS, to show that this argument by itself has no effect on the output of the program:

The call strcpy(segundo, argv[1]) copies the string ADIOS, entered as the parameter argv[1], into the heap buffer pointed to by segundo, at address 00580FB8:

- The string HOLA is stored at 00580FD8. The following memory dump shows the distance between ADIOS (segundo) and HOLA (primero).

- As explained later, to make the later overwrite possible it is essential that segundo is declared, and therefore allocated, before primero, so that segundo occupies a lower memory address than primero.
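- The relative position of the two buffers can also be checked without a debugger. The following sketch computes the signed byte distance between two addresses. The helper name byte_distance is hypothetical, and the conversion through uintptr_t assumes a flat address space, as on common desktop platforms.

```c
#include <stdint.h>

/* Hypothetical helper: signed distance in bytes from address a to address b.
   The arithmetic is done on integer values because subtracting pointers
   that belong to different allocations is undefined in C. */
intptr_t byte_distance(const void *a, const void *b)
{
    return (intptr_t)((uintptr_t)b - (uintptr_t)a);
}
```

- Called as byte_distance(segundo, primero), a positive result means that segundo lies below primero, which is the precondition for the overwrite. Another allocator may return a different value, or a negative one.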

- As discussed above, if the input argv[1] is ADIOS, the program executes normally and HOLA is not overwritten:

- Now, let's see what happens when the argv[1] argument is manipulated both in length and content.

- The program is run again, now with an input parameter consisting of 32 digits followed by the string ADIOS (the reason for entering exactly 32 digits is explained later):


The call strcpy(primero, "HOLA") copies the string HOLA to the memory address 003C1058:

- The memory dump shows that 48 4F 4C 41 (HOLA in hexadecimal) occupies the memory location 003C1058:

- Initially the input parameter argv[1] is located at 003C0EE9:

However, the call strcpy(segundo, argv[1]) copies the string "12345678901234567890123456789012ADIOS", previously entered as the argv[1] parameter, into the heap buffer pointed to by segundo:

- Thus, the input parameter argv[1] ends up stored starting at memory address 003C1038:

- What is the distance between 003C1038 and HOLA (003C1058)? It is exactly 32 bytes, because 0x003C1058 - 0x003C1038 = 0x20 = 32 in decimal.

- This is why exactly 32 digits must be entered in the input parameter before the string ADIOS in order to overwrite HOLA.
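- The arithmetic above can be turned into a small sketch that builds the argument automatically from the measured distance. The helper name build_payload is hypothetical; the filler simply repeats the digits 1234567890, as in the argument used in this exercise.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap string made of `padding` filler digits
   (1234567890 repeated) followed by `tail`. The caller must free() it. */
char *build_payload(size_t padding, const char *tail)
{
    size_t tail_len = strlen(tail);
    char *payload = malloc(padding + tail_len + 1);
    if (payload == NULL)
        return NULL;
    for (size_t i = 0; i < padding; i++)
        payload[i] = (char)('0' + (i + 1) % 10);    /* 1,2,...,9,0,1,... */
    memcpy(payload + padding, tail, tail_len + 1);  /* copies the '\0' too */
    return payload;
}
```

- With the addresses observed in this run, build_payload(0x003C1058 - 0x003C1038, "ADIOS") produces a 37-character string: 32 filler digits that cover the gap, followed by the 5 letters that land exactly where HOLA was stored.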

- In this way, the memory location 003C1058 where HOLA was stored is now overwritten with the string ADIOS:


- Running the program confirms the previous memory dump, as its output shows:

- Notice that the first printf("Saludo (%p): %s\n", primero, primero) is executed before the overwrite, with the output:

Saludo (003C1058): HOLA

- However, the second printf("Saludo (%p): %s\n", primero, primero) is executed after the overwrite, so the output is different:

Saludo (003C1058): ADIOS

- A remarkable aspect of this overflow is that it was achieved by calculating the exact location and size of the overwrite, so the program still finishes without an exit error.
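- Since the real heap layout changes from one environment to another, the mechanism can be reproduced portably with a simulation. The following sketch places both buffers inside a single array, 32 bytes apart as in the dump above, so that the overwrite is well-defined C. The names simulate_overflow, ARENA_SIZE and GAP are assumptions made for illustration, and the array only imitates the heap; it is not how the allocator actually works.

```c
#include <stdio.h>
#include <string.h>

#define ARENA_SIZE 64
#define GAP 32   /* assumed distance between the two buffers, as in the dump */

/* Simulates the layout inside one array: "segundo" starts at offset 0 and
   "primero" at offset GAP. Copies input into segundo the way strcpy would,
   then writes the string found at primero into out. */
void simulate_overflow(const char *input, char *out, size_t out_size)
{
    char arena[ARENA_SIZE] = {0};
    char *segundo = arena;
    char *primero = arena + GAP;

    strcpy(primero, "HOLA");

    /* Same effect as strcpy(segundo, input), but clamped to the arena
       so that the simulation itself never writes out of bounds. */
    size_t n = strlen(input);
    if (n > ARENA_SIZE - 1)
        n = ARENA_SIZE - 1;
    memcpy(segundo, input, n);
    segundo[n] = '\0';

    snprintf(out, out_size, "%s", primero);
}
```

- With the input ADIOS the simulated primero still reads HOLA, while with 12345678901234567890123456789012ADIOS it reads ADIOS, matching the two runs of the program. With only 31 filler digits the letters would land one byte early and primero would read DIOS, which illustrates why the exact distance matters.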