Wednesday, November 2, 2016

ALTERATION OF A PROGRAM AT RUNTIME - Example 1 - Bypassing authentication

ALTERATION OF A PROGRAM AT RUNTIME - Example 1 - OllyDbg with Windows - Bypassing authentication

- The goal of this exercise is to alter the execution flow of a program at runtime. 

- The program is written in C. In normal operation the user enters the password "PARIS" and the program shows the message "Password OK". If an invalid password is entered, the program answers "Password NOT OK".

- The alteration consists of bypassing the authentication so that any invalid password is accepted as valid by the program. There are many ways to achieve this.

- A simple solution consists of altering a single assembly instruction of the executable, accessed through the debugger OllyDbg.

- First, let's study the C code of the program.

- The following statement prints on the screen the string "Password:" inviting the user to enter the password:

printf ( "Password");

- The program reads the password string entered by the user, echoing it as the user types:

gets (password);

- The program compares the string "PARIS" (the valid password) with the string entered by the user:

if (strcmp (password, "PARIS") == 0)

- Then, depending on the result of the comparison, the program either prints the invalid password message ...

printf ("Password NOT OK\n");

... or calls the passwordOK() function ...

passwordOK ();

... whose code prints the valid password message on the screen:

printf ("Password OK\n");

- To solve the problem, an unconditional jump (JMP, in assembly) to the memory address where the passwordOK() routine begins can be introduced just after the user has entered the password and before the comparison with "PARIS" takes place.

- In this way the program never reaches the conditional jump (JNZ, which depends on the result of the comparison) that would send it to one place or the other.

- Let's examine the solution in assembly language using the debugger OllyDbg.

- Once the executable file is loaded in OllyDbg, we are placed at the entry point of the program by pressing the blue triangle on the toolbar.

- In the analysis column, the fragments of assembly language corresponding to the C program are easy to identify.

- In this first fragment the program reads the password as the user types it.

- See how the program calls the C function gets at memory address 00401378, using the instruction CALL <JMP.&msvcrt.gets>:

- Then, by calling the C function strcmp with the instruction CALL <JMP.&msvcrt.strcmp>, located at memory address 0040138C, the program compares the password entered by the user with the string "PARIS". Depending on the result, the conditional jump JNZ SHORT pac1.0040139F is taken or not, because this instruction depends on the zero flag (ZF). JNZ jumps when ZF = 0: if strcmp returns a nonzero value, the password entered by the user differs from "PARIS", the zero flag is cleared, the jump is taken, and the message "Password NOT OK" is printed on the screen.

- It is important to note that the destination address 0040139F of the conditional jump JNZ holds the routine that prints the message "Password NOT OK".

- Here is the entire code snippet just discussed:

- Now, to alter the program so that it displays the valid password message regardless of the input, the normal flow of control has to be interrupted by executing an unconditional jump to the passwordOK() function just before the comparison studied above can run.

- It is therefore important to note that the passwordOK() function begins at memory address 004013E6:

- To alter the program at runtime, right-click the instruction at 0040138C that calls strcmp (we do not want that instruction to run) and choose Assemble from the context menu:

- In the dialog, the unconditional jump instruction JMP SHORT 004013E6 is entered, which points to the passwordOK() routine beginning, as we saw, at memory address 004013E6. The jump is of SHORT type because the target is close, within the same code segment: a short jump can reach from 128 bytes before to 127 bytes after the instruction that follows it.

- OllyDbg inserts 3 NOP instructions (0x90 in hexadecimal) to fill the bytes left over after replacing one instruction with the other: the original CALL occupies 5 bytes and JMP SHORT only 2. This ensures a smooth execution of the program, since no stray bytes of the old instruction are left behind to be decoded as unintended instructions:

- The EIP register shows that the next instruction to be executed after the one at 0040138C (the unconditional JMP) is at 004013E6, where the passwordOK() routine starts. Thus, the successive values of EIP and the instructions executed are:

- Finally, entering the passwordOK () routine:

- Once the program is resumed after the previous change, entering different wrong passwords (such as "hello", "goodbye", or a string of nonsense characters) shows that the program accepts all of them as valid. The alteration has therefore achieved the desired objective of bypassing the authentication:

FORMAT STRING ATTACKS - Disclosure of information and DoS


- Format String vulnerabilities arise when data entered into a program is interpreted as a "format string". Format strings are the character sequences used by input and output functions to specify the conversion between a set of data and a string of characters.

- Thus, the vulnerability results from the program interpreting input data as instructions or commands of the language itself. The consequences of an attack include, for example, the execution of arbitrary code, reading or dumping the stack (disclosing protected information), and denial of service, all of which affect the security and stability of the system.

- Although C and C++ are especially prone to Format String attacks, other programming languages such as Perl, PHP, Java and Ruby are also vulnerable.

- The links below offer information about Format String attacks in languages other than C/C++:

- This exercise uses the C language because it is traditionally the one that has suffered the most Format String attacks.

- To explain the attack, the following concepts should be noted:

a) Format function: an ANSI C function that converts primitive variables of the language into a human-readable string representation. For example, printf, fprintf and sprintf are format functions in C. Such functions are called variadic and are characterized by accepting a variable number of arguments. As will be seen below, the arguments are of two types: on the one hand the argument that describes the output format, on the other hand the values to be formatted.

b) Format string: the argument of the format function that specifies and controls the representation of the variables. It is an ASCIIZ string (ASCII text terminated by a 0 byte) composed of text and format parameters, and it is written in quotes, for example: printf ("Today is November day %d.\n", 22);

c) Conversion character: the parameter that defines the type of conversion to be performed by the format function. For example %d (reads an integer), %f (floating point, reads a real number), %c (char, a single character), %s (string, reads a string of characters from memory), %x (reads a number and prints it in hexadecimal), etc.

- As mentioned above, the attack is implemented by supplying maliciously crafted input that the program does not validate adequately, so that the format function behaves differently than expected.

- Compared with Stack Overflow attacks, both kinds of attack make malicious use of the stack. However, while a Stack Overflow attack is designed to overwrite the contents of the stack, forcing the program to execute arbitrary instructions, the Format String attack relies on C conversion specifiers such as %s or %x: the format function interprets the specifiers contained in the user's input as its own parameters and reads the corresponding values from the stack.

- The consequences of a Stack Overflow attack usually involve altering the control flow of the program, which then executes instructions leading to outputs different from its original purpose. Format String attacks, by contrast, are oriented either to the disclosure of information stored in certain memory locations or to denial of service (DoS). Both cases can be examined in detail in the following example.

1 - FORMAT STRING ATTACK - EXAMPLE - Disclosure of information and DoS

- To illustrate the Format String attack, a simple program (fs.c) written in C will be used.

- The original purpose of the program is simply to echo on the screen the argument entered by the user on the command line.

- However, the program contains proprietary information ("INFORMACION 1" and "INFORMACION 2") that might be disclosed using a Format String attack.

- Editing and compiling the program in a Linux environment:

- Let's examine the code of the program. Two pointers (*s1 and *s2) are defined, pointing to memory locations that store certain reserved information:

- Also, an instruction has been introduced whose only aim is to print the argument argv[1] supplied by the user on the command line:

printf (argv [1]);

- Thus, the program is not at all intended to "reveal" the information stored in s1 and s2, but simply to echo on the screen the argument entered by the user.

- However, let's see how through manipulation of the command line input the result can be very different than expected.

- First, let's enter a string as the command line argument, which is echoed on the screen according to the intended purpose of the program:

- However, let's see what happens when the arguments entered by the user are conversion characters.

- Introducing %s:

- Introducing %s%s:

- Introducing HOLA%s%s:

- By introducing the converter %s on the command line, an information disclosure (information leakage) occurs, so that the behavior of the program differs from its original purpose.

- As defined in the program, the pointers *s1 and *s2 are stored on the stack in consecutive positions, before the argument argv[1] received from the command line.

- Thus, upon receiving the conversion character %s as input, the printf function reads the nearest string pointer from the stack and prints the string. Upon receiving %s%s it performs the same operation with the two strings referenced on the stack.

- To check the above concepts and analyze the contents of the stack, the GNU debugger gdb will be used on the program.

- The disassembly of the main function shows that the call to the printf function occurs at memory address 0x08048433:

- Setting a breakpoint just before the call to printf, at the above address 0x08048433:

- The program is run with the input argument HOLA:

- The content of the stack is analyzed:

- The arguments received by the printf routine are stored at the lower memory addresses of the stack:

- The stack has the following content, from low to high memory, with esp pointing to address 0xbffff598, which contains the string HOLA entered on the command line:

- This explains why, as printf reads the arguments stored on the stack, they are printed consecutively on the screen. In this case only HOLA is printed, because the gdb command (gdb) run HOLA was executed:

- Now, what would happen if the program were given more %s converters than there are strings stored on the stack? The program would begin to read meaningless memory addresses, printing strange characters:

- Finally, entering even more %s converters makes the program fail (segmentation fault), because it tries to access invalid addresses. For example:

- See the same result in gdb:

The program finishes running with a SIGSEGV signal indicating an invalid memory access, or segmentation fault.

- In this last case the attack results in a denial of service (DoS), since the program fails without fulfilling the purpose for which it was written.

- Another interesting converter for conducting Format String attacks is %x, which reads values from the stack and prints them in hexadecimal.

- Let's see what happens when the program is run in gdb with four %x converters:

- At that point of the execution, the contents of the stack are:

- The program runs to the end, and the output matches the contents of the stack, except the first memory location (whose content will be seen soon):

- The first contents of the stack (lowest memory addresses) are the converters themselves, %x%x%x%x, introduced as parameters:

- The same result can be observed by running the program directly from the command line, with four %x converters written in the form 0x%08x. The result is a dump of the stack content:

- That is, the values stored on the stack are obtained in hexadecimal format:

Since the program output matches the contents of the stack it is concluded that the information disclosure attack has been successful.

HEAP OVERFLOW - Overwriting the Heap with the vulnerable function strcpy


- The heap is a memory area used to store variables allocated dynamically at runtime. In C the function malloc() is used to allocate memory on the heap, and that memory is accessed through the pointer returned by malloc().

- For example: char *c = malloc(40); allocates 40 bytes during program execution and stores their address in the pointer c.

- The heap is managed at runtime by the allocator of the C library that the compiler links in, so heap overflow attacks are more difficult to exploit and replicate than other overflow attacks. In addition, the result depends heavily on the compiler used and on the libraries installed.

- The following example uses the CodeBlocks 13.2 C programming environment with the GNU GCC compiler under Windows. Testing the program in another environment or with another compiler would probably give a different result since, as has been said, dynamic memory allocation at runtime depends heavily on the environment, compiler and libraries used.


- In this exercise a successful Heap Overflow attack is performed, taking advantage of the inherent vulnerability of the C function strcpy.

- The C function strcpy copies the string pointed to by the source into the array pointed to by the destination, including the terminating null character, and returns the destination pointer.

- Its structure is as follows:

- strcpy is vulnerable to overflow because it does not check the size of the copy. If the source string is longer than the destination array, the copy runs past the end of the destination and overwrites adjacent memory locations. To avoid this problem, the destination array must be at least as large as the source string, including its terminating null character.

- The goal of the program is to show how a crafted argument can modify the expected output of the program by taking advantage of the vulnerability of strcpy, working specifically on the dynamically allocated memory area (the heap).

- The program is as follows:

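#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char *segundo = malloc(5); // segundo is declared before primero
    char *primero = malloc(5);

    strcpy(primero, "HOLA");   // primero holds HOLA
    printf("Saludo (%p): %s\n", primero, primero);

    strcpy(segundo, argv[1]);  // unchecked copy: a long argv[1] overflows segundo
    printf("Saludo (%p): %s\n", primero, primero);

    return 0;
}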

- At first glance, the normal operation of the program is to print the string HOLA twice, since HOLA has been copied by strcpy into the buffer pointed to by primero and the same statement is repeated twice: printf("Saludo (%p): %s\n", primero, primero);

- However, the intermediate instruction strcpy(segundo, argv[1]) can alter the normal operation of the program depending on the length of argv[1], since argv[1] could overwrite the contents pointed to by primero.

- To analyze the behavior of the program, both in normal operation (printing HOLA HOLA) and when malfunctioning (printing HOLA ADIOS), Immunity Debugger is used in a Windows environment.

- First, the program is run with the input argument ADIOS, aiming to show that this argument itself has no significance on the output of the program:

- The instruction strcpy(segundo, argv[1]) copies the string ADIOS, entered as the parameter argv[1], into the dynamic memory pointed to by segundo, at address 00580FB8:

- The string HOLA is stored at 0058FD8. The following memory dump shows the distance between ADIOS (segundo) and HOLA (primero).

- As explained later, for the overwrite to be possible it is imperative that segundo is declared before primero, so that segundo occupies a lower memory area than primero.

- As discussed above, if the input argv[1] is ADIOS, the program executes normally and no overwriting occurs:

- Now, let's see what happens when argv [1] argument is manipulated both in length and content.

- Running the program again, now with an input parameter consisting of 32 digits followed by the string ADIOS (the reason for using exactly 32 digits will be seen later):


- The instruction strcpy(primero, "HOLA") stores the string HOLA at memory address 003C1058:

- The memory dump shows that 48 4F 4C 41 (HOLA in hexadecimal) occupies the  003C1058 memory location:

- Initially the input parameter argv[1] is located at 003C0EE9:

- However, the instruction strcpy(segundo, argv[1]) copies into the dynamic memory pointed to by segundo the string "12345678901234567890123456789012ADIOS", entered previously as the argv[1] parameter:

- Thus, the input parameter argv[1] ends up stored at memory address 003C1038:

- What is the distance between 003C1038 and HOLA (003C1058)? It is exactly 32 bytes, because in hexadecimal 0x003C1058 - 0x003C1038 = 0x20 = 32 d.

- This is why the input parameter needs exactly 32 digits, followed by the string ADIOS, to overwrite HOLA.

- In this way, the memory location 003C1058 where HOLA was stored is now overwritten with the string ADIOS:


- The execution of the program and its output confirm the previous memory dump:

- Notice that the first printf("Saludo (%p): %s\n", primero, primero) is placed in the code before the overwriting, with the output:

Saludo (003C1058): HOLA

- However, the second printf("Saludo (%p): %s\n", primero, primero) is run after the overwriting, so the output is different:

Saludo (003C1058): ADIOS

- A remarkable aspect of this overflow is that it was achieved by calculating the exact location and size of the overwrite, which prevents the program from ending with an error.

INTEGER OVERFLOW - Altering the result of an arithmetic operation


- Integer Overflow happens when an arithmetic operation attempts to create a numeric value that is too large to be represented in its allocated storage space. 

- In programming, a variable is a memory space reserved to store a value corresponding to a data type supported by the language. Programming languages have several types of variables, and the size of the memory space reserved for a variable depends on the type with which it is defined.

- For example, ANSI C assigns these values to the Integer type:

- If values higher than permitted are assigned to char, short or int type variables, the program execution will not give any error, but the values will be truncated.

- ISO C99 considers the result of a signed integer overflow to be "undefined behavior", which means that conforming compilers may do whatever they want, from completely ignoring the overflow to aborting the program. Most compilers simply ignore the integer overflow.

- Integer overflows cannot be detected until they have occurred. This can be dangerous if the calculation has to do with the size of a buffer or the index of an array.

- Most integer overflows are not exploitable because memory is not directly overwritten, but sometimes they lead to other kinds of bugs. Integer overflow attacks do not by themselves overwrite memory areas, variables or code, but they can change the application logic and even circumvent memory structures created through unsafe variables. In other words, the result can be unexpected because the resulting value no longer follows the logic defined in the program.

- To prevent an integer overflow, numerical values must be checked comprehensively so that there are no unexpected errors, including a check that the entered values lie within a given range and, of course, a check of the size of the data type before using it.

1 - INTEGER OVERFLOW EXAMPLE - Altering the result of an arithmetic operation

- Let's consider this program written in C:

- The program accepts 3 arguments (int, int, char), sums the 3 of them and finally outputs the result. 

- For proper input, the arguments must be inside the range of values accepted by the int and char types. Let's see the normal operation of this program with the values 1, 2, and 3:

- However, if the program is given arguments whose sum is outside the valid range, the sum exceeds the capacity of the result variable and wraps around (the partial sum becomes a negative number). For instance, let's see what happens with this input:

- How is it possible that the sum of those numbers is 0, instead of 2147483647 + 2147483647 + 2 = 4294967296?

- Let's examine why the compiler considers that this sum is 0, instead of the expected result 4294967296.

- First of all, it is important to notice that the partial sum 2147483647 + 2147483647 = 4294967294 is outside the int value range (-2147483648 to +2147483647).

- To understand what the compiled program does, the numbers need to be written in Two's Complement format, which is the usual way negative numbers (those whose leftmost bit is 1) are represented:

- For instance, in Two's Complement format the number 1101 would not be 13 d, but -3 d:

(-1)*2^3 + 1*2^2 + 0*2^1 + 1*2^0 = -3 d

- So, the int arguments (2147483647) of this example must be converted from decimal to Two's Complement binary system format:

- Then, summing both up in binary:

  0111 1111 1111 1111 1111 1111 1111 1111 = 2147483647 d
  0111 1111 1111 1111 1111 1111 1111 1111 = 2147483647 d
  1111 1111 1111 1111 1111 1111 1111 1110 = ?

- Conversely, converting the result of the binary sum 1111 1111 1111 1111 1111 1111 1111 1110 from Two's Complement format to decimal:

- So, the compiler takes the result of the partial sum as the negative -2. Then, the total sum would be 0:

(-2) +2 = 0 

- Another possible integer overflow case for the same program would be as follows:

- In this last case:

- Summing:

  0111 1111 1111 1111 1111 1111 1111 1111 = 2147483647 d
  0111 1111 1111 1111 1111 1111 1000 0010 = 2147483522 d
  1111 1111 1111 1111 1111 1111 1000 0001 = ?

- Converting the result from Two's Complement to decimal:


- The final total sum:

(-127) + 127 = 0