I often find myself using hex-strings of assembly instructions in C++ programs, for example "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b" (a snippet from http://www.phrack.org/phrack/49/P49-14, a canonical example of shellcode). Such hex-strings are common in penetration-testing tools and code-injection tools.
I was working on creating a code-injection tool in C++ last night to help with my malware analysis work. Since the code that I needed to inject was a buffer of x86 assembly instructions, I used RTA to type up the assembly code, saved the file, opened it in my hex editor, copied the instructions as a hex-string, and pasted it into my injector project. I could have used HIEW or OllyDbg or something else instead of RTA; I could have even written the assembly code in an __asm{...} block in C++ and compiled it to get the instructions. However, all of these solutions required copying a hex-string back into my injector program. This gets even more annoying if I want to <gasp> update my assembly code!
I thought, "wouldn't it be nice if I could write the assembly code directly into my C++ program and be able to make use of that buffer without using any hex-strings?"
Well, I decided to implement a solution:
    typedef struct _ASSEMBLY_BUFFER
    {
        void*         pBuffer;
        unsigned long ulSize;
    } ASSEMBLY_BUFFER, *PASSEMBLY_BUFFER;

    //
    // Gets a pointer to the x86 assembly code buffer starting at the buffer_begin
    // label. Also gets the size of the buffer.
    //
    // __fastcall passes the first argument in ECX, which the inline assembly
    // uses directly.
    //
    void __fastcall GetAssemblyBuffer(PASSEMBLY_BUFFER)
    {
        __asm
        {
            mov eax, offset buffer_begin  ; Get address of first instruction in assembly
            mov [ecx], eax                ; buffer and save it to .pBuffer
            mov edx, offset buffer_end
            sub edx, eax                  ; Determine difference between beginning and end
            mov [ecx+4], edx              ; of assembly buffer, and save it to .ulSize
        }
        return;                           // Return before execution reaches the buffer

        __asm
        {
        buffer_begin:

            <assembly code>               ; Our assembly code buffer

        buffer_end:
        }
    }

Figure 1. GetAssemblyBuffer(...) function and typedef.
We simply put our assembly code between the buffer_begin and buffer_end labels, and can then use GetAssemblyBuffer(...) to access it.
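The mov [ecx+4], edx store in Figure 1 hard-codes the offset of ulSize, which is only correct where pointers are 4 bytes wide. That matches MSVC's inline assembler, which is only available when targeting 32-bit x86. The following is a minimal sketch of how that assumption could be made explicit; LayoutMatchesInlineAsm is a hypothetical helper that is not part of the original code.

```cpp
#include <cstddef>  // offsetof

// Same layout as the ASSEMBLY_BUFFER typedef in Figure 1.
typedef struct _ASSEMBLY_BUFFER
{
    void*         pBuffer;
    unsigned long ulSize;
} ASSEMBLY_BUFFER, *PASSEMBLY_BUFFER;

// The inline assembly writes the pointer at [ecx] and the size right after it.
static_assert(offsetof(ASSEMBLY_BUFFER, pBuffer) == 0,
              "pBuffer must be the first member");
static_assert(offsetof(ASSEMBLY_BUFFER, ulSize) == sizeof(void*),
              "ulSize must directly follow pBuffer");

// Hypothetical helper: true only when the hard-coded [ecx+4] offset is valid,
// i.e. when pointers are 4 bytes wide (a 32-bit build).
constexpr bool LayoutMatchesInlineAsm()
{
    return sizeof(void*) == 4 && offsetof(ASSEMBLY_BUFFER, ulSize) == 4;
}
```

In a 32-bit build, a static_assert(LayoutMatchesInlineAsm(), "...") placed next to GetAssemblyBuffer(...) would catch a changed struct layout at compile time rather than at run time.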
Take the following program for example:
    #include <stdio.h>

    typedef struct _ASSEMBLY_BUFFER
    {
        void*         pBuffer;
        unsigned long ulSize;
    } ASSEMBLY_BUFFER, *PASSEMBLY_BUFFER;

    //
    // Gets a pointer to the x86 assembly code buffer starting at the buffer_begin
    // label. Also gets the size of the buffer.
    //
    // __fastcall passes the first argument in ECX, which the inline assembly
    // uses directly.
    //
    void __fastcall GetAssemblyBuffer(PASSEMBLY_BUFFER)
    {
        __asm
        {
            mov eax, offset buffer_begin  ; Get address of first instruction in assembly
            mov [ecx], eax                ; buffer and save it to .pBuffer
            mov edx, offset buffer_end
            sub edx, eax                  ; Determine difference between beginning and end
            mov [ecx+4], edx              ; of assembly buffer, and save it to .ulSize
        }
        return;                           // Return before execution reaches the buffer

        __asm
        {
        buffer_begin:

            mov eax, 15DBh                ; Our assembly code buffer
            rol eax, 13h
            xor eax, 0DEADBEEFh
            shr eax, 10h
            mov ebx, eax
            shl eax, 2
            add eax, ebx
            add eax, ebx
            add eax, ebx
            add eax, 4

        buffer_end:
        }
    }

    int main(int argc, char** argv)
    {
        ASSEMBLY_BUFFER asmbuf = {0};
        GetAssemblyBuffer(&asmbuf);

        // Print every byte of the buffer as a \xNN escape.
        printf("Assembly code buffer:\n");
        for (unsigned long i = 0; i < asmbuf.ulSize; i++)
        {
            printf("\\x%02x", ((unsigned char*)asmbuf.pBuffer)[i]);
        }

        return 0;
    }

Figure 2. Sample program that uses GetAssemblyBuffer(...).
The program above would output:
    Assembly code buffer:
    \xb8\xdb\x15\x00\x00\xc1\xc0\x13\x35\xef\xbe\xad\xde\xc1\xe8\x10\x8b\xd8\xc1\xe0\x02\x03\xc3\x03\xc3\x03\xc3\x83\xc0\x04

Figure 3. Output of sample program above.
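The printing loop in the sample program can also be factored into a small function that returns the hex-string instead of printing it, which is handy for logging or for comparing against a known-good dump. This is only a sketch; ToHexString is a hypothetical name that does not appear in the original code.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats `size` bytes starting at `data` as a sequence
// of "\xNN" escapes, the same format the sample program prints.
std::string ToHexString(const void* data, std::size_t size)
{
    const unsigned char* bytes = static_cast<const unsigned char*>(data);

    std::string out;
    out.reserve(size * 4);                  // each byte becomes 4 characters

    char escape[5];                         // "\xNN" plus the terminating NUL
    for (std::size_t i = 0; i < size; i++)
    {
        std::snprintf(escape, sizeof(escape), "\\x%02x",
                      static_cast<unsigned>(bytes[i]));
        out += escape;
    }
    return out;
}
```

With the sample program's buffer, this would be called as ToHexString(asmbuf.pBuffer, asmbuf.ulSize). Because the length is passed explicitly, embedded zero bytes such as the \x00\x00 in the first instruction are formatted like any other byte.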
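With this functionality, we can now do things like WriteProcessMemory(hProcess, lpBaseAddress, asmbuf.pBuffer, asmbuf.ulSize, lpNumberOfBytesWritten) or send(s, (const char*)asmbuf.pBuffer, asmbuf.ulSize, flags) without having to paste any hex-strings into our C++ code. (The cast is needed because send(...) takes a const char* buffer, and C++ does not convert void* to it implicitly.)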
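One caveat: the pointer returned by GetAssemblyBuffer(...) refers to the program's own code, which is normally not writable. If the bytes need to be adjusted before they are injected (for example, to fill in an address), one option is to work on a private copy. The sketch below assumes that approach; CopyBytes is a hypothetical helper that is not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies `size` bytes starting at `data` into a vector
// that the caller owns and is free to modify.
std::vector<unsigned char> CopyBytes(const void* data, std::size_t size)
{
    const unsigned char* first = static_cast<const unsigned char*>(data);
    return std::vector<unsigned char>(first, first + size);
}
```

The copy would be made with CopyBytes(asmbuf.pBuffer, asmbuf.ulSize), and its data() and size() members can then be passed wherever asmbuf.pBuffer and asmbuf.ulSize would have been used.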