Generating Machine Code at Runtime
Okay, so my next attempt at learning how my computer works and how to speak machine language is the following C code fragment:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef int (*FuncPtr)();

    int main(void)
    {
        // Create a function:
        unsigned char testFunc[] = {
            0x90,                          // NOP (not really necessary...)
            0xB8, 0x10, 0x00, 0x00, 0x00,  // MOVL $16,%eax
            0xC3                           // RET
        };

        // Make a copy on the heap, OS doesn't like executing the stack:
        FuncPtr testFuncPtr = (FuncPtr) malloc(sizeof(testFunc));
        memmove( (void*) testFuncPtr, testFunc, sizeof(testFunc) );

        printf("Before function.\n");
        int result = (*testFuncPtr)();
        printf("Result %d\n", result);
        return 0;
    }
Basically, this stores the raw opcodes of a function in an array of chars. The first byte of each line is usually the opcode, i.e. 0x90 is No-Op, 0xB8 is a MOVL into the eax register (with the next 4 bytes being the number to store, least significant byte first, in this case 16), and 0xC3 is the return instruction (I had to look up the opcodes in Intel's documentation).
One thing to watch out for here (at least on Mac OS X) is that you'll get a bad access error if you try to execute testFunc directly. That's because testFunc lives on the stack, and the stack is marked non-executable as a small safety measure. So we simply malloc some memory on the heap and stuff our code in there.
You may wonder why I'm using eax of all registers to store my number 16 in. Easy: Because the convention is that an int return value (and most other 4-byte return values) goes in eax when a function returns. So, what this does is it essentially returns 16. Which our printf() proves. Neat!
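To return something other than 16, only the four bytes after 0xB8 need to change. The following sketch writes the "movl $N,%eax" and "ret" bytes for an arbitrary 32-bit value; emit_return_constant is a hypothetical name chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes "movl $value,%eax; ret" into `out`,
 * which must have room for 6 bytes. Returns the number of bytes written. */
size_t emit_return_constant(unsigned char *out, int32_t value)
{
    uint32_t v = (uint32_t) value;

    out[0] = 0xB8;                 /* MOVL imm32,%eax */
    out[1] = (unsigned char) (v & 0xFF);          /* immediate, lowest byte first */
    out[2] = (unsigned char) ((v >> 8) & 0xFF);
    out[3] = (unsigned char) ((v >> 16) & 0xFF);
    out[4] = (unsigned char) ((v >> 24) & 0xFF);
    out[5] = 0xC3;                 /* RET */
    return 6;
}
```

Writing the immediate byte by byte keeps the output in x86 byte order no matter what machine generates the code.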
Intel's documentation describes the opcodes in a very complicated way, so what I essentially do is write some assembler code and bracket the instruction whose byte sequence I want to find out with instructions whose byte sequence I already know (I like to use six nops, which are short and show up as 0x90 90 90 90 90 90). Then I compile that and use a hex editor to search for the known instructions; whatever is between them must be my new one. Here's a small table of other operations you may find in a typical program and the byte sequences they turn into:
0x50 | pushl %eax |
0x53 | pushl %ebx |
0x55 | pushl %ebp |
0x89 E5 | movl %esp, %ebp |
0x90 | nop |
0xB8 NN NN NN NN | movl $N, %eax |
0x68 NN NN NN NN | pushl $N |
0xE8 NN NN NN NN | call relativeOffsetNFromEndOfInstruction |
0x8B 1C 24 | movl (%esp), %ebx |
0x8D 83 NN NN NN NN | leal relativeOffsetToData(%ebx), %eax |
0x8D 85 NN NN NN NN | leal relativeOffsetToData(%ebp), %eax |
0x5B | popl %ebx |
0x83 C4 NN | addl $NN,%esp |
0x83 EC NN | subl $NN,%esp |
0x8B 00 | movl (%eax), %eax |
0x89 45 NN | movl %eax, NN(%ebp) |
0xC9 | leave |
0xC3 | ret |
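The hex-editor search for the six-nop brackets can also be automated. This is a rough sketch, assuming the compiled file has already been read into a buffer; the function names are invented for this example.

```c
#include <stddef.h>
#include <string.h>

#define SENTINEL_LEN 6

/* Returns the index of the first run of six 0x90 bytes at or after
 * `from`, or -1 if there is none. */
static long find_sentinel(const unsigned char *buf, size_t len, size_t from)
{
    static const unsigned char nops[SENTINEL_LEN] =
        { 0x90, 0x90, 0x90, 0x90, 0x90, 0x90 };

    for (size_t i = from; i + SENTINEL_LEN <= len; i++) {
        if (memcmp(buf + i, nops, SENTINEL_LEN) == 0)
            return (long) i;
    }
    return -1;
}

/* Finds the bytes enclosed by the first two six-nop runs in `buf`.
 * On success stores the offset of the enclosed bytes in *start and
 * returns how many there are; returns -1 if two runs are not found. */
long bytes_between_sentinels(const unsigned char *buf, size_t len,
                             size_t *start)
{
    long first = find_sentinel(buf, len, 0);
    if (first < 0)
        return -1;

    size_t begin = (size_t) first + SENTINEL_LEN;
    long second = find_sentinel(buf, len, begin);
    if (second < 0)
        return -1;

    *start = begin;
    return second - (long) begin;
}
```

This simple version gets confused if the instruction being measured starts or ends with 0x90 itself, or if the file happens to contain another run of six nops before the bracketed one.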
The code fragment above is essentially the core of what one would need to create a just-in-time compiler: generate machine code in memory, then jump to it. For a real compiler, instead of executing this directly, we'd have to write it to a complete Mach-O file and link it with crt1.o.
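A code generator that emits the 0xE8 call from the table has to compute the relative offset itself, measured from the end of the 5-byte instruction. A small sketch, assuming 32-bit addresses as in the rest of this post; emit_call is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes a 5-byte "call rel32" into `out`.
 * call_addr   = address the call instruction will have at run time
 * target_addr = address of the function to call
 * Returns the number of bytes written. */
size_t emit_call(unsigned char *out, uint32_t call_addr, uint32_t target_addr)
{
    /* Offset is relative to the byte after the instruction. Unsigned
     * wraparound yields the two's-complement encoding for backward calls. */
    uint32_t rel = target_addr - (call_addr + 5);

    out[0] = 0xE8;                                /* CALL rel32 */
    out[1] = (unsigned char) (rel & 0xFF);        /* lowest byte first */
    out[2] = (unsigned char) ((rel >> 8) & 0xFF);
    out[3] = (unsigned char) ((rel >> 16) & 0xFF);
    out[4] = (unsigned char) ((rel >> 24) & 0xFF);
    return 5;
}
```

For example, a call placed at 0x1000 that targets 0x1010 gets the offset 0x0B, because the instruction ends at 0x1005.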
Update: on top of the instructions for position-independent code (PIC), I've also added some more that are useful for passing structs as parameters on the stack.