The New Jorviker

How do JITs work?

Earlier today, for whatever reason, I was struck with the idea of writing my own JIT – not rewriting V8 for Javascript or anything like that. I just wanted to see if I could translate my abstract understanding of how a JIT works into reality. So I gave it a shot and pushed it to Github, calling it Tinyjit.

Truth is, what I have there right now doesn’t technically merit the term “just-in-time compiler”. It doesn’t compile anything. At the moment, the binary I’ve written will just square a given integer:

$ ./tinyjit 5
25

What’s interesting about it is the way that it squares the number. Instead of writing a function that looks like this:

unsigned int square(unsigned int a) {
    return a * a;
}

I have a section of code that looks like this:

static constexpr uint8_t square_procedure_code[] = {
  0x55,             /* push %rbp */
  0x48, 0x89, 0xe5, /* mov %rsp, %rbp */
  0x89, 0x7d, 0xfc, /* mov %edi, -0x4(%rbp) */
  0x8b, 0x45, 0xfc, /* mov -0x4(%rbp), %eax */
  0x0f, 0xaf, 0xc0, /* imul, %eax, %eax*/
  0x5d,             /* pop %rbp */
  0xc3              /* retq */
};

This is the x86 machine code basically corresponding to the C++ function I wrote above. I define a function interface – taking one unsigned integer as input and returning one unsigned integer as output:

typedef unsigned int (*procedure)(unsigned int);

Code is data and data is code. I’d like to just execute this code verbatim. And by that, I mean I’d like to write something like the following:

static procedure square_procedure =
    reinterpret_cast<procedure>(square_procedure_code);

int main() {
    std::cout << square_procedure(5) << std::endl;
}

If you do that, you’ll get a segmentation fault at runtime. The problem is that the page your square_procedure_code variable is allocated in is not executable. There are several reasons for this. Some are security related. You don’t want a buffer overrun to cause you to execute arbitrary code. And others are more technical. x86 is a modified Harvard architecture, not a Von Neumann architecture, so there are separate storage and cache pathways for instructions and data.

The way that I get around this is by allocating an executable page in memory, copying the code into the page, and only then executing it. That looks like this:

void init_procedures() {
  square_procedure = reinterpret_cast<procedure>(
    mmap(nullptr, sizeof(square_procedure_code),
         PROT_EXEC | PROT_READ | PROT_WRITE,
         MAP_PRIVATE | MAP_ANONYMOUS,
         -1, 0));
  memcpy((void*)square_procedure, (void*)square_procedure_code,
      sizeof(square_procedure_code));
}

Take a look at the mmap documentation. It’s really all you need. We request a mapping big enough for our code that’s not backed by a file, making it readable, writable and executable. Then, we copy our code from above into it.

Now we can use our function pointer like any other:

int main(int argc, char ** argv) {
    if (argc != 2) {
        std::cerr << "USAGE: " << argv[0] << " to_square" << std::endl;
        exit(1);
    }

    init_procedures();

    int to_square = std::atoi(argv[1]);
    std::cout << square_procedure(to_square) << std::endl;
}

If you were going to actually write a JIT, you would be programmatically generating the contents of square_procedure_code. You’d probably also need a spec for some sort of language, as well as a lexer and parser. But that doesn’t particularly interest me at the moment. I think there are plently of dynamically typed languages out there.

At the moment, I think the most promising applications of runtime code generation like this are in much more tactical places. Logic-oriented programming is an area with some ripe fruit. But general logic programming is very ambitious. I like smaller domains. Think about regular expression engines that compile a regex down to actual machine code. Now that’s warp speed.