How does the modulus operator work?

As a programmer, I’ve written a line or two of code that includes the modulus operator (i.e. “return x % 2“).  But, never have I paused to think: “How does the underlying system carry out this operation?” In this post, I limit “underneath the hood” to the lowest level (human readable) programming language: assembly.

So, I’ll take a program (written in C language) and dump it into assembly instructions. Then, I’ll explain each instruction in detail.

My background in assembly programming

Up until a few weeks ago, I had never studied assembly—I did flip threw a page or two of an assembly book, when a colleague mentioned, about five years ago, that his uncle programmed in assembly—and I certainly underestimated the role that assembly plays in my career.

Sure—the days of writing pure assembly language evaporated decades ago, but the treasure lies in understanding how programs, written in higher level languages (e.g “C, Perl, PHP, Python”), ultimately boil down to assembly instructions.

Modulus program

So, I contrived a trivial modulus program written in C—let’s dissect it:

// modulus.c

int modulus(int x, int y){
    return x % y;
}

Converting C code into assembly

Before a program can be executed by an operating system, we must first convert the program into machine code—we call this compiling.  Before we run our program (modulus.c) through the compiler, we need to discuss two arguments that we’ll pass to the compiler, in order to alter the default behavior.  The first argument, -0g, disables the compiler from optimizing the assembly instructions.  By default, the compiler (we’re using gcc) intelligently optimizes code—one way is by reducing the number of instructions.  The second argument, -S, instructs the compiler to stop at  just before it creates the machine code (unreadable by humans), and, instead, directs the compiler to create a file (modulus.s) containing the assembly instructions.

# gcc -Og -S modulus.c

The command above outputs a file, modulus.s, with the contents (unrelated lines removed):

modulus:
movl	%edi, %eax
cltd
idivl	%esi
movl	%edx, %eax
ret

Let’s step through and explain each of the assembly instructions, one-by-one.

mov

When we want our CPU to perform any action, such as adding, substracting, multiplying, or dividing numbers, we need to first move bytes of data (an intermediate value, or from memory) to a register.  We move data to registers with the mov command, which is capable of moving data from:

  • register to memory
  • memory to register
  • register to register
  • immediate value to memory
  • immediate value to register

The assembly above first moves data from one register (%edi) to another register (%eax).  This is necessary since subsequent instructions, such as cltd, rely on data being present in the %eax register.

cltd

cltd stands for “convert long to double long.”  But before we dig into why we need this instruction, we must detour and briefly explain the next instruction in line, idivl.

When we send an idivl (divide long) instruction, the processor calcutes the numbers and store the quotient in one register, and stores the remainder in another.  These two values stored in the register are half the size; if the dividend is 32 bits, the cpu stores (1) 16-bit value in one register, and the other 16-bit value in another.

Therefore, if we expect a 32-bit quotient and 32-bit remainder, then the dividend (which is, presumably, 32-bits) must be doubled—to 64-bits.  In our modulus program, we set a return value of type int.

idivl

idivil divides the numerator (stored in %eax) by the argument (or denominator) — the argument that we passed to the instruction.  In our assembly example above, idivl divides the value stored in %rax, by the value in %esi—x and y, respectively.

movl

At the end of a sequence, assembly (by default) returns whatever value is stored in the register %rax.  Therefore, the final instruction moves the remainder, not the quotient, from %edx to %rax.

Wrapping up

I hope I was able to share a thing or two on how a higher level program ultimately breaks down into simple assembly instructions.

[1] I was banging my head against the wall until I found a easy to understand reason why we must convert long to double long: http://www.programmingforums.org/post12771.html