When 5 + 5 Equals 11: Reflections on Writing a C Compiler, Part One
How complex can a five-character expression be?
The seemingly innocent i + i++ produces different results on different compilers, exposing one of C's most insidious features: undefined behavior.
In this post, I'll explore how something as fundamental as the order of expression evaluation became a minefield in C, why modern compilers still disagree on basic operations, and how I'm addressing these issues in my hobby compiler NCC.
Along the way, we'll peek under the hood at compiler intermediate representations and discover why "simple" languages can be surprisingly complex.
What is Undefined Behavior?
The C standard defines what constitutes a valid C program and how that program should behave. However, the standard leaves some gaps in its specification. There are two categories of what I'll call underspecified behavior:
Unspecified Behavior: Cases where there are multiple valid behaviors, and the compiler can choose any of them.
An example of this is the order of evaluation of operands within binary expressions. The standard does not mandate whether the left or right operand is evaluated first. The compiler can choose either order, and the choice does not have to be consistent from one expression to the next. Whatever order it picks, though, it must still evaluate both operands before evaluating the expression itself.
Undefined Behavior: The standard provides no requirements.
In this case the compiler can do whatever it wants. The likely outcome is that the compiler picks a behavior most programmers would expect, but it's under no obligation to do so. This may seem ridiculous at first, but it's not without good reason. Another way to think about undefined behavior is that it's the standard's way of telling the compiler writer, "These situations do not occur, proceed as you wish". Being able to ignore edge cases that well-crafted C programs should avoid enables the compiler to make aggressive optimizations, which is desirable for programs that don't rely on undefined behavior. For example, the compiler can unroll loops even when it can't guarantee that the induction variable won't overflow, because the responsibility to avoid overflow, itself undefined behavior, rests with the programmer. If you want to learn more about this I highly recommend this blog post.
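To make that concrete, here's a minimal sketch (my own illustration, not drawn from any particular compiler's documentation) of the kind of loop this lets a compiler optimize:

/* Because signed overflow is undefined, the compiler may assume i + stride
 * never wraps around. That lets it treat i as strictly increasing, compute
 * the trip count up front, and unroll or vectorize the loop. If i were
 * unsigned, where wrap-around is well defined, that assumption would be invalid. */
void scale(int *a, int n, int stride) {
    for (int i = 0; i < n; i += stride) {
        a[i] = a[i] * 2;
    }
}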
Furthermore, many cases of undefined behavior cannot be detected statically. Runtime checks could be inserted, but they would add significant overhead to all C programs. A simple example is accessing an array out of bounds: we could easily insert bounds checks, but that adds overhead to every array access.
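As a rough sketch of what such a check could look like (an illustration, not how any production compiler actually implements it), every access would pay a comparison and a branch:

#include <stdlib.h>

/* Hypothetical checked read: the length must be carried alongside the array,
 * and every access pays a comparison and a branch before the load. C avoids
 * this cost by making out-of-bounds access undefined instead. */
int checked_get(const int *a, int len, int i) {
    if (i < 0 || i >= len) {
        abort(); /* fail loudly rather than read out of bounds */
    }
    return a[i];
}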
Undefined Behavior in Practice
Undefined behavior is a painful part of C programming, responsible for many bugs and security vulnerabilities. C is often described as a small and simple language; however, the ambiguity that comes with it adds significant complexity for the end user. The decades since C was created have taught us that programmers aren't particularly good at avoiding undefined behavior, and much of computer science has focused on how to make programming safer. As an aside, I agree that C is a small language, but I would argue that languages traditionally considered more "complex", like Rust, are actually simpler to use due to their more rigorous specifications.
There are many forms of undefined behavior in C. This post will mostly focus on evaluation order in expressions, but we'll touch on a few others at the end. You might be surprised to learn that even the simple snippet below is enough to get inconsistent results.
/*
 * What should this program return?
 */
int main(void) {
    int i = 5;
    int result = i + i++;
    return result;
}
In standard C, expressions like i + i++ exhibit undefined behavior. The C standard doesn't specify:
- Whether the left or right operand is evaluated first
- When side effects (like the increment in i++) take effect
With that in mind, let's take another look at the snippet above.
With Left-to-Right
- Evaluate the left i → 5
- Evaluate i++ → 5, then increment i
- Add: 5 + 5 = 10
With Right-to-Left
- Evaluate i++ → 5, then increment i
- Evaluate the left i → 6
- Add: 5 + 6 = 11
On GCC we return 11, while on Clang we get 10.
Both compilers issue warnings about unsequenced modifications and accesses to i: Clang by default with -Wunsequenced, and GCC with -Wsequence-point if warnings are enabled.
There are a few things we need to understand here.
- The evaluation order of operands in binary expressions is unspecified (not undefined), meaning the compiler can choose any order it wants.
- We are accessing (on the left-hand side) and modifying (on the right-hand side) the same object i
- There is no sequence point between the access and the modification, which leads to undefined behavior.
This is called a sequence point violation, and these are not limited to contrived examples.
Function calls with side effects like f() + g() can violate sequence points if both modify the same state.
Array indexing expressions such as arr[i] = i++ are another common pitfall.
These patterns appear frequently in real code, making sequence point violations one of C's most insidious sources of bugs.
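Here's a minimal standalone version of that last pitfall. Which element ends up written, and therefore what the program returns, depends entirely on the compiler:

/* Undefined: i is read (to index arr) and modified (by i++) with no sequence
 * point in between, so the store may target arr[0] or arr[1], and the
 * standard technically permits anything at all. */
int main(void) {
    int arr[2] = {7, 7};
    int i = 0;
    arr[i] = i++;                /* stores 0 into... which element? */
    return arr[0] * 10 + arr[1]; /* 7 or 70, depending on the compiler */
}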
Standardizing Evaluation Order
By standardizing the evaluation order, we can eliminate this source of undefined behavior. A strict evaluation order creates an implicit sequence point between the evaluation of the left and right operands. If only all cases of undefined behavior were this easy to solve!
Naively, one (like myself) may think this is a trivial problem to solve: just always evaluate the left side first. As you'll see, that approach is almost sufficient.
Why Doesn't C Do this?
Sequencing evaluation order does come with a hidden cost. Register allocation becomes more difficult, as the compiler must ensure that the left-hand side is fully evaluated and stored before evaluating the right-hand side. Other optimization opportunities are also lost; vectorization, for example, may require reordering operations, which a strict sequence forbids.
Sequencing is certainly the simplest solution, but it's not the only way to solve this problem. Rust, for example, attacks it at compile time: the borrow checker rejects programs in which one operand could mutate data that another operand reads, so these conflicts never make it into a binary. This is a more complex solution, but it allows full optimization opportunities while avoiding undefined behavior.
Basic cases like our example above can be detected with static analysis, but more complex cases involving pointers and aliasing range from difficult to impossible to detect. For example, consider the following:
#include <stdio.h>

int f(int *p, int val) {
    *p = val;
    return val;
}

int main(void) {
    int i = 1;
    int result = f(&i, 2) + f(&i, 3); // Undefined: order of function calls
    printf("result = %d, i = %d\n", result, i);
    return 0;
}
Neither GCC nor Clang warns about this (still quite simple) example.
We can't retrofit a borrow checker into C because of core language constructs like pointer arithmetic. Safety has to be a core part of the language design, something I'll keep in mind if I finish this project and make a language of my own.
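To see why, here's a small sketch: whether the two operands below touch the same object depends on a runtime value, so no purely static analysis can rule the conflict in or out.

/* If idx == i, the right operand modifies the very element the left operand
 * reads, with no sequence point in between; if they differ, the expression
 * is perfectly fine. The compiler has no way to tell which case it's in. */
int add(int *arr, int i, int idx) {
    int *p = arr + idx; /* pointer arithmetic: the target is chosen at runtime */
    return arr[i] + (*p)++;
}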
Enter NCC
Enter NCC, Not Completely C: my hobby C compiler written in Rust. NCC implements a substantial subset of C's control flow and expression evaluation within a single function scope. It roughly follows Writing a C Compiler by Nora Sandler, an excellent read proposing an iterative approach to compiler design. The book describes the core concepts of building a compiler without spoon-feeding the reader an implementation. Already, many of the features NCC implements are suggested by, but not otherwise covered in, the book, including prefix/postfix operators, compound assignment, bitwise operators, and goto statements. The most concise way I can describe what's currently implemented is with the formal grammar, so without further ado:
<program> ::= <function>
<function> ::= "int" <identifier> "(" "void" ")" <block>
<block> ::= "{" { <block-item> } "}"
<block-item> ::= <statement> | <declaration>
<declaration> ::= "int" <identifier> [ "=" <exp> ] ";"
<for-init> ::= <declaration> | [ <exp> ] ";"
<statement> ::= "return" <exp> ";"
| <exp> ";"
| "if" "(" <exp> ")" <statement> [ "else" <statement> ]
| "goto" <identifier> ";"
| <identifier> ":" <statement>
| <block>
| "break" ";"
| "continue" ";"
| "while" "(" <exp> ")" <statement>
| "do" <statement> "while" "(" <exp> ")" ";"
| "for" "(" <for-init> [ <exp> ] ";" [ <exp> ] ")" <statement>
| "switch" "(" <exp> ")" <statement>
| "case" <exp> ":" <statement>
| "default" ":" <statement>
| ";"
<exp> ::= <factor> | <exp> <binop> <exp> | <exp> <assign-op> <exp>
| <exp> "?" <exp> ":" <exp> | <exp> "++" | <exp> "--"
<factor> ::= <int> | <identifier> | <unop> <factor> | "++" <factor> | "--" <factor> | "(" <exp> ")"
<unop> ::= "-" | "~" | "!"
<binop> ::= "-" | "+" | "*" | "/" | "%" | "&" | "|" | "^" | "<<" | ">>" | "&&" | "||"
| "==" | "!=" | "<" | "<=" | ">" | ">="
<assign-op> ::= "=" | "+=" | "-=" | "*=" | "/=" | "%=" | "&=" | "|=" | "^=" | "<<=" | ">>="
<identifier> ::= ? An identifier token ?
<int> ::= ? A constant token ?
Don't worry if this doesn't mean anything to you. Basically, I've implemented all the major control flow statements and operators, allowing the compilation of real programs. The major missing features are functions, structs, arrays, and pointers. The code is available on GitHub.
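For a concrete sense of what that covers, here's the sort of toy program that (as best I can tell) stays entirely within the grammar above: a single function and integer-only arithmetic, but real control flow.

int main(void) {
    int total = 0;
    for (int i = 0; i < 10; i += 1) {
        switch (i % 3) {
            case 0:
                total += i; /* compound assignment */
                break;
            case 1:
                total -= 1;
                break;
            default:
                continue; /* skip every third iteration */
        }
    }
    return total > 0 ? total : -total;
}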
How does NCC handle this?
With NCC, we aim to enforce a strict left-to-right evaluation order for expressions. Since NCC evaluates the left-hand side first, I expected it to return 10, but it was returning 11! As far as the C standard is concerned, this is a valid result. It's also one of the two reasonable results, in my opinion. That said, it's not what I expected or want, so let's fix it.
Debugging NCC
NCC has a fairly comprehensive test suite that was passing before I started this investigation. This gives me confidence that all the operators (including ++) are working correctly.
Evaluation Order Verification
Since we're enforcing a strict left-to-right evaluation order, we should have a test case for that behavior. I didn't have much doubt that this was working in the current implementation, but it's important to keep in mind when adding optimization passes.
int main(void) {
    int b = 0;
    // If left is evaluated first, b becomes 1 before right side
    // Result: 0 || 1 = true (returns 5)
    // If right is evaluated first: 0 || 0 = false (returns 7)
    if (b++ || b)
        return 5;
    return 7;
}
Simple Binary with Postfix
int main(void) {
    int i = 5;
    int result = i + i++; // Should be 5 + 5 = 10
    return result;
}
Checking the IR
With NCC, we have CLI options to dump the program representation at various stages of compilation. NCC uses a three-address code intermediate representation called TACKY, which follows the parsing and semantic analysis stages. Looking at the TACKY generated for our failing test case immediately revealed the problem:
Program {
  function: FunctionDefinition {
    name: "main", body: [
      Copy { src: Constant(5), dst: Var("i.1") },  # i = 5
      Copy { src: Var("i.1"), dst: Var("postfix_org.1") },  # Save original value of i (5) for i++
      Binary { op: Add, src1: Var("i.1"), src2: Constant(1), dst: Var("postfix_new.1") },  # Calculate i + 1 = 6
      Copy { src: Var("postfix_new.1"), dst: Var("i.1") },  # BUG: Update i to 6 BEFORE using it in addition!
      Binary { op: Add, src1: Var("i.1"), src2: Var("postfix_org.1"), dst: Var("temp.1") },  # Add: i (now 6!) + postfix_org (5) = 11
      Copy { src: Var("temp.1"), dst: Var("result.1") },  # Store result
      Return(Var("result.1")), Return(Constant(0))
    ]
  }
}
ncc --tacky displays a pretty-printed tree, but I don't have a way to make that highlighted HTML, so we'll make do.
The postfix increment was updating i to 6 before the addition operation used the first i. This meant the addition was using the already-incremented value.
The Solution
The fix was to capture the value of the left operand before evaluating the right operand. In the TACKY generation for binary expressions:
// Before (buggy):
parser::Expr::Binary(op, e1, e2) => {
    let src1 = tackify_expr(e1, instructions, name_generator);
    let src2 = tackify_expr(e2, instructions, name_generator);
    // src1 is just a reference - will use the current value when executed
    ...
}

// After (fixed):
parser::Expr::Binary(op, e1, e2) => {
    let mut src1 = tackify_expr(e1, instructions, name_generator);
    // If src1 is a variable, copy it to capture its current value
    if let Val::Var(_) = &src1 {
        let temp = Val::Var(name_generator.next("binary_left"));
        instructions.push(Instruction::Copy {
            src: src1.clone(),
            dst: temp.clone(),
        });
        src1 = temp; // Use the captured value
    }
    let src2 = tackify_expr(e2, instructions, name_generator);
    ...
}
This ensures that when src1 is a variable, we capture its value in a temporary before evaluating src2 (which might have side effects that modify the variable).
The Result
After the fix, the TACKY IR for our previously failed test case becomes:
Program {
  function: FunctionDefinition {
    name: "main", body: [
      Copy { src: Constant(5), dst: Var("i.1") },  # i = 5
      Copy { src: Var("i.1"), dst: Var("binary_left.1") },  # FIX: Capture left operand value (5) FIRST
      Copy { src: Var("i.1"), dst: Var("postfix_org.1") },  # Save original value for i++ (5)
      Binary { op: Add, src1: Var("i.1"), src2: Constant(1), dst: Var("postfix_new.1") },  # Calculate i + 1 = 6
      Copy { src: Var("postfix_new.1"), dst: Var("i.1") },  # Update i to 6 (side effect happens here)
      Binary { op: Add, src1: Var("binary_left.1"), src2: Var("postfix_org.1"), dst: Var("temp.1") },  # Add: binary_left (5) + postfix_org (5) = 10
      Copy { src: Var("temp.1"), dst: Var("result.1") },  # Store result
      Return(Var("result.1")), Return(Constant(0))
    ]
  }
}
Now the addition uses the captured value (5) instead of the modified value (6).
By ensuring strict left-to-right evaluation and capturing values before side effects can modify them, we've eliminated a class of undefined behavior from our compiler. This makes programs compiled with NCC more predictable and easier to debug, a worthwhile tradeoff for the slight additional complexity in the compiler implementation.
The fix was surprisingly simple once we understood the problem: just capture variable values in temporaries before evaluating expressions that might modify them. Sometimes the best solutions are the simplest ones!
Other Undefined Behaviors
There are a few other cases of undefined behavior in the subset of C that NCC implements so far.
- Signed Integer Overflow
I've defined this to wrap around using two's complement arithmetic (see the sketch after this list). This is consistent with how most modern hardware behaves, and avoids the overhead of detecting overflow at runtime. I'll likely add a flag to enable trapping on overflow in the future.
- Uninitialized Variables
I haven't done anything special to handle this case yet, but we could pick a default value and initialize all variables at declaration. This would come with a high performance cost (think arrays) for likely unneeded initialization, and I'm skeptical that zero is substantially better than garbage values. Once again, due to pointer arithmetic and aliasing, a static analysis approach like Rust's borrow checker isn't possible.
- Divide By Zero
This would have to be another runtime check.
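To make the first of those concrete, here's a sketch of what NCC's wrap-around rule means in practice, assuming a 32-bit int (GCC and Clang only promise this behavior with -fwrapv; standard C promises nothing at all here):

int main(void) {
    int x = 2147483647; /* INT_MAX for a 32-bit int */
    x = x + 1;          /* wraps to -2147483648 under two's complement */
    return x < 0;       /* 1 under wrapping semantics; undefined in standard C */
}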
That's just what I'm sure of off the top of my head. There are likely other lurking undefined behaviors; for example, what if you shift a 32-bit integer by 32 or more bits?
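For what it's worth, that one is also undefined, and hardware disagrees about what to do; x86, for instance, masks the shift count to its low five bits for 32-bit operands. A quick sketch, assuming a 32-bit int:

int main(void) {
    int x = 1;
    int n = 32;
    return x << n; /* undefined: the count must be less than the width of int */
}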
Reflection
Writing a C compiler (and writing about it) has so far proved to be an exceptional learning experience. Seeing the faults of C first-hand yields a much deeper understanding of the problems modern languages are trying to solve. Many of the lessons are surprisingly transferable to my career as a data scientist. Designing, manipulating, and pattern matching abstract syntax trees directly parallels working with decision trees, parsing nested JSON structures, and transforming data through pipeline stages. It's also quite helpful to have a strong understanding of what's happening under the hood when one needs to write performant code for handling big datasets.
What's next
For NCC, functions are the next frontier, bringing challenges like stack management, calling conventions, and the question of whether to maintain strict left-to-right evaluation across function boundaries. With that, NCC will be able to support basic standard library functions like putchar.
Storage-class specifiers (static, extern) will come next, which will finally let NCC compile programs with multiple translation units.
Once that's all complete, I'll write about the experience and lessons learned from this project before moving on to implementing more complex types.
Until then, keep an eye out for more technical deep dives like this. I believe my recent struggles with implementing switch statements (now fully supported in NCC!) deserve their own post.
Say Hello!
If you found this helpful, spotted something that could be improved, or just want to say thanks, I'd love to hear from you. Shoot me an email at [email protected]. Since this is a self-hosted blog, I don't have any good metrics of readership, so hearing from real people is the best way to know this content is reaching someone. If enough people are interested, I'll get a comment system going.