01 · wip · 2026

Bytecode VM in C

A from-scratch implementation of a tree-walk interpreter and bytecode VM for Lox. Follows Crafting Interpreters, but with a detour: jlox gets rewritten in C before touching clox, forcing every memory ownership decision to be explicit. No host GC underneath, no safety net.

"i dont know why im doing this"

4 pipeline stages: scan → parse → compile → execute
~600 lines so far: scanner + parser + AST
0 dependencies: pure C, standard library only
segfaults survived

Compilation pipeline

01

Scanner

Source → token stream. Lexemes are pointer + length into the source — no copies.
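
A minimal sketch of the pointer + length idea, not the project's actual scanner. The token kinds and the names `Scanner`, `scan_token`, `token_is`, and `print_token` are hypothetical, and only numbers, identifiers, and `+` are recognised. The point is that a `Token` borrows from the source buffer rather than owning a copy.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical token kinds; the project's real enum is not shown. */
typedef enum {
    TOKEN_NUMBER, TOKEN_IDENTIFIER, TOKEN_PLUS, TOKEN_ERROR, TOKEN_EOF
} TokenType;

typedef struct {
    TokenType   type;
    const char *start;   /* borrowed pointer into the source buffer */
    size_t      length;  /* lexeme length; there is no '\0' terminator */
} Token;

typedef struct {
    const char *current; /* next unread character of the source */
} Scanner;

void scanner_init(Scanner *s, const char *source) {
    assert(source != NULL);
    s->current = source;
}

Token scan_token(Scanner *s) {
    while (*s->current == ' ' || *s->current == '\t' || *s->current == '\n')
        s->current++;

    const char *start = s->current;
    unsigned char c = (unsigned char)*start;
    Token token = { TOKEN_ERROR, start, 0 };

    if (c == '\0') {
        token.type = TOKEN_EOF;            /* zero-length token at the end */
        return token;
    }
    if (isdigit(c)) {
        while (isdigit((unsigned char)*s->current)) s->current++;
        token.type = TOKEN_NUMBER;
    } else if (isalpha(c) || c == '_') {
        while (isalnum((unsigned char)*s->current) || *s->current == '_')
            s->current++;
        token.type = TOKEN_IDENTIFIER;
    } else {
        s->current++;                      /* single-character token */
        if (c == '+') token.type = TOKEN_PLUS;
    }
    token.length = (size_t)(s->current - start);
    return token;
}

/* Compare a lexeme against a C string without copying the lexeme. */
bool token_is(const Token *t, const char *text) {
    return strlen(text) == t->length && memcmp(t->start, text, t->length) == 0;
}

/* "%.*s" prints exactly `length` bytes, so no terminator is needed. */
void print_token(const Token *t) {
    printf("%d '%.*s'\n", (int)t->type, (int)t->length, t->start);
}
```

The trade-off is a lifetime rule: every token is only valid while the source buffer is alive, so the buffer has to outlive anything that still holds a `Token`.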

02

Parser

Recursive descent. Produces a tagged-union AST, heap-allocated.
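
Roughly what recursive descent over a heap-allocated tagged union looks like, as a sketch under assumptions. To stay short it reads characters directly instead of a token stream, handles only numbers, unary minus, `+ - * /`, and parentheses, and does no error reporting. `Node`, `Parser`, `parse_expression`, `eval`, and `free_node` are illustrative names, not the project's.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { NODE_NUMBER, NODE_UNARY, NODE_BINARY } NodeType;

typedef struct Node Node;
struct Node {
    NodeType type;                       /* tag: which union member is live */
    union {
        double number;
        struct { char op; Node *operand; } unary;
        struct { char op; Node *left; Node *right; } binary;
    } as;
};

typedef struct { const char *p; } Parser; /* cursor into the source text */

Node *parse_expression(Parser *ps);

static Node *new_node(NodeType type) {
    Node *n = calloc(1, sizeof *n);      /* every node lives on the heap */
    if (n == NULL) abort();
    n->type = type;
    return n;
}

static char peek(Parser *ps) {
    while (*ps->p == ' ') ps->p++;
    return *ps->p;
}

static Node *make_binary(char op, Node *left, Node *right) {
    Node *n = new_node(NODE_BINARY);
    n->as.binary.op = op;
    n->as.binary.left = left;
    n->as.binary.right = right;
    return n;
}

/* primary -> NUMBER | "(" expression ")" */
static Node *parse_primary(Parser *ps) {
    if (peek(ps) == '(') {
        ps->p++;
        Node *inner = parse_expression(ps);
        if (peek(ps) == ')') ps->p++;
        return inner;
    }
    Node *n = new_node(NODE_NUMBER);
    char *end;
    n->as.number = strtod(ps->p, &end);
    ps->p = end;
    return n;
}

/* unary -> "-" unary | primary */
static Node *parse_unary(Parser *ps) {
    if (peek(ps) == '-') {
        ps->p++;
        Node *n = new_node(NODE_UNARY);
        n->as.unary.op = '-';
        n->as.unary.operand = parse_unary(ps);
        return n;
    }
    return parse_primary(ps);
}

/* term -> unary (("*" | "/") unary)* */
static Node *parse_term(Parser *ps) {
    Node *left = parse_unary(ps);
    while (peek(ps) == '*' || peek(ps) == '/') {
        char op = *ps->p++;
        left = make_binary(op, left, parse_unary(ps));
    }
    return left;
}

/* expression -> term (("+" | "-") term)* */
Node *parse_expression(Parser *ps) {
    assert(ps != NULL && ps->p != NULL);
    Node *left = parse_term(ps);
    while (peek(ps) == '+' || peek(ps) == '-') {
        char op = *ps->p++;
        left = make_binary(op, left, parse_term(ps));
    }
    return left;
}

double eval(const Node *n) {
    switch (n->type) {
    case NODE_NUMBER: return n->as.number;
    case NODE_UNARY:  return -eval(n->as.unary.operand);
    case NODE_BINARY: {
        double l = eval(n->as.binary.left);
        double r = eval(n->as.binary.right);
        switch (n->as.binary.op) {
        case '+': return l + r;
        case '-': return l - r;
        case '*': return l * r;
        default:  return l / r;
        }
    }
    }
    return 0.0;
}

/* The tree owns its children, so freeing is a post-order walk. */
void free_node(Node *n) {
    if (n == NULL) return;
    if (n->type == NODE_UNARY) free_node(n->as.unary.operand);
    if (n->type == NODE_BINARY) {
        free_node(n->as.binary.left);
        free_node(n->as.binary.right);
    }
    free(n);
}
```

Each grammar rule becomes one function, and precedence falls out of which function calls which. Whoever receives the root from `parse_expression` owns the whole tree and is responsible for calling `free_node` on it.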

03

Compiler

Single-pass AST walk. Emits bytecode instructions into a dynamic array.
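
A sketch of the "dynamic array of bytecode" part, assuming a clox-style layout: one growable byte array for instructions and a separate growable array for constants. The names (`Chunk`, `chunk_write`, `emit_constant`), the doubling growth policy, and the one-byte constant index are assumptions for illustration; the compiler here is still in progress and may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum { OP_CONSTANT, OP_ADD, OP_RETURN } OpCode;

typedef struct {
    uint8_t *code;       size_t count;       size_t capacity;
    double  *constants;  size_t const_count; size_t const_capacity;
} Chunk;

void chunk_init(Chunk *c) {
    c->code = NULL;      c->count = 0;       c->capacity = 0;
    c->constants = NULL; c->const_count = 0; c->const_capacity = 0;
}

/* Append one byte, doubling the buffer when it is full. */
void chunk_write(Chunk *c, uint8_t byte) {
    if (c->count == c->capacity) {
        size_t cap = c->capacity < 8 ? 8 : c->capacity * 2;
        uint8_t *grown = realloc(c->code, cap);
        if (grown == NULL) abort();
        c->code = grown;
        c->capacity = cap;
    }
    c->code[c->count++] = byte;
}

/* Store a constant and return its index in the pool. */
size_t chunk_add_constant(Chunk *c, double value) {
    if (c->const_count == c->const_capacity) {
        size_t cap = c->const_capacity < 8 ? 8 : c->const_capacity * 2;
        double *grown = realloc(c->constants, cap * sizeof *grown);
        if (grown == NULL) abort();
        c->constants = grown;
        c->const_capacity = cap;
    }
    c->constants[c->const_count] = value;
    return c->const_count++;
}

/* OP_CONSTANT is followed by a one-byte index (an assumption of this sketch). */
void emit_constant(Chunk *c, double value) {
    size_t index = chunk_add_constant(c, value);
    assert(index <= UINT8_MAX);
    chunk_write(c, OP_CONSTANT);
    chunk_write(c, (uint8_t)index);
}

/* The chunk owns both arrays; freeing resets it to the empty state. */
void chunk_free(Chunk *c) {
    free(c->code);
    free(c->constants);
    chunk_init(c);
}
```

Compiling `1 + 2` with these helpers would be two `emit_constant` calls followed by `chunk_write(&chunk, OP_ADD)` and `chunk_write(&chunk, OP_RETURN)`, giving six bytes of code and two pooled constants.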

04

VM

Stack-based execution loop. Dispatches opcodes, manages the value stack.
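
Since the VM is still upcoming, this is only a sketch of the general shape: fetch a byte, switch on it, push and pop doubles. `VM`, `vm_run`, `RunResult`, and the fixed `STACK_MAX` are hypothetical, values are plain `double`, and `OP_RETURN` simply ends the run and hands back the top of the stack rather than unwinding a call frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    OP_CONSTANT, OP_ADD, OP_SUB, OP_MUL, OP_DIV, OP_NEGATE, OP_RETURN
} OpCode;

typedef enum { RUN_OK, RUN_BAD_OPCODE } RunResult;

#define STACK_MAX 256

typedef struct {
    const uint8_t *ip;            /* next instruction byte to execute */
    double  stack[STACK_MAX];
    double *top;                  /* one past the last pushed value */
} VM;

static void push(VM *vm, double value) {
    assert(vm->top < vm->stack + STACK_MAX);
    *vm->top++ = value;
}

static double pop(VM *vm) {
    assert(vm->top > vm->stack);
    return *--vm->top;
}

RunResult vm_run(VM *vm, const uint8_t *code, const double *constants,
                 double *result) {
    vm->ip = code;
    vm->top = vm->stack;
    for (;;) {
        uint8_t op = *vm->ip++;
        switch (op) {
        case OP_CONSTANT:
            push(vm, constants[*vm->ip++]);   /* operand: constant index */
            break;
        case OP_NEGATE:
            push(vm, -pop(vm));
            break;
        case OP_ADD: case OP_SUB: case OP_MUL: case OP_DIV: {
            double b = pop(vm);               /* right operand is on top */
            double a = pop(vm);
            push(vm, op == OP_ADD ? a + b
                   : op == OP_SUB ? a - b
                   : op == OP_MUL ? a * b
                   :                a / b);
            break;
        }
        case OP_RETURN:
            *result = pop(vm);
            return RUN_OK;
        default:
            return RUN_BAD_OPCODE;
        }
    }
}
```

The operand order in the binary case is the detail that is easy to get wrong: the right-hand value was pushed last, so it comes off the stack first.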

Design decisions

Why translate jlox to C first

The book moves from Java to C in one jump. Doing jlox → C manually makes every implicit Java behaviour a conscious decision: where does this list live? Who frees it? That friction is the entire point.

Tagged union vs void*

AST nodes use a tagged union — a type enum plus a union of possible payloads. More verbose than a class hierarchy, but completely transparent. You can read a switch and know exactly what is happening.

AST node

ast.h
struct Expr {
  ExprType type;           /* tag: says which member of `as` is valid */
  union {
    BinaryExpr   binary;
    GroupingExpr grouping;
    LiteralExpr  literal;
    UnaryExpr    unary;
  } as;                    /* read as expr->as.binary, expr->as.literal, ... */
};

No vtables. A type tag plus a union. Every traversal is a switch on expr->type.
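
To make "every traversal is a switch" concrete, here is a sketch of a Lisp-style printer over the struct above. The payload structs, the `EXPR_*` enumerator names, and the helpers `Out`, `out_str`, and `print_expr` are guesses for illustration, since the source only shows `struct Expr` itself; operators are single characters and literals are doubles to keep it small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct Expr Expr;
typedef enum { EXPR_BINARY, EXPR_GROUPING, EXPR_LITERAL, EXPR_UNARY } ExprType;

/* Hypothetical payloads; the real definitions are not shown in the source. */
typedef struct { Expr *left; char op; Expr *right; } BinaryExpr;
typedef struct { Expr *inner; } GroupingExpr;
typedef struct { double value; } LiteralExpr;
typedef struct { char op; Expr *operand; } UnaryExpr;

struct Expr {
    ExprType type;
    union {
        BinaryExpr   binary;
        GroupingExpr grouping;
        LiteralExpr  literal;
        UnaryExpr    unary;
    } as;
};

typedef struct { char *p; char *end; } Out;  /* bounded output cursor */

/* Append s, silently truncating if the buffer is full. */
static void out_str(Out *o, const char *s) {
    size_t room = (size_t)(o->end - o->p);
    size_t n = strlen(s);
    if (n > room) n = room;
    memcpy(o->p, s, n);
    o->p += n;
}

static void print_node(const Expr *expr, Out *o) {
    assert(expr != NULL);
    char tmp[32];
    switch (expr->type) {
    case EXPR_LITERAL:
        snprintf(tmp, sizeof tmp, "%g", expr->as.literal.value);
        out_str(o, tmp);
        break;
    case EXPR_GROUPING:
        out_str(o, "(group ");
        print_node(expr->as.grouping.inner, o);
        out_str(o, ")");
        break;
    case EXPR_UNARY:
        snprintf(tmp, sizeof tmp, "(%c ", expr->as.unary.op);
        out_str(o, tmp);
        print_node(expr->as.unary.operand, o);
        out_str(o, ")");
        break;
    case EXPR_BINARY:
        snprintf(tmp, sizeof tmp, "(%c ", expr->as.binary.op);
        out_str(o, tmp);
        print_node(expr->as.binary.left, o);
        out_str(o, " ");
        print_node(expr->as.binary.right, o);
        out_str(o, ")");
        break;
    }
}

/* Writes a NUL-terminated rendering of expr into buf (at most cap bytes). */
void print_expr(const Expr *expr, char *buf, size_t cap) {
    if (cap == 0) return;
    Out o = { buf, buf + cap - 1 };   /* reserve one byte for the '\0' */
    print_node(expr, &o);
    *o.p = '\0';
}
```

Adding a new node kind means adding an enumerator, a union member, and one `case` per traversal. With no `default:` branch, compilers that warn on unhandled enum values in a switch will flag the traversals that were missed.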

Opcode set (partial)

OP_CONSTANT                 Push a constant value onto the stack
OP_ADD / SUB / MUL / DIV    Binary arithmetic: pop two, push result
OP_NEGATE                   Unary negation: pop one, push negated
OP_RETURN                   Return from current call frame
OP_PRINT                    Pop and print the top of stack
OP_JUMP_IF_FALSE            Conditional branch for if/while
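
OP_JUMP_IF_FALSE is the one opcode above whose operand is not known when it is emitted, because the jump distance depends on code that has not been compiled yet. One common way to handle that is backpatching: emit a placeholder, compile the body, then go back and fill in the distance. The sketch below assumes a 16-bit big-endian offset and a fixed-size buffer; the encoding, the opcode values, and the names `emit_jump`, `patch_jump`, and `jump_target` are assumptions, not something the source specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical numeric values, for illustration only. */
enum { OP_PRINT = 0x10, OP_JUMP_IF_FALSE = 0x11 };

typedef struct {
    uint8_t code[256];   /* fixed-size buffer to keep the sketch short */
    size_t  count;
} Code;

void emit_byte(Code *c, uint8_t byte) {
    assert(c->count < sizeof c->code);
    c->code[c->count++] = byte;
}

/* Emit a jump with a placeholder operand; return where the operand lives. */
size_t emit_jump(Code *c, uint8_t op) {
    emit_byte(c, op);
    emit_byte(c, 0xff);
    emit_byte(c, 0xff);
    return c->count - 2;
}

/* Fill in the distance from just past the operand to the current end. */
void patch_jump(Code *c, size_t operand_at) {
    size_t distance = c->count - operand_at - 2;
    assert(distance <= UINT16_MAX);
    c->code[operand_at]     = (uint8_t)(distance >> 8);
    c->code[operand_at + 1] = (uint8_t)(distance & 0xff);
}

/* What the VM would compute when the condition is false. */
size_t jump_target(const Code *c, size_t operand_at) {
    size_t distance = ((size_t)c->code[operand_at] << 8) | c->code[operand_at + 1];
    return operand_at + 2 + distance;
}
```

For an `if`, the compiler would call `emit_jump` after the condition, compile the then-branch, and call `patch_jump` once it knows where the branch ends. A `while` loop additionally needs a backward jump to the condition, which this sketch leaves out.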

Java handles memory, strings, and exceptions quietly. C does none of that. When you write a scanner in Java you return an ArrayList and move on. In C you decide where the list lives, who owns it, and when it gets freed.

Progress

Apr 2026

Scanner complete

Token stream, whitespace handling, string/number literals.

Apr 2026

Parser + AST

Recursive descent, tagged-union nodes, printer passes.

May 2026

Compiler (in progress)

Single-pass bytecode emission for expressions.

upcoming

VM execution loop

upcoming

Variables + scoping

upcoming

GC