Bytecode VM in C
A from-scratch implementation of a tree-walk interpreter and bytecode VM for Lox. Follows Crafting Interpreters — but with a detour: jlox gets rewritten in C before touching clox, forcing every memory ownership decision to be explicit. No GC, no safety net.
"i dont know why im doing this"
Compilation pipeline
Scanner
Source → token stream. Lexemes are pointer + length into the source — no copies.
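A minimal sketch of the pointer + length idea, not the project's actual scanner. The names Token, make_token, and lexeme_equals, as well as the token kinds, are assumptions made for illustration. The token borrows from the source buffer, so the source must outlive every token.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds, for illustration only. */
typedef enum { TOKEN_NUMBER, TOKEN_IDENTIFIER, TOKEN_EOF } TokenType;

typedef struct {
    TokenType type;
    const char *start;  /* points into the source buffer; not owned */
    size_t length;      /* lexeme length in bytes; no NUL terminator */
    int line;
} Token;

/* Build a token that borrows [start, start + length) from the source. */
static Token make_token(TokenType type, const char *start, size_t length, int line) {
    Token t = { type, start, length, line };
    return t;
}

/* Compare a lexeme against a NUL-terminated string without copying it. */
static int lexeme_equals(const Token *t, const char *text) {
    return strlen(text) == t->length && memcmp(t->start, text, t->length) == 0;
}
```

Because lexemes are not NUL-terminated, anything that prints or compares them has to carry the length along, for example printf("%.*s", (int)t.length, t.start).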
Parser
Recursive descent. Produces a tagged-union AST, heap-allocated.
Compiler
Single-pass AST walk. Emits bytecode instructions into a dynamic array.
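One way the dynamic array for bytecode could look, as a sketch. Chunk, chunk_init, chunk_write, and chunk_free are assumed names (borrowed from the book's vocabulary), and the doubling growth policy with a minimum of 8 bytes is an assumption, not something this project states.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint8_t *code;    /* owned heap buffer holding the emitted bytes */
    size_t count;     /* bytes in use */
    size_t capacity;  /* bytes allocated */
} Chunk;

static void chunk_init(Chunk *c) {
    c->code = NULL;
    c->count = 0;
    c->capacity = 0;
}

/* Append one byte, doubling the buffer when it is full. */
static void chunk_write(Chunk *c, uint8_t byte) {
    if (c->count == c->capacity) {
        size_t new_cap = c->capacity < 8 ? 8 : c->capacity * 2;
        uint8_t *grown = realloc(c->code, new_cap);
        if (grown == NULL) abort();  /* out of memory: give up in this sketch */
        c->code = grown;
        c->capacity = new_cap;
    }
    c->code[c->count++] = byte;
}

/* Release the buffer and reset the chunk to its empty state. */
static void chunk_free(Chunk *c) {
    free(c->code);
    chunk_init(c);
}
```

The chunk owns its buffer, so whoever calls chunk_init is also responsible for calling chunk_free.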
VM
Stack-based execution loop. Dispatches opcodes, manages the value stack.
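A rough sketch of a stack-based dispatch loop. The opcodes here (OP_CONSTANT, OP_ADD, OP_MULTIPLY, OP_NEGATE, OP_RETURN) are hypothetical and do not claim to match this project's opcode set; the VM struct, vm_run, and the fixed STACK_MAX are likewise assumptions. Values are plain doubles, and there is no stack overflow or underflow checking.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, for illustration only. */
enum { OP_CONSTANT, OP_ADD, OP_MULTIPLY, OP_NEGATE, OP_RETURN };

#define STACK_MAX 256

typedef struct {
    const uint8_t *ip;        /* next byte to execute */
    const double *constants;  /* constant pool, indexed by a one-byte operand */
    double stack[STACK_MAX];
    double *top;              /* one past the last pushed value */
} VM;

static void push(VM *vm, double v) { *vm->top++ = v; }
static double pop(VM *vm) { return *--vm->top; }

/* Run until OP_RETURN and hand back the value on top of the stack. */
static double vm_run(const uint8_t *code, const double *constants) {
    VM vm;
    vm.ip = code;
    vm.constants = constants;
    vm.top = vm.stack;
    for (;;) {
        uint8_t op = *vm.ip++;
        switch (op) {
        case OP_CONSTANT: push(&vm, vm.constants[*vm.ip++]); break;
        case OP_ADD:      { double b = pop(&vm); double a = pop(&vm); push(&vm, a + b); break; }
        case OP_MULTIPLY: { double b = pop(&vm); double a = pop(&vm); push(&vm, a * b); break; }
        case OP_NEGATE:   push(&vm, -pop(&vm)); break;
        case OP_RETURN:   return pop(&vm);
        default:          return 0.0;  /* unknown opcode: bail out in this sketch */
        }
    }
}
```

With constants { 1.0, 2.0, 3.0 }, the byte sequence CONSTANT 0, CONSTANT 1, CONSTANT 2, MULTIPLY, ADD, RETURN computes 1 + 2 * 3. The right operand is popped first because it was pushed last.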
Design decisions
Why translate jlox to C first
The book moves from Java to C in one jump. Doing jlox → C manually makes every implicit Java behaviour a conscious decision: where does this list live? Who frees it? That friction is the entire point.
Tagged union vs void*
AST nodes use a tagged union — a type enum plus a union of possible payloads. More verbose than a class hierarchy, but completely transparent. You can read a switch and know exactly what is happening.
AST node
struct Expr {
    ExprType type;
    union {
        BinaryExpr binary;
        GroupingExpr grouping;
        LiteralExpr literal;
        UnaryExpr unary;
    } as;
};
No vtables. A type tag plus a union. Every traversal is a switch on expr->type.
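To make the "every traversal is a switch" point concrete, here is a sketch of an evaluator and a matching destructor over heap-allocated nodes. The payload struct fields, the EXPR_* tag names, and the helpers new_literal, new_binary, new_unary, eval, and free_expr are all assumptions; the project's real payloads are not shown. Operators are single characters and values are doubles to keep it short.

```c
#include <stdlib.h>

typedef enum { EXPR_BINARY, EXPR_GROUPING, EXPR_LITERAL, EXPR_UNARY } ExprType;
typedef struct Expr Expr;

/* Hypothetical payloads; each child pointer is owned by its parent node. */
typedef struct { Expr *left; char op; Expr *right; } BinaryExpr;
typedef struct { Expr *inner; } GroupingExpr;
typedef struct { double value; } LiteralExpr;
typedef struct { char op; Expr *operand; } UnaryExpr;

struct Expr {
    ExprType type;
    union {
        BinaryExpr binary;
        GroupingExpr grouping;
        LiteralExpr literal;
        UnaryExpr unary;
    } as;
};

static Expr *new_expr(ExprType type) {
    Expr *e = malloc(sizeof *e);
    if (e == NULL) abort();
    e->type = type;
    return e;
}

static Expr *new_literal(double value) {
    Expr *e = new_expr(EXPR_LITERAL);
    e->as.literal.value = value;
    return e;
}

static Expr *new_binary(Expr *left, char op, Expr *right) {
    Expr *e = new_expr(EXPR_BINARY);
    e->as.binary.left = left;
    e->as.binary.op = op;
    e->as.binary.right = right;
    return e;
}

static Expr *new_unary(char op, Expr *operand) {
    Expr *e = new_expr(EXPR_UNARY);
    e->as.unary.op = op;
    e->as.unary.operand = operand;
    return e;
}

/* Evaluate by switching on the tag; only numeric operators are handled. */
static double eval(const Expr *expr) {
    switch (expr->type) {
    case EXPR_LITERAL:  return expr->as.literal.value;
    case EXPR_GROUPING: return eval(expr->as.grouping.inner);
    case EXPR_UNARY: {
        double v = eval(expr->as.unary.operand);
        return expr->as.unary.op == '-' ? -v : v;
    }
    case EXPR_BINARY: {
        double l = eval(expr->as.binary.left);
        double r = eval(expr->as.binary.right);
        switch (expr->as.binary.op) {
        case '+': return l + r;
        case '-': return l - r;
        case '*': return l * r;
        default:  return l / r;
        }
    }
    }
    return 0.0;  /* unreachable for valid tags */
}

/* Free a node and everything it owns, again by switching on the tag. */
static void free_expr(Expr *expr) {
    if (expr == NULL) return;
    switch (expr->type) {
    case EXPR_BINARY:
        free_expr(expr->as.binary.left);
        free_expr(expr->as.binary.right);
        break;
    case EXPR_GROUPING: free_expr(expr->as.grouping.inner); break;
    case EXPR_UNARY:    free_expr(expr->as.unary.operand); break;
    case EXPR_LITERAL:  break;
    }
    free(expr);
}
```

The destructor mirrors the evaluator: the same switch that says how to read a node also says what it owns.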
Opcode set (partial)
Java handles memory, strings, and exceptions quietly. C does none of that. When you write a scanner in Java you return an ArrayList and move on. In C you decide where the list lives, who owns it, and when it gets freed.
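As an illustration of those three decisions, here is a sketch of a function that returns a growable list in C. Slice, SliceList, scan_words, and slice_list_free are hypothetical names, and splitting on spaces stands in for real scanning. The ownership rules live in the comments: the caller owns the list, the list owns its array, and the slices only borrow the source.

```c
#include <stddef.h>
#include <stdlib.h>

/* A view into the source text; it owns nothing. */
typedef struct { const char *start; size_t length; } Slice;

typedef struct {
    Slice *items;     /* owned heap array, released by slice_list_free */
    size_t count;
    size_t capacity;
} SliceList;

static void slice_list_push(SliceList *list, Slice s) {
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity < 8 ? 8 : list->capacity * 2;
        Slice *grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL) abort();
        list->items = grown;
        list->capacity = new_cap;
    }
    list->items[list->count++] = s;
}

/* Split on spaces. The caller owns the returned list and must call
 * slice_list_free on it; `source` must outlive the list. */
static SliceList scan_words(const char *source) {
    SliceList list = { NULL, 0, 0 };
    const char *p = source;
    while (*p != '\0') {
        while (*p == ' ') p++;                   /* skip separators */
        const char *start = p;
        while (*p != '\0' && *p != ' ') p++;     /* consume one word */
        if (p > start) {
            Slice s = { start, (size_t)(p - start) };
            slice_list_push(&list, s);
        }
    }
    return list;
}

static void slice_list_free(SliceList *list) {
    free(list->items);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

The struct itself is returned by value and can live on the caller's stack; only the items array is on the heap.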
Progress
Scanner complete
Token stream, whitespace handling, string/number literals.
Parser + AST
Recursive descent, tagged-union nodes, printer passes.
Compiler (in progress)
Single-pass bytecode emission for expressions.
VM execution loop
Variables + scoping
GC