Add learning tools: Anki deck, IR quiz, reading order, trace exercises

tools/anki_gen.py: generates 389-card Anki deck (.apkg) from hand-crafted
YAML + auto-parsed source (IrOp variants, memory constants, error types,
peephole patterns, primitive registrations, boot.fth defs, Runtime trait).

tools/anki_data.yaml: 71 hand-crafted cards covering architecture, design
decisions, ForthVM internals, codegen, optimizer, boot.fth, control flow,
Runtime trait, and testing infrastructure.

tools/ir_quiz.py: interactive terminal quiz (41 exercises) — predict
optimized IR for Forth code (constant fold, peephole, strength reduce,
DCE, tail call, inlining).

tools/reading_order.md: guided 23-step codebase reading sequence.
tools/trace_exercises.md: 20 trace-the-compilation exercises with answers.
tools/architecture.txt: single-page ASCII system reference.
2026-04-13 10:52:47 +02:00
parent 73bcee960b
commit ea34b7cb52
7 changed files with 2711 additions and 0 deletions
+718
@@ -0,0 +1,718 @@
# WAFER Anki Card Data
# Hand-crafted cards for architecture, design decisions, and "why" questions.
# Auto-generated cards (IrOp variants, memory addresses, etc.) are created by anki_gen.py.
# ============================================================================
# CATEGORY A: Architecture
# ============================================================================
architecture:
- front: "What are the 5 stages of the WAFER compilation pipeline?"
back: "Forth Source → Outer Interpreter (tokenize + dispatch) → IR (Vec<IrOp>) → Optimizer (6 passes) → WASM Codegen (wasm-encoder) → wasmtime execution"
code: |
// lib.rs doc comment:
// Forth Source -> Outer Interpreter -> IR -> Optimize -> WASM Codegen
source: "crates/core/src/lib.rs:9"
tags: [architecture, basic]
- front: "What crate does WAFER use to generate WASM bytecode?"
back: "`wasm-encoder` — builds WASM modules programmatically (types, imports, functions, code sections). NOT a text-format assembler."
source: "crates/core/src/codegen.rs:12"
tags: [architecture, basic]
- front: "What crate does WAFER use to execute WASM modules?"
back: "`wasmtime` — Bytecode Alliance WASM runtime. Provides Engine, Store, Module, Instance, Memory, Global, Table, Func."
source: "crates/core/src/outer.rs:14"
tags: [architecture, basic]
- front: "How many source files are in wafer-core? Name them."
back: "13 files: outer.rs (9820), codegen.rs (4205), optimizer.rs (1013), dictionary.rs (906), export.rs (409), runner.rs (402), ir.rs (259), consolidate.rs (169), memory.rs (148), error.rs (84), config.rs (61), lib.rs (28), js_loader.rs (163)"
tags: [architecture, basic]
- front: "What is the relationship between a Forth word and a WASM module in WAFER?"
back: "Each compiled Forth word becomes its own WASM module with one function. Modules share memory, globals (dsp/rsp/fsp), and a function table via wasmtime imports. Words call each other via call_indirect through the shared table."
tags: [architecture, intermediate]
- front: "What are the 6 imports every WAFER WASM module expects?"
back: |
1. emit (func: i32 → void) — character output callback
2. memory (16 pages = 1 MiB) — shared linear memory
3. dsp (global mut i32) — data stack pointer
4. rsp (global mut i32) — return stack pointer
5. fsp (global mut i32) — float stack pointer
6. table (funcref) — shared function table for call_indirect
source: "crates/core/src/codegen.rs:25-41"
tags: [architecture, intermediate]
- front: "What are the 3 types of words in WAFER?"
back: |
1. IR primitives — compiled to WASM via Vec<IrOp>, inlineable (DUP, +, @)
2. Host functions — Rust closures in wasmtime, NOT inlineable (., .S, M*, ACCEPT)
3. Forth-defined words — compiled by outer interpreter (: SQUARE DUP * ;)
source: "crates/core/src/outer.rs:2422-2478"
tags: [architecture, basic]
- front: "What is the role of outer.rs in WAFER?"
back: "Contains ForthVM<R: Runtime> — the complete Forth virtual machine, generic over execution backend. Outer interpreter (tokenize → lookup → interpret/compile), all primitive registration, control-flow compilation, DOES> support. 8703 lines."
source: "crates/core/src/outer.rs:1"
tags: [architecture, basic]
- front: "What is the Runtime trait and why does it exist?"
back: |
Defined in runtime.rs. Abstracts over WASM execution backends:
- Memory r/w (mem_read_i32, mem_write_slice, etc.)
- Globals (get/set_dsp, rsp, fsp)
- Table (table_size, ensure_table_size)
- Module lifecycle (instantiate_and_install, call_func)
- Host functions (register_host_func)
Two implementations: NativeRuntime (wasmtime), WebRuntime (js-sys).
ForthVM<R: Runtime> is completely decoupled from any specific WASM engine.
source: "crates/core/src/runtime.rs:72"
tags: [architecture, intermediate]
- front: "What is HostAccess and how is it used?"
back: |
Trait for memory/global access from within host function callbacks.
Methods: mem_read_i32, mem_write_i32, mem_read_u8, mem_write_u8, mem_read_slice, mem_write_slice, mem_len, get/set_dsp, get/set_rsp, get/set_fsp, call_func.
NativeRuntime implements it via CallerHostAccess (wrapping wasmtime::Caller).
WebRuntime implements it via WebHostAccess (wrapping js_sys typed arrays).
HostFn = Box<dyn Fn(&mut dyn HostAccess) -> Result<()>> — same closure works on both runtimes.
source: "crates/core/src/runtime.rs:17"
tags: [architecture, intermediate]
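# Sketch (not from the WAFER source): how one HostFn closure can serve any
# backend through a HostAccess trait object. The trait below is a hypothetical,
# trimmed-down version with four methods; the error type (String) and MockHost
# are assumptions made only for this illustration.

```rust
// Hypothetical, reduced HostAccess; the real trait has more methods and
// its exact signatures may differ.
trait HostAccess {
    fn mem_read_i32(&self, addr: u32) -> i32;
    fn mem_write_i32(&mut self, addr: u32, value: i32);
    fn get_dsp(&self) -> u32;
    fn set_dsp(&mut self, dsp: u32);
}

// The closure only sees the trait object, so it is backend-agnostic.
type HostFn = Box<dyn Fn(&mut dyn HostAccess) -> Result<(), String>>;

// Stand-in backend for illustration: a byte vector plus a stack pointer.
struct MockHost {
    mem: Vec<u8>,
    dsp: u32,
}

impl HostAccess for MockHost {
    fn mem_read_i32(&self, addr: u32) -> i32 {
        let a = addr as usize;
        i32::from_le_bytes([self.mem[a], self.mem[a + 1], self.mem[a + 2], self.mem[a + 3]])
    }
    fn mem_write_i32(&mut self, addr: u32, value: i32) {
        let a = addr as usize;
        self.mem[a..a + 4].copy_from_slice(&value.to_le_bytes());
    }
    fn get_dsp(&self) -> u32 {
        self.dsp
    }
    fn set_dsp(&mut self, dsp: u32) {
        self.dsp = dsp;
    }
}

// A host word that negates the top of the data stack in place.
fn negate_top() -> HostFn {
    Box::new(|host: &mut dyn HostAccess| {
        let sp = host.get_dsp();
        let top = host.mem_read_i32(sp);
        host.mem_write_i32(sp, top.wrapping_neg());
        Ok(())
    })
}
```

# Any other type implementing the same trait could run negate_top() unchanged.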
- front: "What are the 3 crates in the WAFER workspace?"
back: |
1. wafer-core — compiler, optimizer, codegen, dictionary, Runtime trait, outer interpreter
Feature flags: default=["native"], "native" enables wasmtime
2. wafer (cli) — CLI REPL (rustyline), wafer build/run commands
3. wafer-web — browser REPL (wasm-bindgen + WebRuntime + HTML/CSS/JS frontend)
source: "Cargo.toml workspace"
tags: [architecture, basic]
- front: "What is NativeRuntime?"
back: |
Wasmtime-based implementation of Runtime trait (runtime_native.rs, 328 lines).
Owns: Engine, Store<NativeVmHost>, Memory, Table, Globals (dsp/rsp/fsp), emit_func.
instantiate_and_install: Module::new + Instance::new with 6 imports.
register_host_func: creates wasmtime Func that bridges HostFn → CallerHostAccess.
Behind "native" feature flag.
source: "crates/core/src/runtime_native.rs:107"
tags: [architecture, intermediate]
- front: "What is WebRuntime?"
back: |
Browser-based implementation of Runtime trait (crates/web/src/runtime_web.rs, 542 lines).
Uses js_sys::WebAssembly for module instantiation.
Memory: JsValue wrapping WebAssembly.Memory, accessed via Int32Array/Uint8Array views.
Globals: JsValue wrapping WebAssembly.Global objects.
Host functions: JS closures created via Closure::wrap, stored in _closures Vec to prevent GC.
Runs entirely in the browser — no wasmtime dependency.
source: "crates/web/src/runtime_web.rs:12"
tags: [architecture, intermediate]
- front: "What is WaferRepl?"
back: |
The wasm-bindgen entry point in crates/web/src/lib.rs.
Wraps ForthVM<WebRuntime>.
Methods: new() (create VM), evaluate(input) (returns output), data_stack(), is_compiling(), reset().
Built with: wasm-pack build --target web --out-dir www/pkg
source: "crates/web/src/lib.rs:13"
tags: [architecture, intermediate]
- front: "What feature flags does wafer-core have?"
back: |
default = ["native"]
native — enables dep:wasmtime and all native-only modules:
runtime_native.rs, runner.rs, export.rs, consolidate.rs, js_loader.rs
Without "native" — pure Rust only: dictionary, IR, optimizer, codegen, outer interpreter, runtime trait.
wafer-web uses wafer-core without "native" feature.
source: "crates/core/Cargo.toml:11-12"
tags: [architecture, intermediate]
- front: "What is the role of codegen.rs?"
back: "Translates optimized IR (Vec<IrOp>) to WASM bytecode using wasm-encoder. Handles DSP caching, scratch locals, stack-to-local promotion. Builds complete WASM modules including imports, types, function sections."
source: "crates/core/src/codegen.rs:1"
tags: [architecture, basic]
- front: "What is batch_mode in ForthVM and why does it exist?"
back: "During boot (register_primitives), batch_mode=true defers WASM compilation. All ~40 IR primitives are collected, then compiled into a single WASM module via compile_batch(). This amortizes runtime compilation overhead — one rt.instantiate_and_install() instead of 40."
code: |
self.batch_mode = true;
// ... register all primitives ...
// deferred_ir.push((word_id, ir_body));
self.compile_batch()?;
source: "crates/core/src/outer.rs"
tags: [architecture, advanced]
# ============================================================================
# CATEGORY B: Design Decisions ("Why" cards)
# ============================================================================
design_decisions:
- front: "Why does each Forth word compile to its own WASM module?"
back: |
1. Incremental compilation: defining a new word doesn't recompile anything
2. Isolation: each word is independently validated by wasmtime
3. wasmtime linking model: modules share imports (memory, globals, table)
4. REPL-friendly: immediate feedback, no whole-program recompile
Trade-off: call_indirect overhead between words (mitigated by CONSOLIDATE)
tags: [design, advanced]
- front: "Why use an IR instead of compiling Forth directly to WASM?"
back: |
1. Optimization: IR enables peephole, constant folding, inlining, DCE, tail calls
2. Separation of concerns: outer interpreter doesn't need to know WASM encoding
3. Portability: IR could target other backends
4. Testability: IR is easy to inspect and test
5. Consolidation: IR bodies are stored for later recompilation into single module
tags: [design, advanced]
- front: "Why does WAFER use wasm-encoder + wasmtime instead of Cranelift directly?"
back: |
1. Standard WASM: output is valid .wasm, can run in browsers AND natively
2. Runtime abstraction: same WASM bytes work on wasmtime (CLI) and browser (js-sys)
3. wasmtime handles Cranelift internally — best JIT performance for free
4. Portability: WASM is platform-independent
5. Validation: wasmtime validates modules, catching codegen bugs
6. wasm-encoder is simple: just build bytes, no complex IR
tags: [design, advanced]
- front: "Why was the Runtime trait introduced?"
back: |
To support the browser REPL without duplicating the entire ForthVM.
Before: ForthVM directly owned wasmtime types (Engine, Store, Memory, etc.)
After: ForthVM<R: Runtime> is generic — same compiler code works with:
- NativeRuntime (wasmtime) for CLI/tests/AOT
- WebRuntime (js-sys) for browser
Host functions use HostFn = Box<dyn Fn(&mut dyn HostAccess)> — one closure definition serves both runtimes.
The refactor extracted ~1100 lines of wasmtime-specific code from outer.rs into runtime_native.rs.
tags: [design, advanced]
- front: "Why are stacks in linear memory instead of WASM locals?"
back: |
Default: stacks live in linear memory (data stack grows down from DATA_STACK_TOP = 0x1600).
Reason: Forth semantics require stack introspection (DEPTH, PICK, SP@), which WASM locals can't provide.
Optimization: stack-to-local promotion lifts values into WASM locals when the compiler can prove stack depth is statically known (no calls, no SP@). Best of both worlds.
tags: [design, advanced]
- front: "Why is the DSP cached in a WASM local?"
back: |
The data stack pointer (dsp) is a WASM global, and global access is slower than local access. So:
1. At function entry: global.get $dsp → local.set $cached_dsp
2. During function: all stack ops use local $cached_dsp
3. Before calls: write back local → global (callee needs correct dsp)
4. After calls: reload local from global (callee may have changed dsp)
5. At function exit: write back
Net effect: most operations avoid global access overhead.
code: |
const CACHED_DSP_LOCAL: u32 = 0;
fn dsp_writeback(f) { local.get 0; global.set $dsp }
fn dsp_reload(f) { global.get $dsp; local.set 0 }
source: "crates/core/src/codegen.rs:56-181"
tags: [design, advanced]
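# Sketch (a toy model, not WAFER's codegen): the five caching steps above,
# with a counter showing how few global accesses remain. Machine, run_word and
# drop_one are hypothetical names introduced for this illustration.

```rust
// Toy model of DSP caching; it simulates the access pattern only.
struct Machine {
    dsp_global: u32,      // the shared `dsp` global
    global_accesses: u32, // how often the global was read or written
}

impl Machine {
    fn global_get(&mut self) -> u32 {
        self.global_accesses += 1;
        self.dsp_global
    }
    fn global_set(&mut self, value: u32) {
        self.global_accesses += 1;
        self.dsp_global = value;
    }
}

const CELL: u32 = 4;

// A word body that pushes `pushes` cells, calls another word, then returns.
fn run_word(m: &mut Machine, pushes: u32, callee: fn(&mut Machine)) {
    let mut cached_dsp = m.global_get(); // 1. function entry: fill the cache
    for _ in 0..pushes {
        cached_dsp -= CELL;              // 2. stack ops touch only the local
    }
    m.global_set(cached_dsp);            // 3. write back before the call
    callee(m);
    cached_dsp = m.global_get();         // 4. reload after the call
    m.global_set(cached_dsp);            // 5. write back at function exit
}

// Callee that drops one cell by adjusting the global directly.
fn drop_one(m: &mut Machine) {
    let sp = m.global_get();
    m.global_set(sp + CELL);
}
```

# In this model the caller touches the global four times no matter how many
# pushes it performs; without the cache every push would read and write it.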
- front: "Why does boot.fth replace Rust host functions with Forth definitions?"
back: |
1. Self-hosting goal: maximize Forth, minimize Rust
2. Performance: compiled Forth with inlining + optimization beats host function dispatch (call_indirect → Rust closure has overhead)
3. Inlinability: Forth definitions have IR bodies that the optimizer can inline; host functions cannot be inlined
4. Consolidation: Forth words participate in single-module recompilation (direct calls); host functions always use call_indirect
source: "crates/core/boot.fth:1-3"
tags: [design, advanced]
- front: "Why does WAFER use -1 (all bits set) for TRUE instead of 1?"
back: |
Forth 2012 standard: TRUE = -1 (0xFFFFFFFF). All bits set.
Reason: allows bitwise AND as a conditional select: flag AND value.
If TRUE were 1, AND would only preserve the lowest bit.
With -1: TRUE AND x = x (identity). FALSE AND x = 0.
code: |
// In codegen, bool_to_forth_flag:
// 0 - result: if result=1 => -1, if result=0 => 0
f.instruction(&I32Const(0));
f.instruction(&LocalGet(tmp));
f.instruction(&I32Sub);
source: "crates/core/src/codegen.rs:214-222"
tags: [design, intermediate]
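# Sketch: the flag conversion and the AND-as-select property from the card
# above, written as plain Rust arithmetic. select_if is a hypothetical helper
# name; only the "0 - result" conversion comes from the card.

```rust
// WASM comparisons yield 0 or 1; Forth wants 0 or -1 (all bits set).
fn bool_to_forth_flag(wasm_bool: i32) -> i32 {
    0i32.wrapping_sub(wasm_bool)
}

// With an all-bits-set TRUE, bitwise AND acts as a branch-free select.
fn select_if(flag: i32, value: i32) -> i32 {
    flag & value
}
```

# select_if(-1, x) returns x and select_if(0, x) returns 0, whereas a TRUE of 1
# would keep only the lowest bit of x.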
- front: "Why does the optimizer run peephole 5 times across the pipeline?"
back: |
Each optimization pass can create new peephole opportunities:
- After inline: inlined body may have adjacent ops that simplify
- After constant fold: folded constants may create identity patterns (PushI32(0), Add)
- After strength reduce: new patterns from reduced ops
- After DCE: dead code removal may leave adjacent simplifiable ops
The peephole pass itself runs to fixpoint (inner loop), but the outer pipeline runs it 5 times at different stages.
code: |
// Phase 1: peephole → fold → strength → peephole
// Phase 2: inline → peephole → fold → strength → peephole
// Phase 3: dce → peephole
// Phase 4: tail_call
source: "crates/core/src/optimizer.rs:37-85"
tags: [design, advanced]
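# Sketch (assumed shapes, not the real optimizer): why a peephole pass is run
# to fixpoint. Op is a hypothetical four-variant stand-in for IrOp, and the two
# rewrite rules are chosen only to show one removal exposing another.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Op {
    PushI32(i32),
    Add,
    Dup,
    Drop,
}

// One sweep: delete `PushI32(0), Add` and `Dup, Drop` pairs.
// Returns the rewritten ops and whether anything changed.
fn peephole_once(ops: &[Op]) -> (Vec<Op>, bool) {
    let mut out = Vec::with_capacity(ops.len());
    let mut changed = false;
    let mut i = 0;
    while i < ops.len() {
        match (&ops[i], ops.get(i + 1)) {
            (Op::PushI32(0), Some(Op::Add)) | (Op::Dup, Some(Op::Drop)) => {
                changed = true;
                i += 2; // skip both ops of the matched pair
            }
            (op, _) => {
                out.push(op.clone());
                i += 1;
            }
        }
    }
    (out, changed)
}

// Inner loop: repeat sweeps until one makes no change.
fn peephole(mut ops: Vec<Op>) -> Vec<Op> {
    loop {
        let (next, changed) = peephole_once(&ops);
        ops = next;
        if !changed {
            return ops;
        }
    }
}
```

# For Dup, PushI32(0), Add, Drop the first sweep leaves Dup, Drop, and only
# the second sweep removes that pair.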
- front: "Why is tail call detection the LAST optimizer pass?"
back: |
1. TailCall emits WASM `return` after the call — if inlining converts TailCall back to Call (detailcall), early tail-call detection is wasted
2. DCE might eliminate the tail position entirely
3. Need return-stack balance check on FINAL IR, not intermediate
4. Inlining must happen first so we know which calls remain
source: "crates/core/src/optimizer.rs:79-84"
tags: [design, advanced]
- front: "Why can't words with Exit be inlined?"
back: |
WASM `return` exits the CURRENT function. If an inlined word contains Exit (→ return), it would exit the CALLER's function, not just the inlined code. There's no 'return from inline' in WASM. The contains_exit() guard prevents this.
code: |
fn contains_exit(ops: &[IrOp]) -> bool {
// Simplified: also blocks ForthLocalGet/Set, which would collide with the caller's locals
ops.iter().any(|op| matches!(op, IrOp::Exit | IrOp::ForthLocalGet(_) | IrOp::ForthLocalSet(_)))
}
source: "crates/core/src/optimizer.rs:633-664"
tags: [design, advanced]
- front: "Why does CONSOLIDATE exist?"
back: |
Normal JIT: each word = separate module, calls via call_indirect (table lookup).
CONSOLIDATE: merges all JIT-compiled words into ONE module.
- call_indirect → direct `call` (for words in the module)
- wasmtime can optimize across call boundaries
- ~2-3x speedup for call-heavy code
External calls (host functions) remain call_indirect.
source: "crates/core/src/consolidate.rs:1-9"
tags: [design, advanced]
- front: "Why does WAFER use a linked list for the dictionary instead of a hash map?"
back: |
1. Forth standard specifies linked-list traversal semantics (TRAVERSE-WORDLIST)
2. Dictionary lives in linear memory (simulates WASM memory layout)
3. Standard requires specific entry format (link + flags + name + code field)
BUT: WAFER also has a HashMap index for O(1) fast-path lookup, falling back to linked-list walk for words not yet indexed. Best of both worlds.
source: "crates/core/src/dictionary.rs:10-48"
tags: [design, intermediate]
- front: "Why does WAFER store Forth flags as -1/0 instead of 1/0 in comparisons?"
back: |
Forth 2012 standard requires: TRUE = -1 (all bits set), FALSE = 0.
WASM comparisons produce 0/1, so codegen must convert:
bool_to_forth_flag: 0 - result → -1 if true, 0 if false
This is a single i32.sub instruction (cheap).
tags: [design, intermediate]
# ============================================================================
# CATEGORY C: ForthVM Struct
# ============================================================================
forthvm:
- front: "What is `user_here` in ForthVM?"
back: "Pointer to next free address in WASM linear memory for user data (variables, CREATE'd words). Separate from dictionary.here() which tracks dictionary-internal allocation. Synced to SYSVAR_HERE (memory offset 12) before each evaluate call."
source: "crates/core/src/outer.rs:212"
tags: [forthvm, intermediate]
- front: "What is `ir_bodies` in ForthVM?"
back: "HashMap<WordId, Vec<IrOp>> — stores the optimized IR body of every compiled word. Used by: (1) optimizer's inline pass to look up callee bodies, (2) CONSOLIDATE to recompile everything, (3) wafer build to export."
source: "crates/core/src/outer.rs:243"
tags: [forthvm, intermediate]
- front: "What is the `control_stack` in ForthVM?"
back: "Vec<ControlEntry> — compile-time stack for nested control flow. IF pushes ControlEntry::If, DO pushes ControlEntry::Do, etc. THEN/LOOP/REPEAT pop and emit the corresponding IrOp. Not the runtime return stack — this is purely compile-time."
source: "crates/core/src/outer.rs:197"
tags: [forthvm, intermediate]
- front: "What is `pending_actions` in ForthVM?"
back: "Arc<Mutex<Vec<PendingAction>>>: queue of actions from host functions that need compiler-side processing. Used by `COMPILE,` (CompileCall), CS-PICK, CS-ROLL, and POSTPONE of control-flow words. Processed after the immediate word returns."
source: "crates/core/src/outer.rs:229"
tags: [forthvm, advanced]
- front: "What is `pending_define` in ForthVM?"
back: "Arc<Mutex<Vec<i32>>> — signals from host functions to the outer interpreter: 1=CONSTANT, 2=VARIABLE, 3=CREATE, 4=EVALUATE. Host function sets the code, outer interpreter reads it after execution and performs the action."
source: "crates/core/src/outer.rs:227"
tags: [forthvm, advanced]
- front: "What does `does_definitions` store?"
back: "HashMap<WordId, DoesDefinition> — for each DOES>-based defining word, stores: create_ir (code before DOES>), does_action_id (WordId of code after DOES>), has_create flag. Used when the defining word executes to set up new instances."
source: "crates/core/src/outer.rs:216"
tags: [forthvm, advanced]
- front: "What happened to the `emit_func` field in ForthVM?"
back: "It moved into the Runtime implementation. NativeRuntime owns emit_func as a wasmtime::Func. WebRuntime creates it as a JS closure. ForthVM no longer directly holds wasmtime types — it only interacts via the Runtime trait."
source: "crates/core/src/runtime_native.rs:116"
tags: [forthvm, intermediate]
- front: "What are `two_value_words` and `fvalue_words`?"
back: "HashSet<u32> tracking which word IDs are 2VALUEs or FVALUEs. TO needs to know: regular VALUE stores 1 cell, 2VALUE stores 2 cells, FVALUE stores 1 float (8 bytes). Without these sets, TO wouldn't know the storage semantics."
source: "crates/core/src/outer.rs:237-239"
tags: [forthvm, advanced]
- front: "How many fields does ForthVM have? Name the major groups."
back: |
~35 fields in 7 groups:
1. Runtime: rt: R (generic — replaces old engine/store/memory/table/dsp/rsp/fsp/emit_func)
2. Compilation: state, compiling_name, compiling_ir, control_stack, compiling_word_id, compiling_locals
3. Output: output (Arc<Mutex<String>>)
4. Dictionary bridge: dictionary, user_here, here_cell, base_cell
5. Word metadata: ir_bodies, host_word_names, word_pfa_map, does_definitions
6. Host function shared state: pending_define, pending_actions, pending_does_patch, throw_code, word_lookup
7. Config + advanced: config, batch_mode, deferred_ir, marker_states, conditional_skip_depth, substitutions, search_order, next_wid, toplevel_ir
source: "crates/core/src/outer.rs:173-260"
tags: [forthvm, advanced]
# ============================================================================
# CATEGORY D: Codegen Details
# ============================================================================
codegen:
- front: "What WASM local index is the cached DSP?"
back: "Local 0 (CACHED_DSP_LOCAL). At function entry: global.get $dsp → local.set 0. All stack ops use local 0. Scratch locals start at SCRATCH_BASE = 1."
source: "crates/core/src/codegen.rs:58-61"
tags: [codegen, basic]
- front: "What does `dsp_writeback` do and when is it called?"
back: "Writes the cached DSP local back to the $dsp global: `local.get 0; global.set $dsp`. Called before: (1) call_indirect/call (callee needs correct dsp), (2) function exit (return)."
code: |
fn dsp_writeback(f: &mut Function) {
f.instruction(&LocalGet(CACHED_DSP_LOCAL))
.instruction(&GlobalSet(DSP));
}
source: "crates/core/src/codegen.rs:167-173"
tags: [codegen, intermediate]
- front: "How does codegen emit IrOp::Dup?"
back: |
Dup = peek top of stack, push copy:
1. peek(f): local.get $dsp; i32.load (value now on WASM operand stack)
2. push_via_local(f, SCRATCH_BASE): local.set $tmp; dsp_dec; local.get $dsp; local.get $tmp; i32.store
source: "crates/core/src/codegen.rs:359-362"
tags: [codegen, intermediate]
- front: "How does codegen emit IrOp::Call(id)?"
back: |
1. dsp_writeback (callee needs correct dsp)
2. If id == self_word_id (self-recursion): emit direct `call WORD_FUNC`
3. Else: i32.const fn_index; call_indirect (type_void, table 0)
4. dsp_reload (callee may have changed dsp)
source: "crates/core/src/codegen.rs (emit_op Call branch)"
tags: [codegen, intermediate]
- front: "What is EmitCtx and what fields does it have?"
back: |
Carries context for WASM code emission:
- f64_local_0, f64_local_1: scratch locals for float ops
- forth_local_base: base WASM local for Forth locals ({: ... :})
- loop_local_base: base local for DO/LOOP index/limit pairs
- loop_locals: Vec<(index_local, limit_local)> stack for nested loops
- fast_loop_depth: nesting depth of loops using local fast path
- self_word_id: Option<WordId> for self-recursion detection
- open_blocks: Vec<u32> for flat forward branches (CS-ROLL)
source: "crates/core/src/codegen.rs:229-250"
tags: [codegen, advanced]
- front: "What are TYPE_VOID and TYPE_I32 in codegen?"
back: "Type section indices: TYPE_VOID=0 is () → () (used by most word functions and call_indirect), TYPE_I32=1 is (i32) → () (used by the emit import function)."
source: "crates/core/src/codegen.rs:44-45"
tags: [codegen, basic]
- front: "How does the codegen handle DO/LOOP?"
back: |
Fast path: index and limit stored in WASM locals (no return stack).
- DO: pop limit and index from data stack into locals
- Loop body: I (RFetch) reads from index local
- LOOP: increment index local, compare with limit, br_if to loop start
- LEAVE: set SYSVAR_LEAVE_FLAG, break out of loop
Fallback: if loop is too complex, use return stack (rpush/rpop).
source: "crates/core/src/codegen.rs (DoLoop handling)"
tags: [codegen, advanced]
# ============================================================================
# CATEGORY E: Boot.fth
# ============================================================================
boot_fth:
- front: "What are the 7 phases of boot.fth?"
back: |
1. Stack/memory: DEPTH, PICK, 2OVER, 2ROT, WITHIN, 2@, 2!, FILL, CMOVE, MOVE, ERASE, /STRING, -TRAILING
2. Double-cell arithmetic: D+, DNEGATE, D-, DABS, D0=, D0<, D=, D<, D2*, D2/, DMAX, DMIN, M+, DU<
3. Mixed arithmetic: SM/REM, FM/MOD, */, */MOD
4. HERE and ALIGNED: HERE, ALLOT, comma, C-comma, ALIGN
5. I/O + pictured numeric output: TYPE, SPACES, <# HOLD HOLDS SIGN # #S #> . U. .R U.R D. D.R
6. DEFER support: DEFER!, DEFER@
7. String ops + misc: COMPARE, -TRAILING, SOURCE, FALIGNED, SFALIGNED, DFALIGNED
source: "crates/core/boot.fth"
tags: [boot, intermediate]
- front: "How is DEPTH defined in boot.fth and why?"
back: |
: DEPTH SP@ 5632 SWAP - 2 RSHIFT ;
5632 = DATA_STACK_TOP (0x1600). Stack grows down, so depth = (top - sp) / 4.
SP@ must come first — it reads dsp BEFORE DEPTH's own literal pushes affect it.
2 RSHIFT = divide by 4 (logical right shift by 2 bits; CELL_SIZE=4).
code: |
: DEPTH SP@ 5632 SWAP - 2 RSHIFT ;
source: "crates/core/boot.fth:12"
tags: [boot, intermediate]
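# Sketch: the DEPTH arithmetic from the card above in Rust. The constant
# values come from the card; the function names are hypothetical.

```rust
const DATA_STACK_TOP: u32 = 5632; // 0x1600
const CELL_SIZE: u32 = 4;

// Mirrors `SP@ 5632 SWAP - 2 RSHIFT`: (top - sp) shifted right by 2 bits.
fn depth(sp: u32) -> u32 {
    (DATA_STACK_TOP - sp) >> 2
}

// The same value computed by dividing by the cell size.
fn depth_by_division(sp: u32) -> u32 {
    (DATA_STACK_TOP - sp) / CELL_SIZE
}
```

# With three cells pushed, sp = 5632 - 12 and both functions return 3.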
- front: "What magic numbers appear in boot.fth and what do they mean?"
back: |
5632 (0x1600) = DATA_STACK_TOP
1472 (0x05C0) = PICT_BUF_TOP (also WORD_BUF_BASE)
12 = SYSVAR_HERE offset
28 = SYSVAR_HLD offset
64 = INPUT_BUFFER_BASE
24 = SYSVAR_NUM_TIB offset
source: "crates/core/boot.fth"
tags: [boot, intermediate]
- front: "How does pictured numeric output work in boot.fth?"
back: |
<# initializes HLD to PICT_BUF_TOP (1472)
HOLD decrements HLD and stores a character (grows downward)
# extracts one digit: divides ud by BASE via two UM/MODs, converts digit to ASCII, HOLDs it
#S calls # repeatedly until ud is zero
#> returns (c-addr u) pointing to the formatted string in the pictured buffer
SIGN adds '-' if the original number was negative
code: |
: <# 1472 28 ! ;
: HOLD 28 @ 1- DUP 28 ! C! ;
: # BASE @ >R 0 R@ UM/MOD R> SWAP >R UM/MOD
SWAP DUP 9 > IF 7 + THEN 48 + HOLD R> ;
source: "crates/core/boot.fth:193-224"
tags: [boot, advanced]
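# Sketch (a simplification, not boot.fth itself): pictured numeric output as
# a buffer that fills downward from PICT_BUF_TOP. It uses a single u64 instead
# of Forth's double-cell ud and a private Vec instead of linear memory; the
# Pictured type and format_number are hypothetical names.

```rust
const PICT_BUF_TOP: usize = 1472; // 0x05C0

struct Pictured {
    mem: Vec<u8>, // stand-in for linear memory below PICT_BUF_TOP
    hld: usize,   // stand-in for the HLD system variable
    base: u64,    // stand-in for BASE
}

impl Pictured {
    fn new(base: u64) -> Self {
        Pictured { mem: vec![0; PICT_BUF_TOP], hld: PICT_BUF_TOP, base }
    }
    // <# : reset HLD to the top of the buffer.
    fn begin(&mut self) {
        self.hld = PICT_BUF_TOP;
    }
    // HOLD : decrement HLD, then store the character there.
    fn hold(&mut self, ch: u8) {
        self.hld -= 1;
        self.mem[self.hld] = ch;
    }
    // # : peel off the least significant digit and HOLD it.
    fn digit(&mut self, ud: u64) -> u64 {
        let mut d = (ud % self.base) as u8;
        if d > 9 {
            d += 7; // skip from '9' to 'A'
        }
        self.hold(d + 48);
        ud / self.base
    }
    // #S : convert digits until the value is zero (at least one digit).
    fn digits(&mut self, mut ud: u64) {
        loop {
            ud = self.digit(ud);
            if ud == 0 {
                return;
            }
        }
    }
    // SIGN : prepend '-' for negative numbers.
    fn sign(&mut self, n: i64) {
        if n < 0 {
            self.hold(b'-');
        }
    }
    // #> : the finished string runs from HLD up to the buffer top.
    fn end(&self) -> String {
        String::from_utf8_lossy(&self.mem[self.hld..PICT_BUF_TOP]).into_owned()
    }
}

// Follows the order <# #S SIGN #> for a signed single-cell number.
fn format_number(n: i32, base: u64) -> String {
    let mut p = Pictured::new(base);
    p.begin();
    p.digits((n as i64).unsigned_abs());
    p.sign(n as i64);
    p.end()
}
```

# Digits are produced least significant first, which is why the buffer grows
# downward: the string ends up in reading order without a reversal step.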
- front: "Why is . (dot) defined in Forth instead of as a Rust host function?"
back: |
: . DUP ABS 0 <# #S ROT SIGN #> TYPE SPACE ;
1. Self-hosting goal: Forth definitions > Rust
2. Compiled Forth with inlining beats host function dispatch
3. . becomes inlineable (IR body available to optimizer)
4. Participates in CONSOLIDATE (direct calls in single module)
5. Respects BASE correctly via pictured numeric output
source: "crates/core/boot.fth:228"
tags: [boot, advanced]
# ============================================================================
# CATEGORY F: Testing & CLI
# ============================================================================
testing:
- front: "What are eval_output and eval_stack test helpers?"
back: |
eval_output("forth code") → creates ForthVM, evaluates code, returns output String
eval_stack("forth code") → creates ForthVM, evaluates code, returns data stack as Vec<i32>
Both create a fresh VM for each test (isolated).
source: "crates/core/src/outer.rs (test module)"
tags: [testing, basic]
- front: "How does the compliance test infrastructure work?"
back: |
1. boot_with_prerequisites(): create VM, load tester.fr, core.fr, utilities.fth, coreexttest.fth
2. run_suite(vm, file): reset #ERRORS to 0, load test file, read #ERRORS from data stack
3. Assert #ERRORS == 0 for pass
4. 11 word sets tested: Core, Core+, CoreExt, Double, Exception, Facility, File, Float, Locals, Memory, String
source: "crates/core/tests/compliance.rs"
tags: [testing, intermediate]
- front: "What are the 4 ways to run WAFER?"
back: |
1. `wafer` — interactive CLI REPL (rustyline, NativeRuntime)
2. `wafer file.fth` — evaluate file and exit (NativeRuntime)
3. `wafer build file.fth` — compile to .wasm or --native executable
4. Browser REPL — wasm-pack build crates/web, serve www/, WebRuntime
Also: `wafer run file.wasm` — execute pre-compiled module
source: "crates/cli/src/main.rs:58-83"
tags: [cli, basic]
- front: "How does the native executable trick work (wafer build --native)?"
back: |
1. AOT-compile WASM via wasmtime Engine::precompile_module()
2. Read current wafer binary
3. Append: [wafer binary] + [precompiled payload] + [metadata JSON] + [24-byte trailer]
4. Trailer: payload_len(8) + metadata_len(8) + "WAFEREXE"(8)
5. On startup, check_embedded_payload() reads trailer, extracts payload, runs it
code: |
const NATIVE_MAGIC: &[u8; 8] = b"WAFEREXE";
const TRAILER_SIZE: u64 = 24;
source: "crates/cli/src/main.rs:12-14"
tags: [cli, advanced]
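# Sketch (assumptions labeled): appending and reading back the 24-byte trailer
# described above. Little-endian length fields and the function names
# append_payload, read_payload and read_u64_le are assumptions of this sketch;
# only the field order, sizes and magic come from the card.

```rust
const NATIVE_MAGIC: &[u8; 8] = b"WAFEREXE";
const TRAILER_SIZE: usize = 24;

// Layout: [binary][payload][metadata][payload_len:8][metadata_len:8][magic:8]
fn append_payload(binary: &[u8], payload: &[u8], metadata: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(binary.len() + payload.len() + metadata.len() + TRAILER_SIZE);
    out.extend_from_slice(binary);
    out.extend_from_slice(payload);
    out.extend_from_slice(metadata);
    out.extend_from_slice(&(payload.len() as u64).to_le_bytes()); // assumed little-endian
    out.extend_from_slice(&(metadata.len() as u64).to_le_bytes());
    out.extend_from_slice(NATIVE_MAGIC);
    out
}

// Decode an 8-byte little-endian length; `bytes` must be exactly 8 bytes long.
fn read_u64_le(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    u64::from_le_bytes(buf)
}

// Return (payload, metadata) if the file ends with a valid trailer.
fn read_payload(file: &[u8]) -> Option<(&[u8], &[u8])> {
    if file.len() < TRAILER_SIZE {
        return None;
    }
    let (body, trailer) = file.split_at(file.len() - TRAILER_SIZE);
    if trailer[16..24] != NATIVE_MAGIC[..] {
        return None; // no magic: an ordinary binary
    }
    let payload_len = read_u64_le(&trailer[0..8]) as usize;
    let metadata_len = read_u64_le(&trailer[8..16]) as usize;
    let total = payload_len.checked_add(metadata_len)?;
    if total > body.len() {
        return None; // lengths point outside the file
    }
    let start = body.len() - total;
    Some((&body[start..start + payload_len], &body[start + payload_len..]))
}
```

# Putting the magic last lets a startup check read a fixed-size tail of the
# file before deciding whether any payload exists.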
# ============================================================================
# CATEGORY G: Control Flow Compilation
# ============================================================================
control_flow:
- front: "Name all 14 ControlEntry variants."
back: |
If, IfElse, Do, Begin, BeginWhile, BeginWhileWhile,
PostDoubleWhileRepeat, PostDoubleWhileRepeatElse,
Case, Of, QDo, Ahead, BeginRef, ForwardBlock
source: "crates/core/src/outer.rs:36-105"
tags: [control_flow, advanced]
- front: "How does IF...ELSE...THEN compile?"
back: |
1. IF: push ControlEntry::If { then_body: [] }; subsequent IR goes to then_body
2. ELSE: pop If, push ControlEntry::IfElse { then_body, else_body: [] }; subsequent IR goes to else_body
3. THEN: pop IfElse (or If), emit IrOp::If { then_body, else_body }
The IR is a tree — nested bodies, not flat branches.
tags: [control_flow, intermediate]
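# Sketch (reduced types, not the real outer interpreter): the three steps
# above driving a compile-time control stack. IrOp and ControlEntry here are
# hypothetical two-variant stand-ins, an empty else_body means "no ELSE", and
# the Compiler type and its method names are invented for this illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
enum IrOp {
    PushI32(i32),
    If { then_body: Vec<IrOp>, else_body: Vec<IrOp> },
}

enum ControlEntry {
    If { then_body: Vec<IrOp> },
    IfElse { then_body: Vec<IrOp>, else_body: Vec<IrOp> },
}

struct Compiler {
    top: Vec<IrOp>,                   // IR of the word being defined
    control_stack: Vec<ControlEntry>, // compile-time only
}

impl Compiler {
    fn new() -> Self {
        Compiler { top: Vec::new(), control_stack: Vec::new() }
    }

    // Append an op to the innermost open body.
    fn emit(&mut self, op: IrOp) {
        match self.control_stack.last_mut() {
            Some(ControlEntry::If { then_body }) => then_body.push(op),
            Some(ControlEntry::IfElse { else_body, .. }) => else_body.push(op),
            None => self.top.push(op),
        }
    }

    // IF: open a then-body.
    fn compile_if(&mut self) {
        self.control_stack.push(ControlEntry::If { then_body: Vec::new() });
    }

    // ELSE: close the then-body and open an else-body.
    fn compile_else(&mut self) -> Result<(), String> {
        match self.control_stack.pop() {
            Some(ControlEntry::If { then_body }) => {
                self.control_stack.push(ControlEntry::IfElse { then_body, else_body: Vec::new() });
                Ok(())
            }
            _ => Err("ELSE without IF".to_string()),
        }
    }

    // THEN: pop the entry and emit one tree-shaped If op into the enclosing body.
    fn compile_then(&mut self) -> Result<(), String> {
        let op = match self.control_stack.pop() {
            Some(ControlEntry::If { then_body }) => IrOp::If { then_body, else_body: Vec::new() },
            Some(ControlEntry::IfElse { then_body, else_body }) => IrOp::If { then_body, else_body },
            None => return Err("THEN without IF".to_string()),
        };
        self.emit(op);
        Ok(())
    }
}
```

# Because THEN emits into whatever body is open at that moment, nesting falls
# out of the stack discipline without any branch-offset patching.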
- front: "How does DO...LOOP compile?"
back: |
1. DO: push ControlEntry::Do { body: [] }; subsequent IR goes to body
2. LOOP: pop Do, emit IrOp::DoLoop { body, is_plus_loop: false }
3. +LOOP: same but is_plus_loop: true
The limit and index are expected on the data stack before the DoLoop executes.
tags: [control_flow, intermediate]
- front: "How does BEGIN...WHILE...REPEAT compile?"
back: |
1. BEGIN: push ControlEntry::Begin { body: [] }
2. WHILE: pop Begin, push ControlEntry::BeginWhile { test: body, body: [] }
(everything before WHILE becomes the test)
3. REPEAT: pop BeginWhile, emit IrOp::BeginWhileRepeat { test, body }
tags: [control_flow, intermediate]
- front: "What is CASE...OF...ENDOF...ENDCASE compilation?"
back: |
1. CASE: push ControlEntry::Case { prefix, endof_branches: [] }
2. OF: duplicate test value, compare, pop Case, push ControlEntry::Of
3. ENDOF: pop Of, save (test, body) pair, push back Case with new branch
4. ENDCASE: pop Case, emit nested If chain from endof_branches
Desugared into nested IrOp::If at compile time.
tags: [control_flow, advanced]
- front: "What are CS-PICK and CS-ROLL and why are they complex?"
back: |
Programming-Tools words that manipulate the compile-time control stack.
CS-PICK duplicates a control-flow entry N deep (e.g., reference a BEGIN from inside nested structures).
CS-ROLL rotates control-flow entries (e.g., move an IF dest across other structures).
Complex because they break the structured control-flow assumption — WAFER linearizes these into Block/BranchIfFalse/EndBlock IR ops for flat forward branches.
source: "crates/core/src/outer.rs:99-105"
tags: [control_flow, advanced]
# ============================================================================
# CATEGORY H: Consolidation & Export
# ============================================================================
consolidation:
- front: "What does compile_consolidated_module() produce?"
back: |
A single WASM module containing ALL compiled Forth words as separate functions.
- Each word gets a function index within the module
- Call(id) where id is in the module → direct `call N` (not call_indirect)
- Call(id) where id is NOT in the module → call_indirect (host functions)
- TailCall(id) in module → direct call + return
source: "crates/core/src/codegen.rs (compile_consolidated_module)"
tags: [consolidation, advanced]
- front: "What metadata does wafer build embed in the .wasm file?"
back: |
ExportMetadata in a "wafer" custom section (JSON):
- version: 1
- entry_table_index: Option<u32>
- host_functions: Vec<(table_index, name)>
- memory_size: u32
- dsp_init, rsp_init, fsp_init: initial stack pointers
source: "crates/core/src/export.rs:21-36"
tags: [export, intermediate]
# ============================================================================
# CATEGORY I: Dictionary Details
# ============================================================================
dictionary_details:
- front: "What is the align4 function?"
back: "(addr + 3) & !3 — rounds up to next 4-byte boundary. Used to align the code field after variable-length name in dictionary entries."
code: |
fn align4(addr: u32) -> u32 {
(addr + 3) & !3
}
source: "crates/core/src/dictionary.rs:51-53"
tags: [dictionary, basic]
- front: "What is the hash index in Dictionary?"
back: |
HashMap<String, Vec<(wid, word_addr, fn_index, is_immediate)>>
Maps uppercase name → list of entries across wordlists.
find() checks search_order against this index (O(1) average).
Fallback: linked-list walk for words not yet in index.
Updated by reveal() and set_immediate().
source: "crates/core/src/dictionary.rs:43"
tags: [dictionary, intermediate]
- front: "What is DictionaryState and when is it used?"
back: |
Snapshot of dictionary state: latest, here, next_fn_index, index (HashMap clone).
Used by MARKER: save_state() captures current state, restore_state() reverts.
Does NOT save the actual memory bytes — just pointers and metadata.
source: "crates/core/src/dictionary.rs:502-509"
tags: [dictionary, intermediate]
- front: "How does Dictionary::create() lay out an entry?"
back: |
Starting at self.here:
1. Write link field (4 bytes): points to previous LATEST
2. Write flags byte (1 byte): HIDDEN | length (optionally | IMMEDIATE)
3. Write name bytes (N bytes, uppercase)
4. Zero-pad to 4-byte alignment
5. Write code field (4 bytes): next_fn_index (auto-incremented)
6. Update latest = entry_start, here = after code field
code: |
// entry_start = self.here
// [link:4][flags:1][name:N][pad:0-3][code:4]
source: "crates/core/src/dictionary.rs:74-124"
tags: [dictionary, intermediate]
# ============================================================================
# CATEGORY J: Optimizer Patterns (hand-crafted supplements to auto-generated)
# ============================================================================
optimizer_extra:
- front: "What are the inline criteria?"
back: |
A Call(id) is inlined if ALL of:
1. Body exists in `bodies` HashMap
2. body.len() <= max_size (8)
3. No self-recursion (contains_call_to check)
4. No Exit (would return from caller)
5. No ForthLocalGet/Set (would collide with caller's locals)
When inlined, TailCall ops are converted back to Call via detailcall().
source: "crates/core/src/optimizer.rs:499-526"
tags: [optimizer, intermediate]
- front: "How does tail_call_detect decide if a tail call is safe?"
back: |
1. IR must be non-empty
2. Return stack must be balanced: count ToR and FromR, depth must be 0
(Unbalanced means >R without matching R>, which would corrupt return stack on tail call)
3. convert_tail_call on last op: Call → TailCall
4. Recurses into If branches: if last op is If, check both then/else branches
code: |
fn is_return_stack_balanced(ops: &[IrOp]) -> bool {
let mut depth: i32 = 0;
for op in ops {
match op {
IrOp::ToR => depth += 1,
IrOp::FromR => depth -= 1,
_ => {}
}
}
depth == 0
}
source: "crates/core/src/optimizer.rs:671-693"
tags: [optimizer, advanced]
- front: "What is the optimizer pass ordering and why does it matter?"
back: |
Phase 1 (simplify): peephole → fold → strength_reduce → peephole
Phase 2 (inline + re-simplify): inline → peephole → fold → strength_reduce → peephole
Phase 3 (eliminate): dce → peephole
Phase 4 (finalize): tail_call_detect
Order matters because:
- Inline before fold: inlined body may have constant expressions
- Fold before strength: folding may produce power-of-2 constants
- DCE after fold: folded constants enable dead-branch elimination
- Tail call last: must operate on final IR
- Peephole between each: cleanup after every transformation
source: "crates/core/src/optimizer.rs:37-85"
tags: [optimizer, advanced]
+687
@@ -0,0 +1,687 @@
#!/usr/bin/env python3
"""
WAFER Anki Deck Generator
Generates an Anki .apkg deck from:
1. Hand-crafted cards in anki_data.yaml
2. Auto-parsed IrOp variants from ir.rs
3. Auto-parsed memory constants from memory.rs
4. Auto-parsed error variants from error.rs
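5. Auto-extracted peephole patterns from optimizer.rs
6. Auto-extracted primitive registrations from outer.rs
7. Auto-extracted colon definitions from boot.fth
8. Auto-parsed Runtime and HostAccess trait methods from runtime.rs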
Usage:
pip install genanki pyyaml
python tools/anki_gen.py
Output: tools/wafer_anki.apkg
"""
import hashlib
import re
import sys
from pathlib import Path
try:
    import genanki
    import yaml
except ImportError:
    print("Required: pip install genanki pyyaml")
    sys.exit(1)
# ---------------------------------------------------------------------------
# Paths
# ---------------------------------------------------------------------------
TOOLS_DIR = Path(__file__).parent
PROJECT_ROOT = TOOLS_DIR.parent
CORE_SRC = PROJECT_ROOT / "crates" / "core" / "src"
BOOT_FTH = PROJECT_ROOT / "crates" / "core" / "boot.fth"
YAML_FILE = TOOLS_DIR / "anki_data.yaml"
OUTPUT_FILE = TOOLS_DIR / "wafer_anki.apkg"
# ---------------------------------------------------------------------------
# Stable IDs (genanki needs deterministic model/deck IDs)
# ---------------------------------------------------------------------------
def stable_id(name: str) -> int:
    """Generate a stable integer ID from a name."""
    h = hashlib.md5(name.encode()).hexdigest()
    return int(h[:8], 16)
DECK_ID = stable_id("wafer-learning-deck")
MODEL_ID = stable_id("wafer-card-model")
# ---------------------------------------------------------------------------
# Anki model with code styling
# ---------------------------------------------------------------------------
CSS = """\
.card {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
font-size: 16px;
line-height: 1.5;
color: #1a1a1a;
background: #fafafa;
padding: 20px;
max-width: 700px;
margin: 0 auto;
}
.card.nightMode {
color: #e0e0e0;
background: #1e1e1e;
}
.front { font-size: 18px; font-weight: 600; }
pre, code {
font-family: 'SF Mono', 'Fira Code', 'Cascadia Code', monospace;
font-size: 14px;
background: #f0f0f0;
border-radius: 4px;
}
.nightMode pre, .nightMode code {
background: #2d2d2d;
}
pre {
padding: 12px;
overflow-x: auto;
border: 1px solid #ddd;
}
.nightMode pre { border-color: #444; }
code { padding: 2px 5px; }
.source {
margin-top: 12px;
font-size: 12px;
color: #888;
font-style: italic;
}
.tags-line {
margin-top: 8px;
font-size: 11px;
color: #aaa;
}
.tags-line span {
background: #e8e8e8;
padding: 1px 6px;
border-radius: 3px;
margin-right: 4px;
}
.nightMode .tags-line span { background: #3a3a3a; }
"""
FRONT_TEMPLATE = """\
<div class="front">{{Front}}</div>
"""
BACK_TEMPLATE = """\
<div class="front">{{Front}}</div>
<hr>
<div class="back">{{Back}}</div>
{{#Code}}
<pre>{{Code}}</pre>
{{/Code}}
{{#Source}}
<div class="source">{{Source}}</div>
{{/Source}}
"""
wafer_model = genanki.Model(
MODEL_ID,
"WAFER Card",
fields=[
{"name": "Front"},
{"name": "Back"},
{"name": "Code"},
{"name": "Source"},
],
templates=[
{
"name": "Card 1",
"qfmt": FRONT_TEMPLATE,
"afmt": BACK_TEMPLATE,
},
],
css=CSS,
)
# ---------------------------------------------------------------------------
# Card generation helpers
# ---------------------------------------------------------------------------
def make_note(front: str, back: str, code: str = "", source: str = "", tags: list | None = None) -> genanki.Note:
    """Create a genanki Note with stable GUID."""
    guid = genanki.guid_for(front)
    note = genanki.Note(
        model=wafer_model,
        fields=[front, back, code, source],
        tags=tags or [],
        guid=guid,
    )
    return note
def html_escape(text: str) -> str:
    """Minimal HTML escaping for card content."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
def format_back(text: str) -> str:
    """Convert back text (may have newlines) to HTML."""
    lines = text.strip().split("\n")
    return "<br>".join(html_escape(line) for line in lines)
def format_code(text: str) -> str:
    """Format code block content."""
    return html_escape(text.strip())
# ---------------------------------------------------------------------------
# YAML card loader
# ---------------------------------------------------------------------------
def load_yaml_cards(deck: genanki.Deck) -> int:
    """Load hand-crafted cards from anki_data.yaml."""
    if not YAML_FILE.exists():
        print(f"Warning: {YAML_FILE} not found, skipping hand-crafted cards")
        return 0
    # The card text contains non-ASCII characters (arrows, comparison signs),
    # so read it as UTF-8 instead of relying on the platform default encoding.
    with open(YAML_FILE, encoding="utf-8") as f:
        data = yaml.safe_load(f) or {}  # an empty file loads as None
    count = 0
    for category, cards in data.items():
        if not isinstance(cards, list):
            continue
        for card in cards:
            front = card.get("front", "")
            back = card.get("back", "")
            code = card.get("code", "")
            source = card.get("source", "")
            tags = card.get("tags", [])
            # Add category as tag
            all_tags = [f"wafer::{category}"] + [f"wafer::{t}" for t in tags]
            note = make_note(
                front=html_escape(front),
                back=format_back(back),
                code=format_code(code) if code else "",
                source=html_escape(source),
                tags=all_tags,
            )
            deck.add_note(note)
            count += 1
    return count
# ---------------------------------------------------------------------------
# Auto-parse IrOp variants from ir.rs
# ---------------------------------------------------------------------------
def parse_ir_ops(deck: genanki.Deck) -> int:
    """Parse IrOp enum from ir.rs and generate cards."""
    ir_file = CORE_SRC / "ir.rs"
    if not ir_file.exists():
        return 0
    content = ir_file.read_text()
    count = 0
    # Match doc comments + variant lines
    # Pattern: /// comment\n VariantName or VariantName(type) or VariantName { ... }
    lines = content.split("\n")
    i = 0
    current_category = ""
    while i < len(lines):
        line = lines[i].strip()
        # Track categories from // -- Category -- comments
        cat_match = re.match(r"//\s*--\s*(.+?)\s*--", line)
        if cat_match:
            current_category = cat_match.group(1).strip()
            i += 1
            continue
        # Collect doc comments
        doc_lines = []
        while i < len(lines) and lines[i].strip().startswith("///"):
            doc_lines.append(lines[i].strip().lstrip("/ ").strip())
            i += 1
        if i >= len(lines):
            break
        line = lines[i].strip()
        # Match variant definition
        variant_match = re.match(
            r"^((?:[A-Z][a-zA-Z0-9]+)(?:\([^)]*\))?)\s*[,{]", line
        )
        if variant_match and doc_lines:
            variant = variant_match.group(1)
            # Clean up: remove trailing comma
            variant = variant.rstrip(",")
            doc = " ".join(doc_lines)
            # Extract stack effect if present: ( ... -- ... ).
            # Requiring "--" keeps ordinary parenthetical remarks in the doc
            # comment from being mistaken for a stack effect.
            stack_match = re.search(r"\([^()]*--[^()]*\)", doc)
            stack_effect = stack_match.group(0) if stack_match else ""
            front = f"IrOp::{variant} — what does it do?"
            back_parts = [doc]
            if stack_effect:
                back_parts.insert(0, f"Stack: {stack_effect}")
            back = "\n".join(back_parts)
            tags = ["wafer::ir", f"wafer::ir_{current_category.lower().replace(' ', '_')}"]
            note = make_note(
                front=html_escape(front),
                back=format_back(back),
                code=format_code(f"IrOp::{variant}"),
                source="crates/core/src/ir.rs",
                tags=tags,
            )
            deck.add_note(note)
            count += 1
        i += 1
    return count
# ---------------------------------------------------------------------------
# Auto-parse memory constants from memory.rs
# ---------------------------------------------------------------------------
def parse_memory_constants(deck: genanki.Deck) -> int:
    """Parse constants from memory.rs and generate cards."""
    mem_file = CORE_SRC / "memory.rs"
    if not mem_file.exists():
        return 0
    content = mem_file.read_text()
    count = 0
    # Match: /// doc comment\n pub const NAME: type = value;
    lines = content.split("\n")
    i = 0
    while i < len(lines):
        # Collect doc comments
        doc_lines = []
        while i < len(lines) and lines[i].strip().startswith("///"):
            doc_lines.append(lines[i].strip().lstrip("/ ").strip())
            i += 1
        if i >= len(lines):
            break
        line = lines[i].strip()
        # Capture the declared type too, so the card shows it verbatim
        # instead of assuming every constant is a u32.
        const_match = re.match(
            r"pub const (\w+):\s*(\w+)\s*=\s*(.+?);", line
        )
        if const_match and doc_lines:
            name = const_match.group(1)
            const_type = const_match.group(2)
            value_expr = const_match.group(3).strip()
            doc = " ".join(doc_lines)
            front = f"memory.rs: What is {name}?"
            back = f"{doc}\nValue: {value_expr}"
            note = make_note(
                front=html_escape(front),
                back=format_back(back),
                code=format_code(f"pub const {name}: {const_type} = {value_expr};"),
                source="crates/core/src/memory.rs",
                tags=["wafer::memory", "wafer::constants"],
            )
            deck.add_note(note)
            count += 1
            # Also generate reverse card for address-based constants
            if name.endswith("_BASE") or name.endswith("_TOP"):
                # Try to evaluate simple expressions to a number. This will not
                # work for every constant: anything that references another
                # name (or uses Rust-only syntax) raises and is skipped.
                try:
                    expr = value_expr.replace("SYSVAR_BASE", "0")
                    # No builtins: only plain arithmetic on literals can succeed.
                    val = eval(expr, {"__builtins__": {}}, {})
                except Exception:
                    val = None
                if isinstance(val, int):
                    rev_front = f"memory.rs: What region starts at 0x{val:04X}?"
                    rev_back = f"{name}: {doc}"
                    rev_note = make_note(
                        front=html_escape(rev_front),
                        back=format_back(rev_back),
                        source="crates/core/src/memory.rs",
                        tags=["wafer::memory", "wafer::constants", "wafer::reverse"],
                    )
                    deck.add_note(rev_note)
                    count += 1
        i += 1
    return count
# ---------------------------------------------------------------------------
# Auto-parse error variants from error.rs
# ---------------------------------------------------------------------------
def parse_errors(deck: genanki.Deck) -> int:
    """Parse WaferError enum from error.rs and generate cards."""
    err_file = CORE_SRC / "error.rs"
    if not err_file.exists():
        return 0
    content = err_file.read_text()
    count = 0
    # Match #[error("...")] followed by variant
    pattern = re.compile(r'#\[error\("(.+?)"\)\]\s*\n\s*(\w+)(?:\((.+?)\))?', re.MULTILINE)
    for m in pattern.finditer(content):
        msg = m.group(1)
        variant = m.group(2)
        inner = m.group(3) or ""
        front = f"WaferError::{variant} — when is this error raised?"
        back = f'Error message: "{msg}"'
        if inner:
            back += f"\nContains: {inner}"
        note = make_note(
            front=html_escape(front),
            back=format_back(back),
            code=format_code(f"WaferError::{variant}"),
            source="crates/core/src/error.rs",
            tags=["wafer::error"],
        )
        deck.add_note(note)
        count += 1
    return count
# ---------------------------------------------------------------------------
# Auto-extract peephole patterns from optimizer.rs
# ---------------------------------------------------------------------------
def parse_peephole_patterns(deck: genanki.Deck) -> int:
    """Extract peephole optimization patterns from optimizer.rs."""
    opt_file = CORE_SRC / "optimizer.rs"
    if not opt_file.exists():
        return 0
    content = opt_file.read_text()
    count = 0
    # Match comment + pattern in peephole_one_pass
    # Pattern: // Comment\n (IrOp::X, IrOp::Y) => { ... }
    lines = content.split("\n")
    in_peephole = False
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        if "fn peephole_one_pass" in line:
            in_peephole = True
        elif in_peephole and line.startswith("fn "):
            in_peephole = False
        if in_peephole:
            # Match pattern comments like: // PushI32(n), Drop => remove both
            comment_match = re.match(r"//\s*(.+?)\s*=>\s*(.+)", line)
            if comment_match:
                pattern = comment_match.group(1).strip()
                result = comment_match.group(2).strip()
                front = f"Peephole: {pattern} → ?"
                back = result
                note = make_note(
                    front=html_escape(front),
                    back=format_back(back),
                    source="crates/core/src/optimizer.rs",
                    tags=["wafer::optimizer", "wafer::peephole"],
                )
                deck.add_note(note)
                count += 1
        i += 1
    return count
# ---------------------------------------------------------------------------
# Auto-generate primitive registration cards
# ---------------------------------------------------------------------------
def parse_primitives(deck: genanki.Deck) -> int:
    """Extract IR primitive registrations from outer.rs."""
    outer_file = CORE_SRC / "outer.rs"
    if not outer_file.exists():
        return 0
    content = outer_file.read_text()
    count = 0
    # Match: self.register_primitive("NAME", false, vec![IrOp::X, IrOp::Y])?;
    pattern = re.compile(
        r'self\.register_primitive\("(.+?)",\s*(true|false),\s*vec!\[(.+?)\]\)',
        re.DOTALL,
    )
    for m in pattern.finditer(content):
        name = m.group(1)
        immediate_src = m.group(2)  # Rust literal as written: "true" or "false"
        immediate = immediate_src == "true"
        ir_body = m.group(3).strip()
        # Clean up multiline
        ir_body = " ".join(ir_body.split())
        front = f"Forth word {name} — what is its IR body?"
        back = f"IR: [{ir_body}]"
        if immediate:
            back += "\n(IMMEDIATE word)"
        note = make_note(
            front=html_escape(front),
            back=format_back(back),
            # Use the Rust literal here; formatting the Python bool would put
            # "True"/"False" into the Rust snippet on the card.
            code=format_code(f'register_primitive("{name}", {immediate_src}, vec![{ir_body}])'),
            source="crates/core/src/outer.rs",
            tags=["wafer::primitives", "wafer::ir"],
        )
        deck.add_note(note)
        count += 1
    return count
# ---------------------------------------------------------------------------
# Auto-generate boot.fth definition cards
# ---------------------------------------------------------------------------
def parse_boot_fth(deck: genanki.Deck) -> int:
    """Extract Forth definitions from boot.fth."""
    if not BOOT_FTH.exists():
        return 0
    content = BOOT_FTH.read_text()
    count = 0
    lines = content.split("\n")
    i = 0
    current_comment = ""
    while i < len(lines):
        line = lines[i]
        stripped = line.strip()
        # Track section comments
        if stripped.startswith("\\") and not stripped.startswith("\\ -------"):
            comment = stripped.lstrip("\\ ").strip()
            if comment:
                current_comment = comment
        # Match colon definitions
        if stripped.startswith(": "):
            # Collect full definition (may span multiple lines)
            defn = stripped
            while not defn.rstrip().endswith(";") and i + 1 < len(lines):
                i += 1
                defn += " " + lines[i].strip()
            # Extract name
            name_match = re.match(r":\s+(\S+)", defn)
            if name_match:
                name = name_match.group(1)
                front = f"boot.fth: How is {name} defined?"
                back = current_comment if current_comment else f"Forth definition of {name}"
                note = make_note(
                    front=html_escape(front),
                    back=format_back(back),
                    code=format_code(defn),
                    source="crates/core/boot.fth",
                    tags=["wafer::boot_fth"],
                )
                deck.add_note(note)
                count += 1
        i += 1
    return count
# ---------------------------------------------------------------------------
# Auto-parse Runtime trait methods from runtime.rs
# ---------------------------------------------------------------------------
def parse_runtime_trait(deck: genanki.Deck) -> int:
    """Parse Runtime and HostAccess trait methods from runtime.rs."""
    rt_file = CORE_SRC / "runtime.rs"
    if not rt_file.exists():
        return 0
    content = rt_file.read_text()
    count = 0
    # Match trait method signatures with doc comments
    lines = content.split("\n")
    i = 0
    current_trait = ""
    while i < len(lines):
        line = lines[i].strip()
        # Track which trait we're in
        trait_match = re.match(r"(?:pub\s+)?trait (\w+)", line)
        if trait_match:
            current_trait = trait_match.group(1)
            i += 1
            continue
        # Collect doc comments
        doc_lines = []
        while i < len(lines) and lines[i].strip().startswith("///"):
            doc_lines.append(lines[i].strip().lstrip("/ ").strip())
            i += 1
        if i >= len(lines):
            break
        line = lines[i].strip()
        # Check if this is a trait definition (may follow doc comments)
        trait_match = re.match(r"(?:pub\s+)?trait (\w+)", line)
        if trait_match:
            current_trait = trait_match.group(1)
            i += 1
            continue
        # Match fn signatures
        fn_match = re.match(r"fn (\w+)\(", line)
        if fn_match and doc_lines and current_trait:
            fn_name = fn_match.group(1)
            doc = " ".join(doc_lines)
            front = f"{current_trait}::{fn_name}() — what does it do?"
            back = doc
            note = make_note(
                front=html_escape(front),
                back=format_back(back),
                code=format_code(line.rstrip(";")),
                source="crates/core/src/runtime.rs",
                tags=["wafer::runtime", f"wafer::{current_trait.lower()}"],
            )
            deck.add_note(note)
            count += 1
        i += 1
    return count
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
def main() -> None:
    """Generate the Anki deck."""
    deck = genanki.Deck(DECK_ID, "WAFER - WebAssembly Forth Engine in Rust")
    print("Generating WAFER Anki deck...")
    print()
    # Load hand-crafted cards
    n = load_yaml_cards(deck)
    print(f" Hand-crafted cards (YAML): {n}")
    # Auto-generate from source
    n = parse_ir_ops(deck)
    print(f" IrOp variant cards: {n}")
    n = parse_memory_constants(deck)
    print(f" Memory constant cards: {n}")
    n = parse_errors(deck)
    print(f" Error variant cards: {n}")
    n = parse_peephole_patterns(deck)
    print(f" Peephole pattern cards: {n}")
    n = parse_primitives(deck)
    print(f" Primitive registration cards: {n}")
    n = parse_boot_fth(deck)
    print(f" boot.fth definition cards: {n}")
    n = parse_runtime_trait(deck)
    print(f" Runtime trait method cards: {n}")
    total = len(deck.notes)
    print(f"\n TOTAL: {total} cards")
    # Write .apkg
    genanki.Package(deck).write_to_file(str(OUTPUT_FILE))
    print(f"\nWrote {OUTPUT_FILE}")
    print(f"Import into Anki: File > Import > select {OUTPUT_FILE.name}")
if __name__ == "__main__":
    main()
+306
@@ -0,0 +1,306 @@
WAFER Architecture Reference (updated 2026-04-13)
===================================================
1. COMPILATION PIPELINE
-----------------------
Forth Source
      |
      v
Outer Interpreter (outer.rs)
+--------------------------------------------+
| Tokenizer: whitespace-delimited words      |
| For each token:                            |
|   1. Dictionary lookup (find)              |
|   2. If found + interpret mode: EXECUTE    |
|   3. If found + compile mode:              |
|      - Immediate? Execute now              |
|      - Normal? Append Call(WordId) to IR   |
|   4. Not found: try parse as number        |
|      - Interpret: push to data stack       |
|      - Compile: append PushI32(n) to IR    |
|   5. Neither: error "unknown word"         |
+--------------------------------------------+
      | On `;` (end of colon definition):
      v
Optimizer (optimizer.rs)
+--------------------------------------------+
| Phase 1: Simplify                          |
|   Peephole -> Constant Fold ->             |
|   Strength Reduce -> Peephole              |
| Phase 2: Inline then re-simplify           |
|   Inline(max=8) -> Peephole ->             |
|   Constant Fold -> Strength Reduce ->      |
|   Peephole                                 |
| Phase 3: Eliminate dead code               |
|   DCE -> Peephole                          |
| Phase 4: Tail calls (must be last)         |
|   Tail Call Detect                         |
+--------------------------------------------+
      |
      v
Codegen (codegen.rs)
+--------------------------------------------+
| IR -> WASM bytecode via wasm-encoder       |
| Each word = one WASM module with:          |
|   Imports: emit, memory, dsp, rsp, fsp,    |
|            table                           |
|   Types: void () -> (), i32 (i32) -> ()    |
|   One defined function (the word body)     |
| DSP cached in local 0, writeback before    |
|   calls, reload after calls                |
| Scratch locals start at index 1            |
+--------------------------------------------+
      |
      v
Runtime trait (runtime.rs)
+--------------------------------------------+
| ForthVM<R: Runtime> - generic over backend |
| Runtime provides:                          |
|   - Memory r/w (mem_read_i32, etc.)        |
|   - Globals (get/set_dsp, rsp, fsp)        |
|   - Table (ensure_table_size)              |
|   - instantiate_and_install(wasm_bytes)    |
|   - call_func(fn_index)                    |
|   - register_host_func(fn_index, HostFn)   |
|                                            |
| HostAccess trait - memory/global ops for   |
|   host function callbacks                  |
| HostFn = Box<dyn Fn(&mut dyn HostAccess)>  |
+--------------------------------------------+
        |                       |
        v                       v
NativeRuntime           WebRuntime
(runtime_native.rs)     (crates/web/runtime_web.rs)
+------------------+    +------------------+
| wasmtime Engine  |    | js_sys::WebAsm   |
| Store, Memory    |    | Memory, Table    |
| Table, Globals   |    | Global objects   |
| Func closures    |    | JS Closures      |
+------------------+    +------------------+
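
The dispatch rules in the Outer Interpreter box can be sketched in a few
lines of Python. This is only an illustration of the five numbered steps, not
the code in outer.rs: the function name, the dict-based dictionary, and the
tuple IR are all hypothetical, and number parsing is limited to base 10.

```python
# Hypothetical model of the outer interpreter's per-token dispatch.
# Dictionary entries are plain dicts; IR ops are tuples like ("Call", id).

def interpret(source, dictionary, compiling=False):
    """Process whitespace-delimited tokens; return (data_stack, ir)."""
    stack, ir = [], []
    for token in source.split():
        entry = dictionary.get(token.upper())      # 1. dictionary lookup
        if entry is not None:
            if compiling and not entry["immediate"]:
                ir.append(("Call", entry["id"]))   # 3. normal word, compile mode
            else:
                entry["run"](stack)                # 2. interpret mode, or IMMEDIATE
            continue
        try:
            n = int(token)                         # 4. try to parse as a number
        except ValueError:
            # 5. neither a word nor a number
            raise ValueError(f"unknown word: {token}") from None
        if compiling:
            ir.append(("PushI32", n))
        else:
            stack.append(n)
    return stack, ir


# Two toy words, enough to run `3 DUP *` in both modes.
words = {
    "DUP": {"id": 0, "immediate": False, "run": lambda s: s.append(s[-1])},
    "*": {"id": 1, "immediate": False, "run": lambda s: s.append(s.pop() * s.pop())},
}

print(interpret("3 dup *", words))
print(interpret("3 dup *", words, compiling=True))
```

In interpret mode the tokens act on the data stack directly; in compile mode
the same tokens only append IR, which is what the optimizer later receives.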
2. MEMORY LAYOUT (Linear Memory)
--------------------------------
Address   Region              Size      Notes
--------  ------------------  --------  ---------------------------
0x0000    System Variables    64 B      STATE, BASE, >IN, HERE,
                                        LATEST, SOURCE-ID, #TIB,
                                        HLD, LEAVE-FLAG
0x0040    Input Buffer        1024 B    Source parsing
0x0440    PAD                 256 B     Scratch area
0x0540    Pictured Output     128 B     <# ... #> (grows down)
0x05C0    WORD Buffer         64 B      Transient counted string
0x0600    Data Stack          4096 B    1024 cells, grows DOWN
0x1600    (Data Stack Top)              DSP starts here
0x1540    Return Stack        4096 B    Grows DOWN
0x2540    Float Stack         2048 B    256 doubles, grows DOWN
0x2D40    Dictionary          grows UP  Linked list of word entries
Total initial memory: 16 pages = 1 MiB (max 256 pages = 16 MiB)
Cell size: 4 bytes (i32)
Float size: 8 bytes (f64)
3. SYSTEM VARIABLES (offsets from 0x0000)
-----------------------------------------
Offset  Name        Purpose
------  ----------  -----------------------------------
0       STATE       0=interpreting, -1=compiling
4       BASE        Number base (default 10)
8       >IN         Parse offset into input buffer
12      HERE        Next free dictionary address
16      LATEST      Most recent dictionary entry addr
20      SOURCE-ID   0=user input, -1=string
24      #TIB        Length of current input
28      HLD         Pictured numeric output pointer
32      LEAVE-FLAG  Nonzero when LEAVE called in loop
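
As a small illustration of the table above, the sketch below reads and writes
system variables in a byte buffer that stands in for linear memory. The helper
names are hypothetical. It relies on two facts: cells are 4-byte i32 values
(stated above) and WebAssembly linear memory is little-endian.

```python
import struct

# Offsets copied from the table above (bytes from address 0x0000).
SYSVAR_OFFSETS = {
    "STATE": 0, "BASE": 4, ">IN": 8, "HERE": 12, "LATEST": 16,
    "SOURCE-ID": 20, "#TIB": 24, "HLD": 28, "LEAVE-FLAG": 32,
}


def read_sysvar(memory, name):
    """Read one signed 32-bit cell (little-endian) at the variable's offset."""
    return struct.unpack_from("<i", memory, SYSVAR_OFFSETS[name])[0]


def write_sysvar(memory, name, value):
    """Write one signed 32-bit cell (little-endian) at the variable's offset."""
    struct.pack_into("<i", memory, SYSVAR_OFFSETS[name], value)


memory = bytearray(64)            # the 64-byte system variable region
write_sysvar(memory, "BASE", 10)  # default number base
write_sysvar(memory, "STATE", -1) # -1 = compiling
print(read_sysvar(memory, "BASE"), read_sysvar(memory, "STATE"))
```

Storing the compile flag as -1 means all 32 bits are set, which matches the
Forth convention for a true flag.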
4. DICTIONARY ENTRY FORMAT
--------------------------
+---------+--------+----------+---------+---------+
| Link    | Flags  | Name     | Padding | Code    |
| 4 bytes | 1 byte | N bytes  | 0-3 B   | 4 bytes |
+---------+--------+----------+---------+---------+
^                                       ^
entry_addr                              code field (fn table index)
Flags byte:
Bit 7 (0x80): IMMEDIATE
Bit 6 (0x40): HIDDEN (during compilation)
Bits 0-4 (0x1F): name length (max 31)
Link points to previous entry (0 = end of list).
Name stored uppercase, padded to 4-byte alignment.
Code field: index into WASM function table.
Parameter field (if any) follows immediately after code field.
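
A minimal sketch of writing one entry in this format follows, assuming a
bytearray stands in for linear memory. The function and constant names are
hypothetical, and the little-endian field encoding is an assumption based on
WebAssembly's memory model rather than something dictionary.rs is quoted as
doing here.

```python
import struct

F_IMMEDIATE = 0x80  # bit 7
F_HIDDEN = 0x40     # bit 6, set while the word is still being compiled
LEN_MASK = 0x1F     # bits 0-4, so names are at most 31 bytes


def align4(addr):
    """Round up to the next 4-byte boundary."""
    return (addr + 3) & ~3


def create_entry(memory, here, latest, name, fn_index, immediate=False):
    """Write [link:4][flags:1][name:N][pad:0-3][code:4] at `here`.

    Returns (entry_addr, new_here).
    """
    raw = name.upper().encode("ascii")
    if not 1 <= len(raw) <= LEN_MASK:
        raise ValueError("name length must be 1..31")
    entry = here
    struct.pack_into("<I", memory, entry, latest)            # link to previous entry
    memory[entry + 4] = F_HIDDEN | len(raw) | (F_IMMEDIATE if immediate else 0)
    name_end = entry + 5 + len(raw)
    memory[entry + 5:name_end] = raw                         # uppercase name bytes
    code = align4(name_end)
    memory[name_end:code] = bytes(code - name_end)           # zero padding
    struct.pack_into("<I", memory, code, fn_index)           # code field
    return entry, code + 4


memory = bytearray(64)
first, here = create_entry(memory, 0, 0, "dup", 7)
second, here = create_entry(memory, here, first, "swap", 8)
print(first, second, here)
```

A 3-byte name needs no padding because link, flags, and name already total
8 bytes; the 4-byte name "SWAP" ends at an odd offset and gets 3 pad bytes.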
5. THREE TYPES OF WORDS
-----------------------
a) IR Primitives (compiled to WASM)
register_primitive("DUP", false, vec![IrOp::Dup])
- Body stored as Vec<IrOp>
- Optimized, then compiled to WASM module
- Inlineable by optimizer
- FAST: no function call overhead when inlined
b) Host Functions (HostFn closures)
register_host_primitive(".", false, func)
- HostFn = Box<dyn Fn(&mut dyn HostAccess) -> Result<()>>
- Access memory/globals via HostAccess trait (runtime-agnostic)
- NOT inlineable
- Used for: I/O, dictionary manipulation, complex logic
- Same closure works on NativeRuntime and WebRuntime
c) Forth-defined words
: SQUARE DUP * ;
- Compiled by outer interpreter
- Goes through full optimize -> codegen pipeline
- Stored in ir_bodies for future inlining
6. WASM MODULE STRUCTURE (per word)
-----------------------------------
Imports (6) — provided by Runtime impl:
0. emit (func: i32 -> void) Character output callback
1. memory (memory: 16 pages) Shared linear memory
2. dsp (global: mut i32) Data stack pointer
3. rsp (global: mut i32) Return stack pointer
4. fsp (global: mut i32) Float stack pointer
5. table (table: funcref) Shared function table
Types (2):
0. void: () -> ()
1. i32: (i32) -> ()
Functions (1):
The compiled word body
Element section:
table[base_fn_index] = function 1
Runtime::instantiate_and_install(wasm_bytes, fn_index):
- NativeRuntime: Module::new + Instance::new with 6 wasmtime imports
- WebRuntime: WebAssembly.instantiate with JS import objects
7. OPTIMIZATION PASSES (detail)
-------------------------------
PEEPHOLE (runs 5x across full pipeline):
  PushI32(n), Drop    -> (removed)   Unused literal
  Dup, Drop           -> (removed)   Redundant copy
  Swap, Swap          -> (removed)   Self-inverse
  Swap, Drop          -> Nip         Combine
  PushI32(0), Add     -> (removed)   Identity
  PushI32(0), Or      -> (removed)   Identity
  PushI32(-1), And    -> (removed)   Identity
  PushI32(1), Mul     -> (removed)   Identity
  Over, Over          -> TwoDup      Combine
  Drop, Drop          -> TwoDrop     Combine
  (+ float variants: PushF64/FDrop, FDup/FDrop, FSwap/FSwap, FNegate/FNegate)

CONSTANT FOLD:
  Binary: PushI32(a), PushI32(b), <op> -> PushI32(result)
    Supports: Add, Sub, Mul, And, Or, Xor, Lshift, Rshift, ArithRshift,
              Eq, NotEq, Lt, Gt, LtUnsigned
  Unary: PushI32(n), <op> -> PushI32(result)
    Supports: Negate, Abs, Invert, ZeroEq, ZeroLt
  Float binary: PushF64(a), PushF64(b), <op> -> PushF64(result)
  Float unary: PushF64(n), <op> -> PushF64(result)

STRENGTH REDUCE:
  PushI32(2^n), Mul   -> PushI32(n), Lshift
  PushI32(0), Eq      -> ZeroEq
  PushI32(0), Lt      -> ZeroLt

DCE:
  PushI32(nonzero), If{then,else} -> then_body only
  PushI32(0), If{then,else}       -> else_body only
  Everything after Exit           -> removed

INLINE (max_size=8, single pass):
  Call(id) -> inline body if:
    - Body length <= 8 ops
    - No self-recursion
    - No Exit (would return from caller)
    - No ForthLocalGet/Set (would collide with caller's locals)
  TailCall -> Call when inlined (no longer tail position)

TAIL CALL (last pass):
  Last Call(id) -> TailCall(id) if:
    - Return stack balanced (equal ToR and FromR)
  Recurses into If branches for conditional tail calls
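
To make the Phase 1 ordering concrete, here is a toy re-implementation of
three of the passes over a tuple-based IR. It is a sketch, not optimizer.rs:
the function names are hypothetical, only the integer rules listed above are
modelled, folding covers just Add/Sub/Mul, and wrapping results to 32 bits is
an assumption based on the i32 cell size.

```python
def P(n):
    """Shorthand for a PushI32(n) op."""
    return ("PushI32", n)


REMOVE = {("Dup", "Drop"), ("Swap", "Swap")}
COMBINE = {("Swap", "Drop"): "Nip", ("Over", "Over"): "TwoDup", ("Drop", "Drop"): "TwoDrop"}
IDENTITY = {(0, "Add"), (0, "Or"), (-1, "And"), (1, "Mul")}
BINARY = {"Add": lambda a, b: a + b, "Sub": lambda a, b: a - b, "Mul": lambda a, b: a * b}


def peephole(ops):
    """Apply the two-op rules repeatedly until a pass changes nothing."""
    changed = True
    while changed:
        changed, out, i = False, [], 0
        while i < len(ops):
            a = ops[i]
            b = ops[i + 1] if i + 1 < len(ops) else None
            if b is not None:
                pair = (a[0], b[0])
                literal = a[0] == "PushI32"
                if (pair in REMOVE or (literal and b[0] == "Drop")
                        or (literal and (a[1], b[0]) in IDENTITY)):
                    i, changed = i + 2, True          # both ops disappear
                    continue
                if pair in COMBINE:
                    out.append((COMBINE[pair],))
                    i, changed = i + 2, True
                    continue
            out.append(a)
            i += 1
        ops = out
    return ops


def wrap32(x):
    """Wrap to a signed 32-bit value (assumed i32 semantics)."""
    return (x + 2**31) % 2**32 - 2**31


def constant_fold(ops):
    out = []
    for op in ops:
        if op[0] in BINARY and len(out) >= 2 and out[-1][0] == out[-2][0] == "PushI32":
            b, a = out.pop()[1], out.pop()[1]
            out.append(P(wrap32(BINARY[op[0]](a, b))))
        else:
            out.append(op)
    return out


def strength_reduce(ops):
    out = []
    for op in ops:
        prev = out[-1] if out else None
        if prev is not None and prev[0] == "PushI32":
            n = prev[1]
            if op[0] == "Mul" and n > 1 and (n & (n - 1)) == 0:
                out[-1] = P(n.bit_length() - 1)       # 2^k * -> k LSHIFT
                out.append(("Lshift",))
                continue
            if n == 0 and op[0] in ("Eq", "Lt"):
                out[-1] = ("ZeroEq",) if op[0] == "Eq" else ("ZeroLt",)
                continue
        out.append(op)
    return out


def simplify(ops):
    """Phase 1 order: peephole -> fold -> strength reduce -> peephole."""
    return peephole(strength_reduce(constant_fold(peephole(ops))))


print(simplify([P(1), P(2), ("Add",), P(8), ("Mul",)]))
print(simplify([P(8), ("Mul",)]))
```

The first call folds to a single constant because folding runs before strength
reduction; the second has no constant left operand to fold, so the multiply by
8 becomes a shift by 3.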
8. CONSOLIDATION
----------------
CONSOLIDATE word recompiles all JIT-compiled words into a
single WASM module:
- All call_indirect -> direct call (for words in module)
- External calls (host functions) remain call_indirect
- Maximum performance for final program
Two-part implementation:
codegen::compile_consolidated_module() - builds multi-function module
outer::ForthVM::consolidate() - orchestrates collection + table update
9. EXPORT PIPELINE (wafer build)
--------------------------------
1. Evaluate source file with recording_toplevel=true
2. Collect all IR words + top-level IR
3. Determine entry: --entry flag > MAIN word > top-level execution
4. Build consolidated module with data section (memory snapshot)
5. Embed metadata in "wafer" custom section (JSON)
6. Optional: --js generates JS loader + HTML page
7. Optional: --native AOT-compiles and appends to wafer binary
Format: [wafer binary][precompiled WASM][metadata][trailer]
Trailer: payload_len(8) + metadata_len(8) + "WAFEREXE"(8)
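
The trailer layout lends itself to a short round-trip sketch. Everything
beyond the three field names and their 8-byte widths is an assumption made for
illustration: the function names are hypothetical, the lengths are taken to be
little-endian unsigned 64-bit integers, and payload_len is taken to count only
the precompiled WASM bytes.

```python
import struct

MAGIC = b"WAFEREXE"
# payload_len(8) + metadata_len(8) + magic(8); "<QQ" (little-endian u64) is assumed.
TRAILER = struct.Struct("<QQ8s")


def append_payload(binary, wasm, metadata):
    """Build [binary][precompiled WASM][metadata][trailer]."""
    return binary + wasm + metadata + TRAILER.pack(len(wasm), len(metadata), MAGIC)


def split_payload(blob):
    """Return (wasm, metadata), or None if no valid trailer is present."""
    if len(blob) < TRAILER.size:
        return None
    end = len(blob) - TRAILER.size
    payload_len, metadata_len, magic = TRAILER.unpack_from(blob, end)
    if magic != MAGIC:
        return None
    meta_start = end - metadata_len
    wasm_start = meta_start - payload_len
    if wasm_start < 0:
        return None                      # lengths do not fit inside the file
    return blob[wasm_start:meta_start], blob[meta_start:end]


exe = append_payload(b"\x7fELF-stub", b"\x00asm-precompiled", b'{"version": 1}')
print(split_payload(exe))
print(split_payload(b"plain binary without a trailer at all"))
```

Putting the fixed-size trailer last lets a reader find the payload by seeking
from the end of the file, without knowing how long the host binary is.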
10. CRATE STRUCTURE
-------------------
crates/
core/ wafer-core: compiler, optimizer, codegen, dictionary, Runtime trait
Feature flags: default=["native"], "native" enables wasmtime
Without features: pure Rust (dictionary, IR, optimizer, codegen, outer)
cli/ wafer: CLI REPL (rustyline), wafer build/run commands
web/ wafer-web: browser REPL (wasm-bindgen + WebRuntime + HTML/CSS/JS)
Key web files:
crates/web/src/lib.rs WaferRepl wasm-bindgen entry point
crates/web/src/runtime_web.rs WebRuntime: js_sys WebAssembly API
crates/web/www/app.js Frontend JS (terminal emulation)
crates/web/www/index.html HTML shell
crates/web/www/style.css Styling
11. BOOT SEQUENCE
-----------------
ForthVM::<R>::new() ->
1. R::new() — create runtime (wasmtime or browser WASM)
2. register_primitives() in batch_mode:
- ~40 IR primitives (DUP, +, @, etc.)
- ~60 host functions (., .S, M*, ACCEPT, etc.)
- ~30 special words (IF, DO, :, VARIABLE, etc.)
3. compile_batch() - single WASM module for all IR primitives
4. Load boot.fth - Forth replaces Rust host functions:
Phase 1: Stack/memory (DEPTH, PICK, 2OVER, FILL, MOVE)
Phase 2: Double-cell arithmetic (D+, DNEGATE, D<)
Phase 3: Mixed arithmetic (SM/REM, FM/MOD, */, */MOD)
Phase 4: HERE, ALLOT, comma, ALIGN
Phase 5: I/O, pictured numeric output (., U., TYPE, <# # #>)
Phase 6: DEFER support
Phase 7: String operations (COMPARE, SOURCE, FALIGNED)
+347
@@ -0,0 +1,347 @@
#!/usr/bin/env python3
"""WAFER IR Flash Quiz — predict the optimized IR for Forth code."""
import random
import sys
# Each exercise: (forth_code, accepted_answers, explanation)
# accepted_answers: list of strings that count as correct (case-insensitive, whitespace-normalized)
EXERCISES = [
# --- Constant Folding ---
(
": FOO 2 3 + ;",
["PushI32(5)", "pushi32(5)", "5"],
"Constant fold: PushI32(2), PushI32(3), Add → PushI32(5)",
),
(
": FOO 10 3 - ;",
["PushI32(7)", "pushi32(7)", "7"],
"Constant fold: PushI32(10), PushI32(3), Sub → PushI32(7)",
),
(
": FOO 6 7 * ;",
["PushI32(42)", "pushi32(42)", "42"],
"Constant fold: PushI32(6), PushI32(7), Mul → PushI32(42)",
),
(
": FOO 5 0= ;",
["PushI32(0)", "pushi32(0)", "0", "false"],
"Constant fold (unary): PushI32(5), ZeroEq → PushI32(0) (5 is not zero)",
),
(
": FOO 0 0= ;",
["PushI32(-1)", "pushi32(-1)", "-1", "true"],
"Constant fold (unary): PushI32(0), ZeroEq → PushI32(-1) (true flag)",
),
(
": FOO -3 ABS ;",
["PushI32(3)", "pushi32(3)", "3"],
"Constant fold (unary): PushI32(-3), Abs → PushI32(3)",
),
(
": FOO 255 INVERT ;",
["PushI32(-256)", "pushi32(-256)", "-256"],
"Constant fold (unary): PushI32(255), Invert → PushI32(-256) (bitwise NOT)",
),
(
": FOO 3 2 LSHIFT ;",
["PushI32(12)", "pushi32(12)", "12"],
"Constant fold: PushI32(3), PushI32(2), Lshift → PushI32(12) (3 << 2 = 12)",
),
# --- Peephole ---
(
": FOO DUP DROP ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: Dup, Drop → removed (both eliminated)",
),
(
": FOO SWAP SWAP ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: Swap, Swap → removed (self-inverse)",
),
(
": FOO SWAP DROP ;",
["Nip", "nip"],
"Peephole: Swap, Drop → Nip",
),
(
": FOO DROP DROP ;",
["TwoDrop", "twodrop", "2drop"],
"Peephole: Drop, Drop → TwoDrop",
),
(
": FOO OVER OVER ;",
["TwoDup", "twodup", "2dup"],
"Peephole: Over, Over → TwoDup",
),
(
": FOO 0 + ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: PushI32(0), Add → removed (identity)",
),
(
": FOO 1 * ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: PushI32(1), Mul → removed (identity)",
),
(
": FOO -1 AND ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: PushI32(-1), And → removed (identity, all bits set)",
),
(
": FOO 0 OR ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: PushI32(0), Or → removed (identity)",
),
(
": FOO 42 DROP ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: PushI32(42), Drop → removed (unused literal)",
),
# --- Strength Reduction ---
(
": FOO 8 * ;",
["PushI32(3), Lshift", "pushi32(3) lshift", "3 lshift"],
"Strength reduce: PushI32(8) is 2^3, Mul → PushI32(3), Lshift",
),
(
": FOO 16 * ;",
["PushI32(4), Lshift", "pushi32(4) lshift", "4 lshift"],
"Strength reduce: PushI32(16) is 2^4, Mul → PushI32(4), Lshift",
),
(
": FOO 2 * ;",
["PushI32(1), Lshift", "pushi32(1) lshift", "1 lshift"],
"Strength reduce: PushI32(2) is 2^1, Mul → PushI32(1), Lshift",
),
(
": FOO 0 = ;",
["ZeroEq", "zeroeq", "0="],
"Strength reduce: PushI32(0), Eq → ZeroEq",
),
(
": FOO 0 < ;",
["ZeroLt", "zerolt", "0<"],
"Strength reduce: PushI32(0), Lt → ZeroLt",
),
# --- Dead Code Elimination ---
(
": FOO TRUE IF 42 ELSE 99 THEN ;",
["PushI32(42)", "pushi32(42)", "42"],
"DCE: PushI32(-1) is nonzero → then_body only → PushI32(42)",
),
(
": FOO FALSE IF 42 ELSE 99 THEN ;",
["PushI32(99)", "pushi32(99)", "99"],
"DCE: PushI32(0) is zero → else_body only → PushI32(99)",
),
(
": FOO EXIT 42 ;",
["Exit", "exit"],
"DCE: Everything after Exit is removed. PushI32(42) eliminated.",
),
# --- Combined Optimizations ---
(
": FOO DUP * ;",
["Dup, Mul", "dup mul", "dup, mul"],
"Inline DUP and *: [Dup, Mul]. No further optimizations apply.",
),
(
": FOO 2 3 + 4 * ;",
["PushI32(20)", "pushi32(20)", "20"],
"Fold 2+3=5, then fold 5*4=20. Single constant.",
),
(
": FOO 1 2 + 8 * ;",
["PushI32(24)", "pushi32(24)", "24"],
"Fold 1+2=3, then fold 3*8=24. Both operands of the multiply are constants, so it folds instead of being strength-reduced to a shift.",
),
(
": FOO 0 0 + ;",
["PushI32(0)", "pushi32(0)", "0"],
"Fold: PushI32(0), PushI32(0), Add → PushI32(0)",
),
(
": FOO SWAP DUP DROP SWAP ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole chain: Dup, Drop → removed, leaving [Swap, Swap]; Swap, Swap → removed. All gone.",
),
# --- Inlining ---
(
": SQUARE DUP * ;\n: FOO SQUARE ;",
["Dup, Mul", "dup mul", "dup, mul"],
"SQUARE body=[Dup,Mul] (2 ops ≤ 8). Inlined into FOO. Tail call: Dup is not Call, Mul is not Call → no tail call.",
),
# --- Tail Call ---
(
": BAR 1 ; : FOO 42 BAR ;",
["PushI32(42), TailCall(bar_id)", "pushi32(42) tailcall", "42 tailcall(bar)", "PushI32(42), PushI32(1)"],
"Tail call: with BAR left as a call, the final Call(bar) is in tail position → TailCall(bar). Caveat: BAR's body [PushI32(1)] is only 1 op, so if the inliner fires first the result is [PushI32(42), PushI32(1)] and no call remains.",
),
# --- Float ---
(
": FOO 1.0E0 2.0E0 F+ ;",
["PushF64(3.0)", "pushf64(3.0)", "3.0"],
"Float constant fold: PushF64(1.0), PushF64(2.0), FAdd → PushF64(3.0)",
),
(
": FOO -5.0E0 FABS ;",
["PushF64(5.0)", "pushf64(5.0)", "5.0"],
"Float unary fold: PushF64(-5.0), FAbs → PushF64(5.0)",
),
(
": FOO FNEGATE FNEGATE ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: FNegate, FNegate → removed (self-inverse)",
),
(
": FOO FSWAP FSWAP ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: FSwap, FSwap → removed (self-inverse)",
),
(
": FOO FDUP FDROP ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: FDup, FDrop → removed",
),
# --- Tricky ---
(
": FOO 3 5 < ;",
["PushI32(-1)", "pushi32(-1)", "-1", "true"],
"Constant fold: PushI32(3), PushI32(5), Lt → PushI32(-1) (3 < 5 is true, Forth true = -1)",
),
(
": FOO 5 3 < ;",
["PushI32(0)", "pushi32(0)", "0", "false"],
"Constant fold: PushI32(5), PushI32(3), Lt → PushI32(0) (5 < 3 is false)",
),
(
": FOO DUP DUP DROP DROP ;",
["(empty)", "empty", "nothing", "[]", ""],
"Peephole: [Dup, Dup, Drop, Drop] → the inner Dup, Drop pair cancels → [Dup, Drop] → cancels again → []. The pass runs to fixpoint, so everything is eliminated.",
),
]


def normalize(s: str) -> str:
    """Normalize answer for comparison: lowercase, drop brackets, treat commas as spaces."""
    s = s.strip().lower()
    # Remove parentheses and brackets for flexible matching
    for ch in "()[]":
        s = s.replace(ch, "")
    # Treat commas as separators so "Dup,Mul", "Dup, Mul" and "Dup Mul" all match
    s = s.replace(",", " ")
    # Collapse whitespace
    s = " ".join(s.split())
    return s


def check_answer(user_input: str, accepted: list[str]) -> bool:
    """Check if user's answer matches any accepted answer."""
    norm_input = normalize(user_input)
    for ans in accepted:
        if normalize(ans) == norm_input:
            return True
    return False


def run_quiz(exercises: list, shuffle: bool = True) -> None:
    """Run the interactive quiz."""
    items = list(exercises)
    if shuffle:
        random.shuffle(items)
    correct = 0
    total = 0
    skipped = 0
    print("=" * 60)
    print(" WAFER IR Flash Quiz")
    print(" Predict the optimized IR for each Forth definition.")
    print(" Type 'q' to quit, 's' to skip, 'h' for hint.")
    print("=" * 60)
    print()
    for i, (forth, accepted, explanation) in enumerate(items):
        total += 1
        print(f" [{i + 1}/{len(items)}]")
        print(f" {forth}")
        print()
        while True:
            try:
                user = input(" Your answer> ").strip()
            except (EOFError, KeyboardInterrupt):
                print("\n")
                # The current question was not answered, so exclude it from the total
                show_score(correct, total - 1, skipped)
                return
            if user.lower() == "q":
                show_score(correct, total - 1, skipped)
                return
            if user.lower() == "s":
                skipped += 1
                print(f" Skipped. Answer: {accepted[0]}")
                print(f" {explanation}")
                break
            if user.lower() == "h":
                # Give a hint: the text before the first colon usually names the pass
                hint_word = explanation.split(":")[0] if ":" in explanation else "Think about the optimizer passes"
                print(f" Hint: {hint_word}")
                continue
            if check_answer(user, accepted):
                correct += 1
                print(f" \033[32m✓ Correct!\033[0m {explanation}")
            else:
                print(f" \033[31m✗ Not quite.\033[0m Expected: {accepted[0]}")
                print(f" {explanation}")
            break
        print()
        print("-" * 60)
        print()
    show_score(correct, total, skipped)


def show_score(correct: int, total: int, skipped: int) -> None:
    """Display final score."""
    attempted = total - skipped
    if attempted == 0:
        print(" No questions attempted.")
        return
    pct = (correct / attempted) * 100
    print(f"\n Score: {correct}/{attempted} ({pct:.0f}%)")
    if skipped:
        print(f" Skipped: {skipped}")
    if pct == 100:
        print(" Perfect! You know the optimizer cold.")
    elif pct >= 80:
        print(" Strong! Review the ones you missed.")
    elif pct >= 60:
        print(" Getting there. Focus on peephole + fold patterns.")
    else:
        print(" Study tools/architecture.txt section 7, then retry.")
    print()


def main() -> None:
    """Entry point."""
    if "--all" in sys.argv:
        run_quiz(EXERCISES, shuffle=False)
    elif "--count" in sys.argv:
        print(f"{len(EXERCISES)} exercises available.")
    else:
        run_quiz(EXERCISES, shuffle=True)


if __name__ == "__main__":
    main()
+189
@@ -0,0 +1,189 @@
# WAFER Codebase Reading Order
Optimal sequence for learning the entire system. Each step builds on the previous.
---
## Phase 1: Mental Model Foundation
### 1. `crates/core/src/memory.rs` (148 lines)
**Read first.** Defines the physical memory map — every address, every region. You'll reference these constants everywhere else.
- Key insight: stacks grow DOWN, dictionary grows UP
- Memorize: DATA_STACK_TOP=0x1600, DICTIONARY_BASE=0x2D40
- System variables at offset 0: STATE, BASE, >IN, HERE, LATEST, SOURCE-ID, #TIB, HLD, LEAVE-FLAG
- Notice how regions are laid out to never overlap (verified by compile-time assertions)
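
The "stacks grow DOWN" convention can be sketched in a few lines of Python. This is only an illustration: it assumes 4-byte little-endian cells, uses the `DATA_STACK_TOP` value quoted above, and the `Memory` class with its method names is hypothetical rather than WAFER's API.

```python
import struct

DATA_STACK_TOP = 0x1600  # value quoted above; everything else here is illustrative
CELL = 4                 # assumed cell size in bytes


class Memory:
    """Toy linear memory with a downward-growing data stack."""

    def __init__(self, size: int = 0x10000) -> None:
        self.bytes = bytearray(size)
        self.dsp = DATA_STACK_TOP  # empty stack: pointer sits at the top

    def push(self, value: int) -> None:
        self.dsp -= CELL  # grow down first, then store
        struct.pack_into("<i", self.bytes, self.dsp, value)

    def pop(self) -> int:
        (value,) = struct.unpack_from("<i", self.bytes, self.dsp)
        self.dsp += CELL  # shrink back up
        return value

    def depth(self) -> int:
        return (DATA_STACK_TOP - self.dsp) // CELL


mem = Memory()
mem.push(5)
mem.push(3)
```
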
### 2. `crates/core/src/ir.rs` (259 lines)
**The central data structure.** Every Forth word compiles to `Vec<IrOp>`. This is the language the optimizer speaks and the codegen consumes.
- ~70 variants across 10 categories
- Pay attention to control-flow variants: `If`, `DoLoop`, `BeginUntil`, `BeginWhileRepeat`, `BeginDoubleWhileRepeat` — they contain nested `Vec<IrOp>` bodies (tree structure, not flat)
- `Call(WordId)` and `TailCall(WordId)` — how words reference each other
- Float ops are separate from integer ops (separate stack)
- `IrWord` struct: name + body + is_immediate
### 3. `crates/core/src/error.rs` (84 lines)
**Quick read.** 15 error variants. Note `Throw(i32)` for the Exception word set and `Abort(String)` for ABORT".
### 4. `crates/core/src/config.rs` (61 lines)
**Quick read.** 7 optimization flags in two tiers: OptConfig (IR-level) and CodegenOpts (codegen-level). Default = all enabled.
---
## Phase 2: Data Structures
### 5. `crates/core/src/dictionary.rs` (906 lines)
**How words live in memory.** The dictionary is a linked list stored in a `Vec<u8>` that simulates WASM linear memory.
- Entry format: link(4) + flags(1) + name(N) + padding + code_field(4)
- Flags byte: IMMEDIATE=0x80, HIDDEN=0x40, LENGTH_MASK=0x1F
- `create()` writes the entry, starts HIDDEN; `reveal()` removes HIDDEN flag
- `find()`: fast path via HashMap index, fallback via linked-list walk
- Wordlist support: `current_wid`, `search_order`, `find_in_wid()`
- `DictionaryState` for MARKER save/restore
- Read every test — they document exact behavior
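
To make the entry format concrete, here is a rough Python sketch that packs one entry and reads its name back. The flag values come from the list above; the little-endian layout, the choice to store the name length in the low bits of the flags byte, the 4-byte alignment padding, and the helper names `encode_entry` / `decode_name` are assumptions for illustration, not the actual `dictionary.rs` code.

```python
import struct

IMMEDIATE, HIDDEN, LENGTH_MASK = 0x80, 0x40, 0x1F  # flag values from the list above


def encode_entry(link: int, name: str, code_field: int, hidden: bool = True) -> bytes:
    """Lay out link(4) + flags(1) + name(N) + padding + code_field(4)."""
    raw = name.encode("ascii")
    assert len(raw) <= LENGTH_MASK, "name length must fit in the length bits"
    flags = len(raw) | (HIDDEN if hidden else 0)
    head = struct.pack("<I", link) + bytes([flags]) + raw
    head += b"\x00" * (-len(head) % 4)  # assumed: pad so code_field is 4-byte aligned
    return head + struct.pack("<I", code_field)


def decode_name(entry: bytes) -> tuple[str, bool, bool]:
    """Return (name, is_immediate, is_hidden) from an encoded entry."""
    flags = entry[4]
    length = flags & LENGTH_MASK
    name = entry[5:5 + length].decode("ascii")
    return name, bool(flags & IMMEDIATE), bool(flags & HIDDEN)


entry = encode_entry(link=0, name="DUP", code_field=7)
```
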
---
## Phase 3: The Pipeline
### 6. `crates/core/src/optimizer.rs` (1013 lines)
**IR transformations.** Read the `optimize()` function first to see the pass ordering, then each pass.
- `peephole()`: pattern-match adjacent ops. ~15 patterns. Runs to fixpoint. Study each match arm.
- `constant_fold()`: evaluate PushI32+PushI32+BinaryOp at compile time. Also unary and float.
- `strength_reduce()`: multiply by power-of-2 → shift. 0 compare → ZeroEq/ZeroLt.
- `dce()`: eliminate dead branches (constant condition), truncate after Exit.
- `inline()`: replace Call(id) with body if ≤8 ops, non-recursive, no Exit, no ForthLocals. `detailcall()` converts TailCall back to Call.
- `tail_call_detect()`: last Call → TailCall if return stack balanced. Recurses into If branches.
- Key: `apply_to_bodies()` — every pass recurses into control-flow nested bodies.
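
As a warm-up before reading `peephole()`, the "runs to fixpoint" idea can be sketched in Python. The five rules below are taken from the quiz exercises in `tools/ir_quiz.py`; representing ops as plain strings and the function names `peephole_once` / `peephole` are simplifications, not the Rust implementation.

```python
# Two-op rewrite rules taken from the quiz exercises; the real pass has more.
RULES = {
    ("Dup", "Drop"): [],
    ("Swap", "Swap"): [],
    ("Swap", "Drop"): ["Nip"],
    ("Drop", "Drop"): ["TwoDrop"],
    ("Over", "Over"): ["TwoDup"],
}


def peephole_once(ops: list[str]) -> list[str]:
    """One left-to-right scan, rewriting each matching adjacent pair."""
    out, i = [], 0
    while i < len(ops):
        pair = tuple(ops[i:i + 2])
        if pair in RULES:
            out.extend(RULES[pair])
            i += 2
        else:
            out.append(ops[i])
            i += 1
    return out


def peephole(ops: list[str]) -> list[str]:
    """Repeat the scan until nothing changes (fixpoint)."""
    while True:
        new = peephole_once(ops)
        if new == ops:
            return new
        ops = new
```

A single scan of `[Dup, Dup, Drop, Drop]` only removes the inner pair; the second scan removes the pair that the first one exposed, which is why the loop is needed.
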
### 7. `crates/core/src/codegen.rs` (4205 lines) — **The Big One**
**IR → WASM translation.** Read in order:
1. **Constants** (lines 1-80): import indices, type indices, DSP/RSP/FSP globals, memory alignment
2. **Helper functions** (lines 80-210): `dsp_dec/inc`, `push_via_local`, `pop`, `peek`, `dsp_writeback/reload`, `rpush_via_local`, `rpop`
3. **Float helpers** (lines 225-330): `fsp_dec/inc`, `fpush_via_local`, `fpop`, `fpeek`, `emit_float_binary/unary/cmp`
4. **`emit_op()`** (line 344+): the giant match — each IrOp variant → WASM instructions. This is the heart.
5. **`compile_word()`**: builds the WASM module structure (imports, types, functions, element section)
6. **`compile_consolidated_module()`**: multi-function module for CONSOLIDATE/export
7. **Stack-to-local promotion**: analysis pass that replaces memory stack operations with WASM locals
Key patterns to understand:
- DSP cached in local 0: read from global at function entry, write back before calls and at exit
- Scratch locals at SCRATCH_BASE(1): used as temporaries for stack manipulation
- `EmitCtx`: carries f64 locals, Forth local base, loop local base, self_word_id for recursion
- DO/LOOP: index+limit in WASM locals when possible (fast path), fallback to return stack
---
## Phase 4: The Runtime Abstraction
### 8. `crates/core/src/runtime.rs` (152 lines)
**NEW: Read this before outer.rs.** Defines two traits:
- `Runtime` — abstraction over WASM execution backend (memory, globals, table, module instantiation, host function registration)
- `HostAccess` — memory/global ops available to host function callbacks
- `HostFn = Box<dyn Fn(&mut dyn HostAccess) -> Result<()>>` — runtime-agnostic host function type
- Key insight: ForthVM is now `ForthVM<R: Runtime>`, completely decoupled from wasmtime
### 8b. `crates/core/src/runtime_native.rs` (328 lines)
**NativeRuntime**: wasmtime implementation of Runtime trait.
- `CallerHostAccess` wraps wasmtime `Caller` to implement `HostAccess`
- `NativeRuntime` owns Engine, Store, Memory, Table, Globals
- `register_host_func`: creates a wasmtime `Func` that bridges `HostFn` → wasmtime callback
- Study how `instantiate_and_install` provides the 6 imports
### 9. `crates/core/src/outer.rs` — ForthVM struct (lines 1-240)
**Read the struct definition carefully.** ~35 fields. Group them mentally:
- Runtime: `rt: R` (generic over Runtime trait — no more direct wasmtime fields)
- Compilation state: state, compiling_name, compiling_ir, control_stack, compiling_word_id, compiling_locals
- Output: output (Arc<Mutex<String>>)
- Dictionary bridge: dictionary, user_here, here_cell, base_cell
- Word metadata: ir_bodies, host_word_names, word_pfa_map, does_definitions
- Shared state for host functions: pending_define, pending_actions, pending_does_patch, throw_code, word_lookup
- Configuration: config, batch_mode, deferred_ir
- Export support: toplevel_ir, recording_toplevel
- Advanced: marker_states, conditional_skip_depth, next_block_label, substitutions, search_order, next_wid
### 10. `crates/core/src/outer.rs` — new() and primitive registration
**How the VM boots.** Read:
- `new_with_config()`: creates `R::new()` runtime, then calls `register_primitives()` and loads boot.fth
- `register_primitive()`: creates dictionary entry → optimizes IR → compiles to WASM → `rt.instantiate_and_install()`
- `register_host_primitive()`: creates dictionary entry → `rt.register_host_func()` with HostFn closure
- `register_primitives()`: ~130 words registered in batch_mode, then `compile_batch()`
- Each host function: study 5-10 representative ones to understand the pattern
### 11. `crates/core/src/outer.rs` — Outer interpreter loop
**The main loop.** Read:
- `evaluate()`: sets up input buffer, calls `interpret_token()` in a loop
- `interpret_token()`: conditional compilation, `:` handling, `]` handling, dispatch to compile/interpret mode
- `interpret_token_immediate()`: string literals, dictionary lookup, execute found word, parse number
- `compile_token()`: POSTPONE, string literals, control-flow words (IF/ELSE/THEN/DO/LOOP/BEGIN/WHILE/REPEAT/AGAIN/UNTIL/CASE/OF/ENDOF/ENDCASE), dictionary lookup, compile Call(id), parse number → PushI32
- `finish_colon_def()`: optimize → codegen → install
### 12. `crates/core/src/outer.rs` — Control flow compilation
**Most complex part.** 13 `ControlEntry` variants. Understand:
- `ControlEntry::If { then_body }` → pushed when IF seen, then_body accumulates until ELSE or THEN
- `ControlEntry::Do { body }` → pushed by DO, body accumulates until LOOP/+LOOP
- `ControlEntry::Begin { body }` → pushed by BEGIN, resolved by UNTIL/AGAIN/WHILE
- `ControlEntry::BeginWhile { test, body }` → WHILE splits Begin into test + body
- `ControlEntry::Case/Of` → CASE/OF/ENDOF/ENDCASE pattern
- `ControlEntry::QDo` → ?DO (conditional entry)
- `ControlEntry::Ahead` → AHEAD (unconditional forward branch)
- CS-PICK and CS-ROLL: advanced control-flow manipulation for tools word set
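
The control-stack mechanism is easier to follow with a toy version. The Python sketch below handles only IF/ELSE/THEN and uses dictionaries as a hypothetical stand-in for `ControlEntry::If`; tokens are kept as raw strings instead of real `IrOp` values, so treat it as a model of the idea, not of `outer.rs`.

```python
def compile_tokens(tokens: list[str]) -> list:
    """Build nested IR from a flat token stream using a control stack."""
    control: list[dict] = []  # open IF entries (stand-in for ControlEntry)
    current: list = []        # body currently being appended to
    for tok in tokens:
        if tok == "IF":
            control.append({"outer": current, "then": None})
            current = []                      # start collecting then_body
        elif tok == "ELSE":
            control[-1]["then"] = current
            current = []                      # start collecting else_body
        elif tok == "THEN":
            entry = control.pop()
            if entry["then"] is None:         # no ELSE was seen
                node = ("If", current, None)
            else:
                node = ("If", entry["then"], current)
            current = entry["outer"]          # resume the enclosing body
            current.append(node)
        else:
            current.append(tok)
    assert not control, "unterminated IF"
    return current
```
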
---
## Phase 5: Self-Hosting
### 13. `crates/core/boot.fth` (307 lines)
**Forth replaces Rust.** 7 phases of definitions that replace host functions with compiled Forth.
- Phase 1: Stack/memory (DEPTH, PICK, 2OVER, FILL, MOVE, /STRING, -TRAILING)
- Phase 2: Double-cell arithmetic (D+, DNEGATE, D-, DABS, D0=, D0<, D=, D<, DU<)
- Phase 3: Mixed arithmetic (SM/REM, FM/MOD, */, */MOD) — built on M* and UM/MOD host primitives
- Phase 4: HERE, ALLOT, comma, C-comma, ALIGN — magic numbers for sysvar offsets
- Phase 5: I/O and pictured numeric output (TYPE, SPACES, <# HOLD # #S #> . U. .R U.R D. D.R)
- Phase 6: DEFER support (DEFER!, DEFER@)
- Phase 7: String operations, SOURCE, FALIGNED, etc.
- Key insight: these words are written in Forth rather than Rust for two reasons: self-hosting is a project goal, and compiled Forth with direct calls is faster than host function dispatch
---
## Phase 6: Production Features
### 14. `crates/core/src/consolidate.rs` (169 lines)
**Quick read.** Mostly tests. Real logic is in `codegen::compile_consolidated_module()` and `outer::ForthVM::consolidate()`. Understand the concept: merge all JIT modules into one, replacing call_indirect with direct call.
### 15. `crates/core/src/export.rs` (409 lines)
**wafer build pipeline.** Entry point resolution (--entry > MAIN > top-level), IR collection, memory snapshot, metadata embedding in custom section.
### 16. `crates/core/src/runner.rs` (402 lines)
**Standalone execution.** Creates the 6 imports from scratch, registers host function stubs for known words (., TYPE, SPACES, .S, M*, UM*, UM/MOD, DEPTH). Shows the minimal set needed to run exported modules.
### 17. `crates/cli/src/main.rs` (354 lines)
**CLI ties it together.** Three modes: REPL (rustyline), file evaluation, subcommands (build, run). Native executable trick: append AOT payload + "WAFEREXE" trailer to binary.
### 18. `crates/web/src/lib.rs` (56 lines)
**Browser entry point.** `WaferRepl` struct with `#[wasm_bindgen]`:
- `new()` → `ForthVM::<WebRuntime>::new()`
- `evaluate(input)` → returns output string
- `data_stack()`, `is_compiling()`, `reset()`
### 19. `crates/web/src/runtime_web.rs` (542 lines)
**WebRuntime**: browser implementation of Runtime trait.
- Uses `js_sys::WebAssembly` for module instantiation
- `WebHostAccess`: implements HostAccess via `js_sys` typed arrays
- Memory access through `Int32Array`/`Uint8Array` views on `WebAssembly.Memory.buffer`
- Closures kept alive via `_closures: Vec<JsValue>` to prevent GC
### 20. `crates/web/www/` (727 lines)
**Frontend**: app.js (terminal emulation, stack display), index.html, style.css.
---
## Phase 7: Testing
### 21. Unit tests (embedded in each source file)
Re-read each file's `#[cfg(test)] mod tests`. They document edge cases and expected behavior.
### 22. `crates/core/tests/compliance.rs`
Forth 2012 compliance infrastructure: boot_with_prerequisites, run_suite, 11 word set tests.
### 23. `crates/core/tests/comparison.rs`
Cross-engine benchmarks vs gforth. Performance validation.
+464
@@ -0,0 +1,464 @@
# WAFER Trace-the-Compilation Exercises
For each exercise, manually trace the Forth code through the full pipeline:
1. **Outer interpreter** — tokenization, dictionary lookup, compile/interpret dispatch
2. **IR generation** — what Vec<IrOp> is produced
3. **Optimization** — which passes fire, what changes
4. **Codegen** — WASM instructions emitted (conceptual)
5. **Runtime** — how it executes
Answers follow each exercise in a collapsed "Answer" block; expand it only after you have traced the code yourself.
---
## Exercise 1: Simple Arithmetic
```forth
: SQUARE DUP * ;
```
<details>
<summary>Answer</summary>
1. `:` → enter compile mode, next token "SQUARE" = word name, dictionary.create("SQUARE")
2. `DUP` → find in dictionary → IR primitive (WordId N) → append `Call(dup_id)`
3. `*` → find → IR primitive → append `Call(mul_id)`
4. `;` → raw IR: `[Call(dup_id), Call(mul_id)]`
5. **Optimize:**
- Inline: DUP body=[Dup] (1 op ≤ 8), * body=[Mul] (1 op ≤ 8) → `[Dup, Mul]`
- Peephole: no patterns match Dup,Mul
- Constant fold: nothing to fold
- Tail call: Mul is not a Call → skip
- **Final IR: `[Dup, Mul]`**
6. **Codegen:**
- Dup: `local.get $dsp; i32.load; local.set $tmp; dsp_dec; local.get $dsp; local.get $tmp; i32.store`
- Mul: `pop; pop; i32.mul; push_via_local`
7. **Runtime:** WASM module instantiated, function registered at table[word_id]
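
The inlining step in this answer can be modelled with a short Python sketch. It assumes a hypothetical `BODIES` table mapping word names to IR bodies, represents calls as `("Call", name)` tuples, and checks only the size, recursion, and Exit conditions (the ForthLocals condition is left out).

```python
INLINE_LIMIT = 8  # size threshold quoted in the answer above

# Hypothetical word table: name -> IR body. Plain ops are strings,
# calls are ("Call", name) tuples. Host words such as "." have no IR body.
BODIES = {
    "DUP": ["Dup"],
    "*": ["Mul"],
}


def inline(body, self_name):
    """Replace each small, non-recursive, Exit-free callee with its body."""
    out = []
    for op in body:
        if isinstance(op, tuple) and op[0] == "Call":
            target = BODIES.get(op[1])
            if (target is not None and op[1] != self_name
                    and len(target) <= INLINE_LIMIT and "Exit" not in target):
                out.extend(target)
                continue
        out.append(op)
    return out


# Compile SQUARE, then store its optimized body so later words can inline it.
BODIES["SQUARE"] = inline([("Call", "DUP"), ("Call", "*")], "SQUARE")
```
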
</details>
---
## Exercise 2: Constant Folding
```forth
: TEN 5 5 + ;
```
<details>
<summary>Answer</summary>
1. `:` → compile mode, name="TEN"
2. `5` → not in dictionary → parse as number → append `PushI32(5)`
3. `5` → append `PushI32(5)`
4. `+` → find → IR primitive → append `Call(add_id)`
5. `;` → raw IR: `[PushI32(5), PushI32(5), Call(add_id)]`
6. **Optimize:**
- Inline: + body=[Add] → `[PushI32(5), PushI32(5), Add]`
- Constant fold: PushI32(5), PushI32(5), Add → `PushI32(10)`
- **Final IR: `[PushI32(10)]`**
7. **Codegen:** Just `push_const(f, 10)` → `dsp_dec; local.get $dsp; i32.const 10; i32.store`
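
A minimal Python sketch of the folding step, for readers who want to experiment. Ops are strings, literals are `("PushI32", n)` tuples, and results are wrapped to signed 32 bits on the assumption that folding must match i32 arithmetic at runtime; the table covers only three binary ops and is not the real `constant_fold()`.

```python
def wrap_i32(n: int) -> int:
    """Wrap a Python int to signed 32-bit (assumed to mirror i32 arithmetic)."""
    return (n + 2**31) % 2**32 - 2**31


BINARY = {
    "Add": lambda a, b: a + b,
    "Mul": lambda a, b: a * b,
    "Lt": lambda a, b: -1 if a < b else 0,  # Forth true is -1
}


def is_literal(op) -> bool:
    return isinstance(op, tuple) and op[0] == "PushI32"


def constant_fold(ops):
    """Fold PushI32, PushI32, BinaryOp into a single PushI32."""
    out = []
    for op in ops:
        if op in BINARY and len(out) >= 2 and is_literal(out[-1]) and is_literal(out[-2]):
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("PushI32", wrap_i32(BINARY[op](a, b))))
        else:
            out.append(op)
    return out
```
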
</details>
---
## Exercise 3: Peephole Elimination
```forth
: NOOP DUP DROP ;
```
<details>
<summary>Answer</summary>
1. Raw IR after inlining: `[Dup, Drop]`
2. **Optimize:**
- Peephole: Dup, Drop → removed (both eliminated)
- **Final IR: `[]` (empty)**
3. **Codegen:** Empty function body — just DSP writeback at entry/exit
</details>
---
## Exercise 4: Strength Reduction
```forth
: TIMES8 8 * ;
```
<details>
<summary>Answer</summary>
1. Raw IR after inlining: `[PushI32(8), Mul]`
2. **Optimize:**
- Strength reduce: PushI32(8) is 2^3, so → `[PushI32(3), Lshift]`
- 8 * x becomes x << 3
- **Final IR: `[PushI32(3), Lshift]`**
3. **Codegen:** push_const(3), then pop two, i32.shl, push result
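
The rewrite above can be sketched in Python as follows. The power-of-two test and the `0 =` / `0 <` cases follow the quiz exercises; the tuple encoding of literals and the exclusion of multiplication by 1 (which the quiz treats as a peephole identity) are choices made for this sketch.

```python
def strength_reduce(ops):
    """Rewrite 2^k multiplies as shifts and comparisons with 0 as ZeroEq/ZeroLt."""
    out = []
    for op in ops:
        prev = out[-1] if out else None
        if op == "Mul" and isinstance(prev, tuple) and prev[0] == "PushI32":
            n = prev[1]
            if n > 1 and (n & (n - 1)) == 0:  # power of two greater than 1
                out[-1] = ("PushI32", n.bit_length() - 1)  # shift amount k
                out.append("Lshift")
                continue
        if op == "Eq" and prev == ("PushI32", 0):
            out[-1] = "ZeroEq"
            continue
        if op == "Lt" and prev == ("PushI32", 0):
            out[-1] = "ZeroLt"
            continue
        out.append(op)
    return out
```
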
</details>
---
## Exercise 5: Tail Call Detection
```forth
: FOO 1 + BAR ;
```
(Assume BAR is a previously defined word)
<details>
<summary>Answer</summary>
1. Raw IR: `[PushI32(1), Call(add_id), Call(bar_id)]`
2. **Optimize:**
- Inline + (1 op): `[PushI32(1), Add, Call(bar_id)]`
- Tail call: last op is Call(bar_id), return stack balanced (no >R or R>) → `TailCall(bar_id)`
- **Final IR: `[PushI32(1), Add, TailCall(bar_id)]`**
3. **Codegen:** TailCall emits `dsp_writeback; call_indirect bar_id; return`
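
A simplified Python model of the detection rule, including the recursion into `If` branches. The op names `ToR` and `RFrom` (for `>R` and `R>`) are hypothetical, and the balance check just counts them, which is cruder than a real analysis would be.

```python
def tail_call_detect(ops, pending=0):
    """Turn a trailing Call into TailCall when the return stack is balanced.

    pending = number of >R seen before this body that are still unmatched.
    """
    if not ops:
        return ops
    depth = pending + ops.count("ToR") - ops.count("RFrom")
    last = ops[-1]
    if isinstance(last, tuple) and last[0] == "If":
        _, then_body, else_body = last
        new_then = tail_call_detect(then_body, depth)
        new_else = None if else_body is None else tail_call_detect(else_body, depth)
        return ops[:-1] + [("If", new_then, new_else)]
    if isinstance(last, tuple) and last[0] == "Call" and depth == 0:
        return ops[:-1] + [("TailCall", last[1])]
    return ops
```
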
</details>
---
## Exercise 6: Control Flow — IF/THEN
```forth
: ABS DUP 0< IF NEGATE THEN ;
```
<details>
<summary>Answer</summary>
1. `DUP` → Call(dup_id), `0<` → Call(zerolt_id)
2. `IF` → push ControlEntry::If { then_body: [] }, start collecting
3. `NEGATE` → Call(negate_id) appended to then_body
4. `THEN` → pop ControlEntry::If, emit `If { then_body: [Call(negate_id)], else_body: None }`
5. Raw IR: `[Call(dup_id), Call(zerolt_id), If { then: [Call(negate_id)], else: None }]`
6. **Optimize:**
- Inline all (each is 1 op): `[Dup, ZeroLt, If { then: [Negate], else: None }]`
- Note: optimizer recurses into If bodies via apply_to_bodies
- **Final IR: `[Dup, ZeroLt, If { then: [Negate], else: None }]`**
7. **Codegen:** pop flag → `if (block) ... end` WASM structure
</details>
---
## Exercise 7: DO LOOP
```forth
: STARS 0 DO 42 EMIT LOOP ;
```
<details>
<summary>Answer</summary>
1. `0` → PushI32(0)
2. `DO` → push ControlEntry::Do { body: [] }
3. `42` → PushI32(42) into body
4. `EMIT` → Call(emit_id) into body
5. `LOOP` → pop Do, emit `DoLoop { body: [PushI32(42), Call(emit_id)], is_plus_loop: false }`
6. Note: the 0 and the limit (already on stack from caller) are consumed by DoLoop
7. **Optimize:**
- Inline EMIT (1 op): `DoLoop { body: [PushI32(42), Emit], is_plus_loop: false }`
- **Final IR:** `[PushI32(0), DoLoop { body: [PushI32(42), Emit], is_plus_loop: false }]`
8. **Codegen:** Loop index+limit in WASM locals. WASM `loop { body; index++; br_if index<limit }`
</details>
---
## Exercise 8: BEGIN UNTIL
```forth
: COUNTDOWN BEGIN DUP . 1 - DUP 0= UNTIL DROP ;
```
<details>
<summary>Answer</summary>
1. `BEGIN` → push ControlEntry::Begin { body: [] }
2. `DUP .` → Call(dup_id), Call(dot_id) into body
3. `1 -` → PushI32(1), Call(sub_id) into body
4. `DUP 0=` → Call(dup_id), Call(zeroeq_id) into body
5. `UNTIL` → pop Begin, emit `BeginUntil { body: [Call(dup), Call(dot), PushI32(1), Call(sub), Call(dup), Call(zeroeq)] }`
6. **Optimize:** Inline small primitives. `1 -` stays as `PushI32(1), Sub` (no further fold since operand unknown). `.` is a host function → NOT inlined.
7. `DROP` after loop.
</details>
---
## Exercise 9: Dead Code Elimination
```forth
: ALWAYS-TRUE TRUE IF 42 ELSE 99 THEN ;
```
<details>
<summary>Answer</summary>
1. Raw IR after inlining TRUE (body=[PushI32(-1)]):
`[PushI32(-1), If { then: [PushI32(42)], else: Some([PushI32(99)]) }]`
2. **DCE:** PushI32(-1) is nonzero → emit then_body only
`[PushI32(42)]`
3. Entire IF/ELSE/THEN eliminated. Just pushes 42.
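
Both DCE rules (constant condition, and truncation after Exit) fit in a short Python sketch. `If` nodes are modelled as `("If", then_body, else_body)` tuples with `None` for a missing else; this is an illustration of the rule, not the actual `dce()` code.

```python
def dce(ops):
    """Drop branches with a constant condition and anything after Exit."""
    out = []
    for op in ops:
        if isinstance(op, tuple) and op[0] == "If":
            _, then_body, else_body = op
            prev = out[-1] if out else None
            if isinstance(prev, tuple) and prev[0] == "PushI32":
                out.pop()  # the constant flag is consumed at compile time
                chosen = then_body if prev[1] != 0 else (else_body or [])
                out.extend(dce(chosen))
            else:
                new_else = None if else_body is None else dce(else_body)
                out.append(("If", dce(then_body), new_else))
        else:
            out.append(op)
        if out and out[-1] == "Exit":
            break  # nothing after Exit can run
    return out
```
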
</details>
---
## Exercise 10: Swap Peephole Patterns
```forth
: TEST SWAP SWAP DROP DROP ;
```
<details>
<summary>Answer</summary>
1. After inlining: `[Swap, Swap, Drop, Drop]`
2. **Peephole pass 1:**
- Swap, Swap → removed → `[Drop, Drop]`
- Drop, Drop → TwoDrop → `[TwoDrop]`
3. **Final IR: `[TwoDrop]`**
</details>
---
## Exercise 11: Nested Control Flow
```forth
: CLASSIFY DUP 0< IF DROP -1 ELSE 0> IF 1 ELSE 0 THEN THEN ;
```
<details>
<summary>Answer</summary>
1. IR structure (after inlining):
```
[Dup, ZeroLt, If {
    then: [Drop, PushI32(-1)],
    else: Some([Gt(implicit 0>), If {
        then: [PushI32(1)],
        else: Some([PushI32(0)])
    }])
}]
```
2. Optimizer recurses into both If bodies. No constant conditions → no DCE.
3. Codegen: nested WASM `if/else/end` blocks.
</details>
---
## Exercise 12: DOES> Defining Word
```forth
: CONSTANT CREATE , DOES> @ ;
5 CONSTANT FIVE
FIVE .
```
<details>
<summary>Answer</summary>
1. `: CONSTANT` enters compile mode
2. `CREATE` — flagged as saw_create_in_def=true
3. `,` — compiled normally
4. `DOES>` — splits definition:
- create_ir = everything before DOES> (the `,` call)
- does_action = everything after DOES> (the `@` call) → compiled as separate word
- Stores DoesDefinition { create_ir, does_action_id, has_create: true }
5. `5 CONSTANT FIVE`:
- CONSTANT executes its defining behavior
- CREATE makes dictionary entry "FIVE"
- `,` stores 5 at FIVE's parameter field
- DOES> patches FIVE to execute the does_action (which does `@`)
6. `FIVE .`:
- FIVE executes: pushes its PFA, then calls does_action (`@`)
- `@` fetches the 5 stored there
- `.` prints "5 "
</details>
---
## Exercise 13: Consolidation
```forth
: A 1 ;
: B 2 ;
: C A B + ;
CONSOLIDATE
```
<details>
<summary>Answer</summary>
1. Before CONSOLIDATE: A, B, C are separate WASM modules. C calls A and B via `call_indirect` through the function table.
2. CONSOLIDATE:
- Collects all IR bodies: A=[PushI32(1)], B=[PushI32(2)], C=[Call(a_id), Call(b_id), Add(inlined)]
- Builds local_fn_map: A→1, B→2, C→3 (within consolidated module)
- `compile_consolidated_module()`: all three become functions in one WASM module
- C's Call(a_id) → direct `call 1` (not call_indirect)
- Replaces all table entries with new functions
3. Result: C calling A and B is now a direct WASM `call` — much faster than table dispatch.
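
The renumbering step can be pictured with this Python sketch. It builds a `local_fn_map` with the same 1-based numbering as the answer and rewrites each `Call` into a hypothetical `("DirectCall", index)` op; the op name and the dictionary-based word table are illustrative assumptions only.

```python
def consolidate(ir_bodies: dict) -> tuple[dict, dict]:
    """Assign module-local indices and turn table calls into direct calls."""
    local_fn_map = {wid: idx for idx, wid in enumerate(ir_bodies, start=1)}
    rewritten = {}
    for wid, body in ir_bodies.items():
        rewritten[wid] = [
            ("DirectCall", local_fn_map[op[1]])
            if isinstance(op, tuple) and op[0] == "Call" and op[1] in local_fn_map
            else op
            for op in body
        ]
    return local_fn_map, rewritten


bodies = {
    "A": [("PushI32", 1)],
    "B": [("PushI32", 2)],
    "C": [("Call", "A"), ("Call", "B"), "Add"],
}
fn_map, consolidated = consolidate(bodies)
```
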
</details>
---
## Exercise 14: Host Function Execution
```forth
5 3 M*
```
<details>
<summary>Answer</summary>
1. `5` → push to data stack (dsp -= 4, mem[dsp] = 5)
2. `3` → push to data stack (dsp -= 4, mem[dsp] = 3)
3. `M*` → host function (Rust closure):
- Read sp = dsp global value
- Read n2 = mem[sp] = 3 (as i64)
- Read n1 = mem[sp+4] = 5 (as i64)
- result = 5i64 * 3i64 = 15i64
- lo = 15 as i32 = 15
- hi = (15 >> 32) as i32 = 0
- Write mem[sp+4] = 15 (lo), mem[sp] = 0 (hi)
- Stack unchanged (still 2 cells, now containing double-cell 15)
4. Note: M* is a host function because it needs a 64-bit product, while each data-stack cell is only 32 bits (i32)
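
The memory traffic in step 3 can be reproduced with Python's `struct` module. The sketch assumes little-endian 4-byte cells and an arbitrary stack pointer of 32; `m_star` is a hypothetical stand-in for the Rust closure, not a copy of it.

```python
import struct


def m_star(mem: bytearray, sp: int) -> None:
    """( n1 n2 -- d ) signed 32x32 -> 64-bit product, written back in place."""
    (n2,) = struct.unpack_from("<i", mem, sp)      # top of stack
    (n1,) = struct.unpack_from("<i", mem, sp + 4)  # second cell
    product = n1 * n2                              # Python ints do not overflow
    lo = product & 0xFFFFFFFF
    hi = (product >> 32) & 0xFFFFFFFF
    struct.pack_into("<I", mem, sp + 4, lo)        # low cell goes deeper
    struct.pack_into("<I", mem, sp, hi)            # high cell ends up on top


mem = bytearray(64)
sp = 32
struct.pack_into("<ii", mem, sp, 3, 5)  # mem[sp] = 3 (top), mem[sp+4] = 5
m_star(mem, sp)
```
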
</details>
---
## Exercise 15: Float Operations
```forth
: HYPOTENUSE FDUP F* FSWAP FDUP F* F+ FSQRT ;
```
<details>
<summary>Answer</summary>
1. After inlining: `[FDup, FMul, FSwap, FDup, FMul, FAdd, FSqrt]`
2. **Peephole:** No matching patterns (FDup+FMul not a known pair)
3. **Codegen:** All float ops use the float stack (FSP global):
- FDup: `fpeek(f)` then `fpush_via_local`
- FMul: `emit_float_binary` with `f64.mul`
- FSqrt: `emit_float_unary` with `f64.sqrt`
4. Float stack lives at 0x2540-0x2D40 in linear memory
</details>
---
## Exercise 16: BEGIN WHILE REPEAT
```forth
: COUNTDOWN BEGIN DUP WHILE DUP . 1 - REPEAT DROP ;
```
<details>
<summary>Answer</summary>
1. `BEGIN` → ControlEntry::Begin { body: [] }
2. `DUP` → Call(dup_id) into body
3. `WHILE` → pop Begin, create ControlEntry::BeginWhile { test: [Call(dup_id)], body: [] }
4. `DUP . 1 -` → into body
5. `REPEAT` → pop BeginWhile, emit `BeginWhileRepeat { test: [Dup], body: [Dup, Call(dot_id), PushI32(1), Sub] }`
6. Semantics: evaluate test; if false exit loop; execute body; jump to BEGIN
</details>
---
## Exercise 17: Batch Mode Compilation
```forth
( During ForthVM::new() )
```
<details>
<summary>Answer</summary>
1. `register_primitives()` sets `batch_mode = true`
2. Each `register_primitive("DUP", ...)`:
- Creates dictionary entry (dictionary.create + reveal)
- Stores IR body in ir_bodies
- Pushes `(word_id, ir_body)` to `deferred_ir` (no WASM compilation yet)
3. After all ~40 IR primitives registered:
- `compile_batch()` compiles ALL deferred IR into a single WASM module
- One `rt.instantiate_and_install()` call — single module with ~40 functions
- Each function registered in the table
4. Why batch? Amortizes runtime compilation overhead. One module instead of 40.
5. Host functions bypass batch_mode — registered via `rt.register_host_func()` with HostFn closures.
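
The deferral pattern itself is simple, as this Python sketch shows. `BatchVM` is a hypothetical toy: a "module" is just a list of word ids, and nothing is actually compiled.

```python
class BatchVM:
    """Toy model of deferred compilation: a module is a list of word ids."""

    def __init__(self, batch_mode: bool) -> None:
        self.batch_mode = batch_mode
        self.deferred_ir = []  # (word_id, ir_body) pairs awaiting compilation
        self.modules = []      # one entry per instantiated module

    def register_primitive(self, word_id: int, ir_body: list) -> None:
        if self.batch_mode:
            self.deferred_ir.append((word_id, ir_body))  # no compilation yet
        else:
            self.modules.append([word_id])               # one module per word

    def compile_batch(self) -> None:
        self.modules.append([wid for wid, _ in self.deferred_ir])
        self.deferred_ir.clear()
        self.batch_mode = False


batched = BatchVM(batch_mode=True)
unbatched = BatchVM(batch_mode=False)
for wid in range(40):
    batched.register_primitive(wid, ["Dup"])
    unbatched.register_primitive(wid, ["Dup"])
batched.compile_batch()
```
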
</details>
---
## Exercise 18: wafer build Pipeline
```forth
( file: hello.fth )
: MAIN ." Hello, World!" CR ;
```
```bash
wafer build hello.fth -o hello.wasm
```
<details>
<summary>Answer</summary>
1. `cmd_build()`: create ForthVM, set `recording_toplevel=true`, evaluate source
2. `evaluate()`: compiles MAIN normally (IR → optimize → codegen)
3. Recording is on, but MAIN is a definition rather than top-level execution, so `toplevel_ir` stays empty
4. `export_module()`:
- Collect IR words: MAIN + all boot.fth definitions
- Entry point: no --entry flag, look for MAIN → found!
- Build `local_fn_map`: all words get module-internal indices
- `compile_exportable_module()`: single WASM module with all functions
- Data section: snapshot of linear memory (dictionary, variables, etc.)
- Metadata in "wafer" custom section: version, entry index, host functions, memory size, stack pointers
5. Output: hello.wasm file
</details>
---
## Exercise 19: Stack-to-Local Promotion
```forth
: ADD3 + + ;
```
<details>
<summary>Answer</summary>
1. After inlining: `[Add, Add]`
2. **Stack-to-local promotion** (codegen pass, not optimizer):
- Analyzes stack flow: first Add pops 2, pushes 1; second Add pops 2 (including that 1), pushes 1
- If stack depth is statically known at each point → can use WASM locals instead of memory stack
- Result: operands stay in WASM locals/operand stack, no memory reads/writes
- Much faster: avoids load/store through linear memory
3. Promotion only works for "straight-line" code (no calls that might modify the stack unpredictably)
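
The "statically known stack depth" condition can be sketched as a small analysis in Python. The `EFFECTS` table of (pops, pushes) pairs is hypothetical and covers only a few ops; any op missing from it (such as a call) makes the analysis give up, mirroring the straight-line restriction in step 3.

```python
# (pops, pushes) per op; a hypothetical table covering only a few ops.
EFFECTS = {"Add": (2, 1), "Mul": (2, 1), "Dup": (1, 2), "Drop": (1, 0)}


def stack_profile(ops):
    """Return (inputs_needed, outputs_left) for straight-line code, else None."""
    depth = 0   # current depth relative to function entry
    lowest = 0  # deepest read below the entry depth
    for op in ops:
        if op not in EFFECTS:
            return None  # unknown effect (e.g. a call): do not promote
        pops, pushes = EFFECTS[op]
        lowest = min(lowest, depth - pops)
        depth += pushes - pops
    return -lowest, depth - lowest
```

For `ADD3` the result is `(3, 1)`: three inputs could be loaded into locals once, and one result stored back at the end.
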
</details>
---
## Exercise 20: MARKER and State Restore
```forth
MARKER CLEAN
: FOO 1 ;
: BAR 2 ;
CLEAN
FOO \ Error: unknown word
```
<details>
<summary>Answer</summary>
1. `MARKER CLEAN`:
- Creates a MarkerState snapshot: dictionary state, user_here, next_table_index, word_pfa_map, ir_bodies, does_definitions, host_word_names, two_value_words, fvalue_words
- Registers CLEAN as a word that, when executed, restores this snapshot
2. `: FOO 1 ; : BAR 2 ;` — normal compilation, adds to dictionary
3. `CLEAN`:
- Executes the marker word
- Restores dictionary to state before FOO/BAR were defined
- Resets user_here, ir_bodies, etc.
- FOO and BAR are gone — dictionary.find("FOO") returns None
4. `FOO` → "unknown word: FOO"
Key: MARKER doesn't undo WASM table entries (they become unreachable but stay allocated). It restores the dictionary and Rust-side metadata.
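
A toy Python model of the snapshot-and-restore behaviour, including the point that table entries are not reclaimed. `MarkerVM` and its fields are hypothetical and track far less state than the real `MarkerState`.

```python
import copy


class MarkerVM:
    """Toy VM: dictionary and IR metadata are restorable, the table is not."""

    def __init__(self) -> None:
        self.dictionary = {}  # name -> word id
        self.ir_bodies = {}   # word id -> IR body
        self.table = []       # stand-in for the WASM function table

    def define(self, name: str, body: list) -> None:
        word_id = len(self.table)
        self.table.append(body)
        self.dictionary[name] = word_id
        self.ir_bodies[word_id] = body

    def marker(self):
        """Snapshot the restorable state and return a function that restores it."""
        snapshot = copy.deepcopy((self.dictionary, self.ir_bodies))

        def restore() -> None:
            self.dictionary, self.ir_bodies = copy.deepcopy(snapshot)

        return restore


vm = MarkerVM()
clean = vm.marker()       # MARKER CLEAN
vm.define("FOO", [("PushI32", 1)])
vm.define("BAR", [("PushI32", 2)])
clean()                   # CLEAN
```
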
</details>
Binary file not shown.