Move to boot.fth: D+, D-, DNEGATE, DABS, D0=, D0<, D=, D<, D2*, D2/,
DMAX, DMIN, M+, DU<.
D+ uses proper carry detection via unsigned comparison after low-cell
addition. All other double-cell words build on D+ and standard Forth
stack operations.
Removed 544 lines of Rust closures. Cumulative: ~1,091 Rust lines removed
across Phases 1-2, replaced by ~80 lines of Forth. All 425 tests pass.
Create boot.fth loaded at startup after IR primitives are compiled.
Forth-compiled WASM with direct calls outperforms host function dispatch
(no call_indirect overhead, Cranelift can inline across word boundaries).
Words moved to Forth: 2OVER, 2ROT, WITHIN, 2@, 2!, FILL, CMOVE, CMOVE>,
MOVE, ERASE, BLANK, /STRING, -TRAILING.
Removed 547 lines of Rust closures, replaced by 48 lines of Forth.
All 425 tests pass.
Add `wafer build --native` to produce self-contained native executables.
The approach appends AOT-precompiled WASM and metadata to a copy of the
wafer binary itself, requiring no Rust toolchain at build time.
On startup, the binary checks for an appended payload (8-byte "WAFEREXE"
magic trailer). If found, it deserializes the precompiled module and runs
it directly, skipping CLI argument parsing entirely.
Uses wasmtime's Engine::precompile_module() for AOT compilation at build
time and Module::deserialize() at runtime — instant startup with no JIT.
Binary layout: [wafer binary][precompiled wasm][metadata json][trailer]
Trailer: payload_len(u64 LE) + metadata_len(u64 LE) + "WAFEREXE"
Also refactored runner.rs: extracted shared run_module() to avoid
duplication between run_wasm_bytes() and run_precompiled_bytes().
Made serialize_metadata() public for CLI use.
Add `wafer build` to compile Forth source files to standalone .wasm modules,
and `wafer run` to execute them. The same .wasm file works with both the
wafer runtime (via wasmtime) and in browsers (via generated JS loader).
New CLI subcommands:
- `wafer build file.fth -o file.wasm` — compile to standalone WASM
- `wafer build file.fth -o file.wasm --js` — also generate JS/HTML loader
- `wafer build file.fth --entry WORD` — custom entry point
- `wafer run file.wasm` — execute pre-compiled module
Entry point resolution: --entry flag > MAIN word > recorded top-level execution.
Memory snapshot embedded as WASM data section preserves VARIABLE/CONSTANT state.
Metadata in custom "wafer" section enables the runner to provide host functions.
New modules: export.rs (orchestration), runner.rs (wasmtime host), js_loader.rs
(browser support). Refactored codegen.rs to share logic between consolidation
and export via compile_multi_word_module(). Added ir_bodies tracking for
VARIABLE, CONSTANT, CREATE, VALUE, DEFER, BUFFER:, MARKER, 2CONSTANT,
2VARIABLE, 2VALUE, FVARIABLE defining words.
Removed dead code: dot_func field, unused wafer-web stub crate, wasmtime-wasi
dependency from CLI, orphaned --consolidate/--output CLI flags.
425 tests pass (414 original + 11 new including 7 round-trip integration tests).
All were planning artifacts never imported or loaded:
- forth/ (4 .fth files): commented-out TODO stubs, never loaded at startup
- crates/core/src/words/mod.rs: empty module with commented-out submodules
- compiler.rs: placeholder, all compiler logic lives in outer.rs
- primitives.rs: placeholder, all primitives registered in outer.rs
- types.rs: StackType/StackEffect defined but never imported anywhere
Batch-compile all ~64 IR primitives into a single WASM module at startup.
Replaces 64 separate Module::new + Instance::new with 1 of each.
Reuses compile_consolidated_module() directly, removed compile_core_module() stub.
Boot time: 7.7ms -> 0.6ms (release), test suite: 5.1s -> 1.5s (debug).
13 of 14 optimizations now implemented. 392 tests passing.
Stack-to-local promotion (Phase 1: straight-line code):
- Words with no control flow/calls use WASM locals instead of memory stack
- Stack manipulation (Swap, Rot, Nip, Tuck, Dup, Drop) emits ZERO instructions
- ~7x instruction reduction for arithmetic-heavy words like DUP *
- Pre-loads consumed items from memory, writes results back at exit
Consolidation recompiler (CONSOLIDATE word):
- Recompiles all IR-based words into single WASM module
- Direct call instructions instead of call_indirect through function table
- Cranelift can inline and optimize across word boundaries
- All control flow variants support consolidated calls
342 unit tests + 11 compliance, all passing.
Inlining: store IR bodies for all words, inline Call(id) when body <= 8 ops
and non-recursive. Convert TailCall back to Call when inlining (tail position
in callee is not tail position in caller -- found via compliance test failure
where inlined TailCall caused unreachable code after the call site).
DSP global caching: cache $dsp in WASM local 0 at function entry, use
local.get/set throughout, writeback before calls and at function exit.
Reduces global access instructions by ~30-40%.
323 unit tests + 11 compliance, all passing.
Replace placeholder compliance tests with real harness that boots WAFER,
loads Gerry Jackson's test suite, and asserts 0 errors per word set.
Passing word sets (11/13):
Core, Core Plus, Core Ext, Exception, Double-Number, String,
Search-Order, Memory-Allocation, Programming-Tools, Facility, Locals
Not yet: File-Access (needs WASI), Floating-Point, Extended-Character
272 total tests (261 unit + 11 compliance)
- Add docs/OPTIMIZATIONS.md: catalog of 14 optimization passes with
status tracking and implementation roadmap
- Configure workspace-level clippy and rustc lints in Cargo.toml
- Add clippy.toml and deny.toml for clippy thresholds and dependency
auditing (licenses, advisories, bans)
- Set up pre-commit hook: cargo fmt, dprint, clippy, cargo deny,
cargo machete
- Update Justfile with deny/machete targets, dprint in fmt checks
Core word set: 0 errors on Gerry Jackson's forth2012-test-suite/core.fr
- Fix POSTPONE for non-immediate words via COMPILE, mechanism
- Fix double-DOES> (WEIRD: pattern) with does-body scanning and
runtime patching via _DOES_PATCH_
- Implement CATCH/THROW exception handling using wasmtime trap
mechanism with stack pointer save/restore
- 232 tests passing
- Fix HERE corruption: sync user_here before writing to shared cell
- Fix DOES> without CREATE: patch most-recent word, not read new name
- Implement >BODY via word_pfa_map tracking parameter field addresses
- Nested BEGIN...WHILE...WHILE...REPEAT...ELSE...THEN support
- DEPTH overflow protection
- Forth 2012 core.fr: 3 errors remaining (POSTPONE edge case,
double-DOES>, NOP meta-programming)
Major compliance fixes for running Gerry Jackson's core.fr tests:
- >IN synchronization: outer interpreter reads >IN back from WASM memory
after each word, enabling TESTING and other >IN-manipulating words
- RSHIFT changed to logical (unsigned) shift per Forth 2012 spec
- +LOOP uses boundary-crossing termination check for negative steps
- HEX/DECIMAL compile as WASM primitives (work inside definitions)
- BASE read from WASM memory for all number formatting
- Pictured numeric output: <# # #S #> HOLD SIGN
- New words: 2@ 2! .( ] ArithRshift
- Error recovery resets compile state on failure
- FIND reads counted strings from WASM memory
- Forth 2012 core.fr: 58 errors remaining (from unable-to-load)
- Dictionary: linked-list word headers in simulated linear memory with
create/find/reveal, case-insensitive lookup, IMMEDIATE flag support
- WASM codegen: IR-to-WASM translation via wasm-encoder with full
validation; all stack, arithmetic, comparison, logic, memory, control
flow, and return stack operations; wasmtime execution tests
- Outer interpreter: tokenizer, number parsing (decimal/$hex/#dec/%bin),
interpret/compile dispatch, control structures (IF/ELSE/THEN,
BEGIN/UNTIL, BEGIN/WHILE/REPEAT), RECURSE, comments, string output
- 40+ primitive words registered via JIT-compiled WASM modules linked
to shared memory/globals/table
- Interactive REPL with rustyline, piped input, and file execution
- 145 tests passing across dictionary, codegen, and runtime
Optimizing Forth 2012 compiler targeting WebAssembly with IR-based
compilation pipeline, multi-typed stack inference, subroutine threading,
and JIT/consolidation modes. Rust kernel with ~35 primitives and Forth
standard library for core/core-ext word sets.